Thu, Dec 22, 2005

MARC Transliteration

I got some great suggestions on <a href="http://blog.iandavis.com/2005/12/transliteration-or-interpretation">my earlier posting</a>, which really shows the value of writing about stuff earlier rather than later.

One comment in particular, from Harry Chen, got me started on a particular track:

Here is what I typically do. I first map the syntactic format of the data into RDF, and then build external inference rules to produce a more expressive semantic model of the original data.

This seemed like good advice since it gets the data into the realm of declarative rules very quickly. The alternative is to build lots of transformation logic in code which, while efficient, limits the audience and reach of the technique. So I thought I'd try to transliterate a MARC record to some RDF representation of its structure. The trick is to get the RDF description to be expressive enough to be useful to a rules engine. I also wanted a representation that would allow round-tripping of MARC records.
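To make the idea concrete, here is a minimal Python sketch of one possible purely structural transliteration. It is not necessarily the representation used in this post. The namespace `http://example.org/marc-syntax#`, the property names (`field`, `tag`, `position`, `indicators`, `subfield`, `code`, `value`) and the record URI are all hypothetical, chosen only for illustration. Each field and subfield becomes a blank node, and an explicit `position` is recorded so that the original ordering could be rebuilt when round-tripping.

```python
# Hypothetical vocabulary for describing MARC *structure* (not meaning).
NS = "http://example.org/marc-syntax#"


def literal(text):
    """Render a plain N-Triples literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def triple(subject, prop, obj):
    """Format one N-Triples statement using the hypothetical namespace."""
    return f"{subject} <{NS}{prop}> {obj} ."


def record_to_ntriples(record_uri, fields):
    """Map (tag, indicators, [(code, value), ...]) data fields to triples.

    Positions are stored explicitly because RDF statements are unordered,
    while MARC fields and subfields are ordered.
    """
    out = []
    record = f"<{record_uri}>"
    for i, (tag, indicators, subfields) in enumerate(fields, start=1):
        field = f"_:f{i}"
        out.append(triple(record, "field", field))
        out.append(triple(field, "tag", literal(tag)))
        out.append(triple(field, "position", literal(str(i))))
        out.append(triple(field, "indicators", literal(indicators)))
        for j, (code, value) in enumerate(subfields, start=1):
            sub = f"_:f{i}s{j}"
            out.append(triple(field, "subfield", sub))
            out.append(triple(sub, "code", literal(code)))
            out.append(triple(sub, "value", literal(value)))
            out.append(triple(sub, "position", literal(str(j))))
    return out


# A sample data field with blank indicators and two subfields.
sample = [("040", "  ", [("a", "Uk"), ("c", "Uk")])]
for line in record_to_ntriples("http://example.org/record/1", sample):
    print(line)
```

Nothing here interprets the data: the triples only say that a field with a given tag, indicators and subfields exists. Any richer meaning would be left to rules applied afterwards.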

My starting point was a MARC record from the British Library. I can’t include it here because it contains control characters for separating fields and subfields, namely ASCII 0x1d, 0x1e and 0x1f. However, here’s a common human-readable representation:

=LDR  00651nam a2200193u  4500
=001  004148830
=003  Uk
=005  20040502102300.0
=008  040422m19539999|||\||0||eng|
=015  $a2672340153
=019  s$aoagh06952
=040  $aUk$cUk
=082  04$a942.31
=245  02$aA History of Wiltshire. Edited by R. B. Pugh and E. Crittall. [With maps and illustrations.].
=260  $bOxford University Press for the Institute of Historical Research: London, 1957, 53- . fol.
=440
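For readers who have not seen the raw form, the following Python sketch shows how the three control characters mentioned above delimit a binary MARC (ISO 2709) record: 0x1d ends the record, 0x1e ends the directory and each field, and 0x1f introduces each subfield. It is a simplified illustration rather than a full parser. The function names `build_marc` and `parse_marc` are made up for this example, and the leader values other than the record length and base address are simply copied from the record shown above.

```python
RECORD_TERMINATOR = b"\x1d"
FIELD_TERMINATOR = b"\x1e"
SUBFIELD_DELIMITER = b"\x1f"


def build_marc(fields):
    """Serialise fields to raw MARC bytes.

    Control fields are (tag, None, text); data fields are
    (tag, indicators, [(code, text), ...]).
    """
    directory = b""
    body = b""
    for tag, indicators, value in fields:
        if indicators is None:
            data = value.encode("utf-8")
        else:
            data = indicators.encode("ascii") + b"".join(
                SUBFIELD_DELIMITER + code.encode("ascii") + text.encode("utf-8")
                for code, text in value
            )
        data += FIELD_TERMINATOR
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(data):04d}{len(body):05d}".encode("ascii")
        body += data
    base = 24 + len(directory) + 1  # leader + directory + its terminator
    total = base + len(body) + 1    # plus the record terminator
    leader = f"{total:05d}nam a22{base:05d}u  4500".encode("ascii")
    return leader + directory + FIELD_TERMINATOR + body + RECORD_TERMINATOR


def parse_marc(record):
    """Split raw MARC bytes back into the leader and a list of fields."""
    leader = record[:24].decode("ascii")
    base = int(leader[12:17])  # base address of data, leader positions 12-16
    directory = record[24:base - 1]
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        data = record[base + start:base + start + length]
        data = data.rstrip(FIELD_TERMINATOR)
        if tag.startswith("00"):
            # Control fields have no indicators or subfields.
            fields.append((tag, None, data.decode("utf-8")))
        else:
            indicators = data[:2].decode("ascii")
            chunks = data[2:].split(SUBFIELD_DELIMITER)[1:]
            subfields = [(c[:1].decode("ascii"), c[1:].decode("utf-8")) for c in chunks]
            fields.append((tag, indicators, subfields))
    return leader, fields


# Round-trip two fields taken from the record above.
fields = [("003", None, "Uk"), ("040", "  ", [("a", "Uk"), ("c", "Uk")])]
leader, parsed = parse_marc(build_marc(fields))
print(leader)
print(parsed)
```

The sketch assumes UTF-8 text and well-formed input. Real records may use other character encodings, which this example does not handle.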

Permalink: http://blog.iandavis.com/2005/12/marc-transliteration/


Internet Alchemy
