I got some great suggestions from <a href="http://blog.iandavis.com/2005/12/transliteration-or-interpretation">my earlier posting</a> which really shows the value of writing about stuff earlier rather than later.
One comment, from Harry Chen, got me started on a particular track:
"Here is what I typically do. I first map the syntactic format of the data into RDF, and then build external inference rules to produce a more expressive semantic model of the original data."
This seemed like good advice since it gets the data into the realm of declarative rules very quickly. The alternative is to build lots of transformation logic in code which, while efficient, limits the audience and reach of the technique. So I thought I’d try to transliterate a MARC record to some RDF representation of its structure. The trick is to get the RDF description to be expressive enough to be useful to a rules engine. I also wanted a representation that would allow round-tripping of MARC records.
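To make the idea of a purely structural transliteration concrete, here is a minimal sketch in Python. It is not the representation I ended up with, just an illustration under some loud assumptions: the `MARC_NS` namespace, the property names (`field`, `tag`, `indicators`, `subfield`, `code`, `position`, `value`) and the helper functions are all hypothetical, and the input is one line of the common human-readable MARC notation (for example `=245 02$aA History of Wiltshire.`) rather than the binary format.

```python
# Hypothetical namespace and property names, for illustration only;
# this is not a published vocabulary.
MARC_NS = "http://example.org/marc#"


def parse_field(line):
    """Split one mnemonic MARC line such as '=245 02$aTitle' into its parts."""
    tag, _, rest = line[1:].partition(" ")
    if tag == "LDR" or tag < "010":
        # The leader and control fields (001-009) hold a single value.
        return {"tag": tag, "indicators": "", "subfields": [], "value": rest}
    # Data fields: indicators come first, then '$'-prefixed subfields.
    # Assumes '$' never appears inside subfield text.
    indicators, *chunks = rest.split("$")
    subfields = [(c[0], c[1:]) for c in chunks if c]
    return {"tag": tag, "indicators": indicators,
            "subfields": subfields, "value": None}


def field_to_triples(record_id, position, field):
    """Describe one field's structure as (subject, predicate, object) tuples."""
    node = f"_:f{position}"
    triples = [
        (record_id, MARC_NS + "field", node),
        (node, MARC_NS + "tag", field["tag"]),
        # RDF statements are unordered, so positions are stored explicitly.
        (node, MARC_NS + "position", position),
    ]
    if field["value"] is not None:
        triples.append((node, MARC_NS + "value", field["value"]))
    else:
        triples.append((node, MARC_NS + "indicators", field["indicators"]))
    for i, (code, text) in enumerate(field["subfields"]):
        sub = f"{node}s{i}"
        triples += [
            (node, MARC_NS + "subfield", sub),
            (sub, MARC_NS + "code", code),
            (sub, MARC_NS + "position", i),
            (sub, MARC_NS + "value", text),
        ]
    return triples


field = parse_field("=245 02$aA History of Wiltshire.")
for triple in field_to_triples("_:rec", 9, field):
    print(triple)
```

Recording the position of every field and subfield is what keeps round-tripping possible in this sketch: a set of RDF statements has no inherent order, so without it the original sequence of the record could not be rebuilt.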
My starting point was a MARC record from the British Library. I can’t include it here because it contains control characters for separating records, fields and subfields, namely ASCII 0x1d, 0x1e and 0x1f. However, here’s a common human-readable representation:
=LDR 00651nam a2200193u 4500
=001 004148830
=003 Uk
=005 20040502102300.0
=008 040422m19539999|||||0||eng|
=015 $a2672340153
=019 s$aoagh06952
=040 $aUk$cUk
=082 04$a942.31
=245 02$aA History of Wiltshire. Edited by R. B. Pugh and E. Crittall. [With maps and illustrations.].
=260 $bOxford University Press for the Institute of Historical Research: London, 1957, 53- . fol.
=440