Introducing Embedded RDF
This is a copy of a message I sent to the W3C semantic web discussion list today.
I've been working on an enhanced way to embed RDF into XHTML that doesn't require any new markup. There are existing methods, such as the recommendations for expressing Dublin Core in HTML[1], but these are limited in scope and expressivity.
I've also spent quite a lot of time studying the underlying principles espoused by the Microformats community[2], who are reusing the semantics of XHTML to express commonly used data structures such as contact details or event descriptions. Some of the most important principles are:
- Visible Metadata - by making metadata visible, consumers can easily form an opinion on whether to trust the author. Metadata hidden away in meta tags is easily abused for search engine placement or other gain, since most visitors don't inspect the source code of the page. This principle also helps keep the metadata relevant. Hidden metadata is easily forgotten and can easily go stale, whereas if it were visible to humans, inaccuracies would soon be discovered and fixed.
- DRY (don't repeat yourself) - very often we maintain separate RDF documents with HTML equivalents. Unless these are automatically generated, it's very easy for them to get out of sync. This principle suggests that the metadata should be expressed only once, whether it's for humans or machines.
- Reuse Not Reinvention - if we reuse existing formats then we immediately gain the benefit of being able to use existing tools to generate and consume the metadata. We also hook into the experience and knowledge of the thousands of people who have invested time and money getting to grips with existing technologies.
I've taken these principles on board in my design. Embedded RDF uses XHTML attributes such as 'rel', 'rev', 'class', 'href', 'src' and 'id' to embed RDF triples into an XHTML document. The subject of a triple can be the embedding document or a fragment within that document. It's also possible to use the 'rev' attribute to embed triples about other resources, but in that case the object of the triple must be the embedding document or a fragment within it.
For example, the following XHTML:
<div><address class="dc-creator">Ian Davis</address> wrote this</div>
embeds the triple:
<> dc:creator "Ian Davis" .
Schema prefixes are declared using link elements in the head of the document, exactly as defined in the Dublin Core specification:
<link rel="schema.dc" href="http://purl.org/dc/elements/1.1/" />
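To make the prefix mechanism concrete, here is a minimal Python sketch of how a consumer might collect the schema declarations and expand a token such as 'dc-creator' into a full property URI. The names SchemaPrefixParser and expand are hypothetical, invented for this illustration, and the sketch assumes well-formed XHTML; it is not the code behind the extraction service.

```python
from html.parser import HTMLParser


class SchemaPrefixParser(HTMLParser):
    """Collects <link rel="schema.PREFIX" href="NAMESPACE" /> declarations."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}  # maps a prefix such as "dc" to its namespace URI

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed through here.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = attributes.get("rel") or ""
        href = attributes.get("href")
        if rel.startswith("schema.") and href:
            self.prefixes[rel[len("schema."):]] = href


def expand(token, prefixes):
    """Expand a 'prefix-name' token to a full URI, or return None if undeclared."""
    prefix, separator, local_name = token.partition("-")
    if separator and prefix in prefixes:
        return prefixes[prefix] + local_name
    return None


parser = SchemaPrefixParser()
parser.feed('<head><link rel="schema.dc" href="http://purl.org/dc/elements/1.1/" /></head>')
parser.close()
creator_uri = expand("dc-creator", parser.prefixes)
```

In this sketch a token whose prefix has not been declared expands to None, so ordinary presentational class names are simply ignored.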
The 'id' attribute is used to denote a separate resource:
<p id="ian"><span class="foaf-name">Ian Davis</span> wrote this</p>
This generates a triple like this:
<#ian> foaf:name "Ian Davis" .
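The two literal examples so far can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the class LiteralTripleParser and the function extract_literals are hypothetical names, the input is assumed to be well-formed XHTML, the set of declared prefixes is passed in by hand, and the subject is taken to be the nearest ancestor element with an 'id' (the embedding document, written <>, if there is none).

```python
from html.parser import HTMLParser


class LiteralTripleParser(HTMLParser):
    """Collects (subject, predicate, literal) triples from class attributes."""

    def __init__(self, prefixes):
        super().__init__()
        self.prefixes = set(prefixes)  # declared schema prefixes, e.g. {"dc"}
        self.open_elements = []        # one frame per currently open element
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Assumption: properties on this element describe the nearest ancestor
        # with an id, or the embedding document (<>) when there is none.
        subject = self.open_elements[-1]["child_subject"] if self.open_elements else "<>"
        element_id = attributes.get("id")
        predicates = []
        for token in (attributes.get("class") or "").split():
            prefix, separator, local_name = token.partition("-")
            if separator and prefix in self.prefixes:
                predicates.append(prefix + ":" + local_name)
        self.open_elements.append({
            "subject": subject,
            "child_subject": "<#%s>" % element_id if element_id else subject,
            "predicates": predicates,
            "text": [],
        })

    def handle_data(self, data):
        # Text belongs to every element that is currently open.
        for frame in self.open_elements:
            frame["text"].append(data)

    def handle_endtag(self, tag):
        if not self.open_elements:
            return
        frame = self.open_elements.pop()
        literal = "".join(frame["text"]).strip()
        for predicate in frame["predicates"]:
            self.triples.append((frame["subject"], predicate, literal))


def extract_literals(xhtml, prefixes):
    parser = LiteralTripleParser(prefixes)
    parser.feed(xhtml)
    parser.close()
    return parser.triples


document_triples = extract_literals(
    '<div><address class="dc-creator">Ian Davis</address> wrote this</div>', {"dc"})
fragment_triples = extract_literals(
    '<p id="ian"><span class="foaf-name">Ian Davis</span> wrote this</p>', {"foaf"})
```

Here document_triples holds the dc:creator triple about the document itself, while fragment_triples holds the foaf:name triple about the fragment <#ian>.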
Anchor elements are used to refer to other resources:
<p id="ian"><a rel="foaf-homepage" href="http://example.org/home">my home page</a></p>
embeds the following triple:
<#ian> foaf:homepage <http://example.org/home> .
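Link handling can be sketched in the same way. The Python below is a hedged illustration, not the extraction service's implementation: LinkTripleParser and extract_links are hypothetical names, the anchor is assumed to carry its property in a 'rel' attribute (for example rel="foaf-homepage") with the target in 'href', and the 'rev' handling simply swaps subject and object as described earlier. The foaf-maker example is my own choice of property for the illustration.

```python
from html.parser import HTMLParser


class LinkTripleParser(HTMLParser):
    """Collects (subject, predicate, object) triples from rel/rev on anchors."""

    def __init__(self, prefixes):
        super().__init__()
        self.prefixes = set(prefixes)  # declared schema prefixes, e.g. {"foaf"}
        self.subjects = []             # subject in scope for each open element
        self.triples = []

    def _predicates(self, value):
        """Turn 'prefix-name' tokens with a declared prefix into 'prefix:name'."""
        found = []
        for token in (value or "").split():
            prefix, separator, local_name = token.partition("-")
            if separator and prefix in self.prefixes:
                found.append(prefix + ":" + local_name)
        return found

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        subject = self.subjects[-1] if self.subjects else "<>"
        if tag == "a" and attributes.get("href"):
            target = "<%s>" % attributes["href"]
            for predicate in self._predicates(attributes.get("rel")):
                self.triples.append((subject, predicate, target))
            for predicate in self._predicates(attributes.get("rev")):
                # rev reverses the direction: the linked resource is the subject.
                self.triples.append((target, predicate, subject))
        element_id = attributes.get("id")
        self.subjects.append("<#%s>" % element_id if element_id else subject)

    def handle_endtag(self, tag):
        if self.subjects:
            self.subjects.pop()


def extract_links(xhtml, prefixes):
    parser = LinkTripleParser(prefixes)
    parser.feed(xhtml)
    parser.close()
    return parser.triples


rel_triples = extract_links(
    '<p id="ian"><a rel="foaf-homepage" href="http://example.org/home">my home page</a></p>',
    {"foaf"})
rev_triples = extract_links(
    '<p>Made by <a rev="foaf-maker" href="http://example.org/ian">Ian</a></p>',
    {"foaf"})
```

With 'rel' the fragment <#ian> is the subject and the link target is the object; with 'rev' the link target becomes the subject and the embedding document the object, which matches the restriction on 'rev' described above.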
I hope this has given you a flavour of how Embedded RDF works. There is comprehensive documentation on our wiki here:
http://research.talis.com/2005/erdf/wiki/Main/RdfInHtml
I'm also working on some examples. An example FOAF in XHTML document is here:
http://research.talis.com/2005/erdf/foaf-in-html.html
And a sample DOAP in XHTML document:
http://research.talis.com/2005/erdf/doap-in-html.html
There is also an RDF extraction service that scans an XHTML document for Embedded RDF and generates RDF/XML from it:
http://research.talis.com/2005/erdf/extract
This work is by no means complete, but I'm soliciting early feedback on the approach and the utility of embedding RDF into XHTML. Please feel free to share your thoughts on the wiki or email me if you have specific questions you'd like to have answered.