I've been doing some more digging on my fragmentation and shadow web themes and came across something I either hadn't seen before or had completely forgotten. The RDF Concepts document contains a whole section on fragment identifiers, which is worth reproducing:
RDF uses an RDF URI Reference, which may include a fragment identifier, as a context free identifier for a resource. RFC 2396 [URI] states that the meaning of a fragment identifier depends on the MIME content-type of a document, i.e. is context dependent.
These apparently conflicting views are reconciled by considering that a URI reference in an RDF graph is treated with respect to the MIME type application/rdf+xml [RDF-MIME-TYPE]. Given an RDF URI reference consisting of an absolute URI and a fragment identifier, the fragment identifier identifies the same thing that it does in an application/rdf+xml representation of the resource identified by the absolute URI component. Thus:
- we assume that the URI part (i.e. excluding fragment identifier) identifies a resource, which is presumed to have an RDF representation. So when eg:someurl#frag is used in an RDF document, eg:someurl is taken to designate some RDF document (even when no such document can be retrieved).
- eg:someurl#frag means the thing that is indicated, according to the rules of the application/rdf+xml MIME content-type as a "fragment" or "view" of the RDF document at eg:someurl. If the document does not exist, or cannot be retrieved, or is available only in formats other than application/rdf+xml, then exactly what that view may be is somewhat undetermined, but that does not prevent use of RDF to say things about it.
- the RDF treatment of a fragment identifier allows it to indicate a thing that is entirely external to the document, or even to the "shared information space" known as the Web. That is, it can be a more general idea, like some particular car or a mythical Unicorn.
- in this way, an application/rdf+xml document acts as an intermediary between some Web retrievable documents (itself, at least, also any other Web retrievable URIs that it may use, possibly including schema URIs and references to other RDF documents), and some set of possibly abstract or non-Web entities that the RDF may describe.
This provides a handling of URI references and their denotation that is consistent with the RDF model theory and usage, and also with conventional Web behavior. Note that nothing here requires that an RDF application be able to retrieve any representation of resources identified by the URIs in an RDF graph.
I've been thinking about this for a couple of days and I'm still not entirely sure what to make of it. What it appears to be saying is that RDF ignores the Web Architecture principle that fragment identifiers are given meaning by the representation that is retrieved.
So this ensures that RDF is self-consistent. I can refer to anything I like using a fragment identifier in my URI and I'm guaranteed not to have my intended meaning upset by anything messy like a network operation. This alleviates one of my major concerns about using these kinds of URIs in RDF, but at what cost? If anything, this increases my concerns over the shadow web, since by circumventing the web architecture it sets RDF further apart from today's web of documents. For example, when I use "http://www.w3.org/TR/webarch/#media-type-fragid" as a URI in my RDF, it probably doesn't refer to the thing you think it does. You, as a human (if you are), get to see a representation of that section of the document when you click on the link, but an RDF-aware agent must treat that URI as though application/rdf+xml had been retrieved. Unfortunately there isn't any RDF there, and the Web Architecture actually forbids you from serving up both HTML and RDF documents at the same URI if they disagree about what the fragment means.
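To make the mismatch concrete, here is a minimal Python sketch. It only splits the URI reference with the standard library's urldefrag and looks up which rules would give the fragment its meaning. The FRAGMENT_RULES table and the interpret function are names I've made up for illustration, and the rule descriptions are my own loose paraphrases rather than anything normative.

```python
from urllib.parse import urldefrag

# Hypothetical table (not from any spec): a loose paraphrase of which rules
# give a fragment identifier its meaning for a given media type.
FRAGMENT_RULES = {
    "text/html": "the element with a matching id or named anchor in the page",
    "application/rdf+xml": "whatever the RDF describes, in the document or not",
}

def interpret(uri_ref, media_type):
    """Split a URI reference and report how its fragment would be read."""
    # The fragment is never sent to the server; the client interprets it.
    base, fragment = urldefrag(uri_ref)
    rule = FRAGMENT_RULES.get(media_type, "undefined for this media type")
    return base, fragment, rule

uri = "http://www.w3.org/TR/webarch/#media-type-fragid"

# A browser reads the fragment against the HTML it actually retrieved.
print(interpret(uri, "text/html"))

# RDF Concepts reads it as if application/rdf+xml had been retrieved.
print(interpret(uri, "application/rdf+xml"))
```

Both calls split the reference into the same base URI and the same fragment. The only thing that changes is the rulebook used to decide what the fragment means, which is exactly where the two views part company.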
What does that mean? How are we supposed to interpret that? One interpretation is that it really doesn't matter what you do outside of RDF. You can throw up all kinds of other representation formats and it won't affect your RDF or anyone else's. They might use the same identifiers, and occasionally, by coincidence, they may identify the same things, but in general RDF is partitioned into its own little world. RDF can only link to RDF.
How can RDF co-exist with other formats on the Web if it ignores their semantics? If you just want the Semantic Web to be built using RDF then you probably don't care. But if, like me, you want to see an inclusive Semantic Web built from a mix of RDF, microformats, topic maps, RDDL and all the other ways to express semantics, then it's a very, very big problem. I don't want two webs competing for attention; I want one strong one.
Hence the title of this post. It is OK to use URIs with fragments in RDF, but only if you don't particularly care about relating to the existing web. If you do care then avoid fragments at all costs. Use plain URIs without fragments and stick 303 redirects on them if you need to. It'll work and the whole web will be better for it.
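For what it's worth, the 303 approach can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the /id/ and /doc/ URI layout, the respond function and the deliberately naive Accept check (it ignores q-values) are all hypothetical, and nothing here touches the network.

```python
# Hypothetical URI layout (an assumption for this sketch):
#   /id/<name>        names a thing, which may not be a document at all
#   /doc/<name>.html  an HTML document describing that thing
#   /doc/<name>.rdf   an RDF/XML document describing that thing

def respond(path, accept="text/html"):
    """Return (status, headers) for a GET request, without any networking."""
    if path.startswith("/id/"):
        name = path[len("/id/"):]
        # Naive content negotiation: a real server would parse q-values.
        ext = "rdf" if "application/rdf+xml" in accept else "html"
        # 303 See Other: "I can't send you the thing, but this describes it."
        return 303, {"Location": f"/doc/{name}.{ext}", "Vary": "Accept"}
    if path.startswith("/doc/"):
        ctype = "application/rdf+xml" if path.endswith(".rdf") else "text/html"
        return 200, {"Content-Type": ctype}
    return 404, {}

print(respond("/id/unicorn"))                         # redirect for a browser
print(respond("/id/unicorn", "application/rdf+xml"))  # redirect for an RDF agent
print(respond("/doc/unicorn.rdf"))                    # the RDF document itself
```

The thing and each of its descriptions get their own fragment-free URIs, so no agent ever has to guess which media type's rules apply to a fragment.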