
More Than The Minimum

20 October 2009 by Ian Davis

The Linked Data design note lists four practices that lay the foundations of a web of connected data:

  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
  4. Include links to other URIs, so that they can discover more things.
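To make rules 2 and 3 concrete, here is a minimal sketch of what "looking up a name" can look like from the client side: an HTTP request that asks for an RDF serialisation. The helper name `build_lookup_request` and the choice of `text/turtle` as the preferred format are my assumptions for illustration; whether a given server honours that `Accept` header is up to the publisher.

```python
import urllib.request

def build_lookup_request(uri, media_type="text/turtle"):
    """Prepare (but do not send) an HTTP GET that asks for RDF about `uri`.

    `media_type` is an assumption; a server may offer other RDF formats.
    """
    return urllib.request.Request(uri, headers={"Accept": media_type})

# The URI is used purely as an example of a name being looked up.
request = build_lookup_request("http://dbpedia.org/page/Decentralization")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual lookup; the sketch stops short of that so it runs without network access.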

These practices are now well known and implemented in hundreds of datasets. However, I think it is important to realise that these are the minimum requirements for a web of data about real-world things, a starter-kit if you like. They are not the final word on what will make up a rich web of data and there are many more things we, as data publishers, could be doing.

For example, rule 3 suggests that you provide some useful RDF when someone looks up the URI of a resource. That doesn’t mean you can’t publish more RDF about that URI at a different location. If you want to assert some additional information about http://dbpedia.org/page/Decentralization then just publish a document on your site. Crucially you don’t have to persuade dbpedia.org to add your triples to their database. There is no rule that the data found at a URI is the only relevant data about that thing – it’s just one privileged portion of the total data.
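As a rough sketch of that idea, the snippet below writes two extra statements about the DBpedia URI as N-Triples, the sort of document you could host on your own site. The `example.org` address and the comment text are hypothetical placeholders, and the helper functions are mine, not part of any library; `rdfs:comment` and `rdfs:seeAlso` are standard RDF Schema properties.

```python
# Sketch: serialise additional statements about someone else's URI as N-Triples.
SUBJECT = "http://dbpedia.org/page/Decentralization"  # URI used in the post
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"

def literal(text):
    """Quote a plain string literal, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'

def to_ntriples(triples):
    """Join (subject, predicate, object) terms into one N-Triples document."""
    return "".join(f"{s} {p} {o} .\n" for s, p, o in triples)

# Hypothetical statements; the example.org address stands in for your own site.
extra_triples = [
    (iri(SUBJECT), iri(RDFS + "comment"), literal("A note published on my own site.")),
    (iri(SUBJECT), iri(RDFS + "seeAlso"), iri("http://example.org/notes/decentralization")),
]

document = to_ntriples(extra_triples)
print(document)
```

Nothing here touches dbpedia.org: the statements simply use its URI as their subject, and anyone who finds your document can merge them with the data served at the URI itself.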

That leads nicely to another example. Rule 1 suggests that you use URIs to refer to real-world things. It says nothing about how or when you should create them. The convention so far has been to mint new URIs for things rather than try to find a pre-existing URI. That’s an acceptable practice in the bootstrapping phase, where the data is sparse, but it is storing up a big integration problem for the future. I think we should be encouraging people to reuse well-known identifiers such as those in dbpedia and geonames in preference to creating new ones.
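One way to picture "reuse in preference to minting" is a lookup step that runs before any new URI is created. The sketch below assumes a small local table of well-known identifiers; the table contents, the `example.org` base, and the function name `uri_for` are all illustrative assumptions. A real publisher would more likely query a service such as dbpedia or geonames than keep a hand-written table.

```python
from urllib.parse import quote

# Illustrative table of well-known identifiers, keyed by lower-cased label.
KNOWN_IDENTIFIERS = {
    "berlin": "http://dbpedia.org/resource/Berlin",
    "decentralization": "http://dbpedia.org/resource/Decentralization",
}

# Hypothetical namespace used only when no existing identifier is found.
MINT_BASE = "http://example.org/id/"

def uri_for(label, known=KNOWN_IDENTIFIERS):
    """Return (uri, reused): prefer an existing identifier, mint only as a fallback."""
    key = label.strip().lower()
    if key in known:
        return known[key], True
    slug = quote(key.replace(" ", "-"), safe="-")
    return MINT_BASE + slug, False

for name in ["Berlin", "My Cat"]:
    uri, reused = uri_for(name)
    print(name, "->", uri, "(reused)" if reused else "(newly minted)")
```

Matching on a label alone is a deliberate simplification; deciding whether two names really refer to the same thing is the hard part of the integration problem and is not addressed here.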

4 thoughts on “More Than The Minimum”

  1. Tweets that mention Internet Alchemy » More Than The Minimum -- Topsy.com says:

    […] This post was mentioned on Twitter by Ian Davis and Greg Boutin, infopeep. infopeep said: Davis, Ian: More Than The Minimum http://bit.ly/SpVTl […]

  2. Stefano Bertolo says:

    Reuse of well known identifiers is the single, relentless, objective of http://www.okkam.org

    Okkam has more than 5M identifiers available for reuse today. It is designed to be arbitrarily scalable. Any named entity that appears in Wikipedia is in there already. You can add yours at will.

    My OKKAM ID is http://www.okkam.org/entity/ed5bdc09-ca66-4435-b48e-6df558315fa1 and I paste it at the bottom of every piece of e-mail I send out.

    stefano

  3. uberVU - social comments says:

    Social comments and analytics for this post… This post was mentioned on Twitter by iand: New blog post: More Than The Minimum http://bit.ly/21NeWn

  4. Bill Roberts says:

    Hi Ian,

    Interesting post and interesting timing, as I published something today on a closely related theme: What makes good linked data? I was also recommending that people reuse existing identifiers where possible. I had an interesting comment from Richard Cyganiak putting forward a different view.

    Bill
