Oct 20 2009
links for 2009-10-20
Oct 20 2009
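The Linked Data design note lists four practices that lay the foundations of a web of connected data:
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
4. Include links to other URIs, so that they can discover more things.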
These practices are now well known and implemented in hundreds of datasets. However, I think it is important to realise that these are the minimum requirements for a web of data about real-world things, a starter-kit if you like. They are not the final word on what will make up a rich web of data and there are many more things we, as data publishers, could be doing.
For example, rule 3 suggests that you provide some useful RDF when someone looks up the URI of a resource. That doesn’t mean you can’t publish more RDF about that URI at a different location. If you want to assert some additional information about http://dbpedia.org/page/Decentralization then just publish a document on your site. Crucially you don’t have to persuade dbpedia.org to add your triples to their database. There is no rule that the data found at a URI is the only relevant data about that thing – it’s just one privileged portion of the total data.
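As a rough illustration of publishing your own statements about someone else's URI, here is a minimal sketch that writes a small N-Triples document you could host on your own site. It uses plain string formatting rather than an RDF library, and the choice of rdfs:comment and rdfs:seeAlso, the example.org address and the helper names are all assumptions made for the example, not anything the design note prescribes.

```python
# Sketch only: builds a tiny N-Triples document about a URI we do not control.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def uri_triple(subject, predicate, obj):
    """One triple whose object is another URI."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def literal_triple(subject, predicate, text):
    """One triple whose object is a plain string literal."""
    return f'<{subject}> <{predicate}> "{escape_literal(text)}" .'


def extra_statements(subject):
    """Return our own additional statements about `subject` as N-Triples."""
    triples = [
        literal_triple(subject, RDFS + "comment",
                       'A note on "decentralization" published elsewhere.'),
        # Hypothetical location of further notes on our own site.
        uri_triple(subject, RDFS + "seeAlso",
                   "http://example.org/notes/decentralization"),
    ]
    return "\n".join(triples) + "\n"


doc = extra_statements("http://dbpedia.org/page/Decentralization")
print(doc)
```

The resulting file can be served from any web server. Nothing about it requires dbpedia.org to know that it exists.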
That leads nicely to another example. Rule 1 suggests that you use URIs to refer to real-world things. It says nothing about how or when you should create them. The convention so far has been to mint new URIs for things rather than try to find a pre-existing one. That’s an acceptable practice in the bootstrapping phase, where the data is sparse, but it is storing up a big integration problem for the future. I think we should be encouraging people to reuse well-known identifiers such as those in dbpedia and geonames in preference to creating new ones.
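A "reuse first, mint second" policy could be sketched along the following lines. The lookup table stands in for whatever real search you would run against dbpedia or geonames. The table contents, the example.org namespace and the function name are hypothetical and exist only to show the order of the two steps.

```python
from urllib.parse import quote

# Hypothetical local index of well-known identifiers; a real publisher would
# query an external lookup service instead of a hard-coded table.
KNOWN_IDENTIFIERS = {
    "decentralization": "http://dbpedia.org/resource/Decentralization",
    "london": "http://dbpedia.org/resource/London",
}

# Assumed namespace for URIs we mint ourselves.
LOCAL_NAMESPACE = "http://example.org/id/"


def identifier_for(label):
    """Return (uri, minted) for a label, preferring an existing identifier."""
    key = label.strip().lower()
    existing = KNOWN_IDENTIFIERS.get(key)
    if existing is not None:
        return existing, False
    # Nothing suitable found: fall back to minting a new URI.
    slug = quote(key.replace(" ", "-"))
    return LOCAL_NAMESPACE + slug, True


print(identifier_for("London"))
print(identifier_for("My Garden Shed"))
```

The second return value records whether a URI was minted, so newly created identifiers can be revisited later and linked to a well-known one if it turns up.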
Oct 19 2009
Oct 14 2009
Oct 05 2009
So the Royal Mail are targeting threats to their monopoly on the postcode data. The blogosphere is, naturally, outraged, and most arguments take the stance that this is data created by a publicly owned body and that it should belong to the nation. Morally that may be true, but politically it is a very different story. Successive governments have encouraged organisations like the Royal Mail, Ordnance Survey and British Library to recoup a certain level of their costs through data licensing. We now know that stance is untenable now that the Web has disrupted the costs of production and distribution, but dinosaurs take a long time to adapt.
There have been several attempts to circumvent the Royal Mail’s monopoly by crowdsourcing the data. FreeThePostcode is one approach, which has geocoded about 8000 postcodes out of 1.6M. This is after several years’ effort, so it’s not clear that this is a viable approach. I’m not even sure that the Post Office would have no claim on the data even if it were completely crowdsourced. Postcodes aren’t natural facts. They are artificial, created and assigned by the Post Office. I don’t know if that makes a real difference, but there’s enough doubt in my mind to make me worry about it.
I wonder if trying to replicate the database is simply the wrong approach. Consider OpenStreetMap: they didn’t set out to replicate the Ordnance Survey’s maps, they set out to build an entirely new map, one free from IPR claims. Their map can be used like the Ordnance Survey maps but it is entirely independent of them.
Here’s my wild idea: create a new postcode system from scratch.
It could be very simple and, because it would be open data from the start, it could have a real connection to the web from day one. Maybe it could be based on some algorithmic coding of data from OpenStreetMap, and we could make it as granular as we like, even down to the exact house. Open data would allow hundreds of derived services to exist that are stifled by the grasp of the Post Office today.
Whenever you write a postcode on a letter, add the open postcode on the next line: no harm done to anyone and a little bit more value added. If the open postcode were a number then it could be printed as a barcode on letters, a simple innovation that the closed attitude of the Post Office has prevented from happening.
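To make the idea of an algorithmic, numeric code a little more concrete, here is one possible sketch. It divides the globe into a grid and interleaves the bits of the latitude and longitude cell indices into a single integer. This is just one well-known way of turning coordinates into a number, not a proposal for the actual scheme. The function names and the default of 25 bits per axis (cells on the order of a metre) are assumptions for illustration.

```python
def encode(lat, lon, bits=25):
    """Interleave quantised latitude and longitude into one integer code."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")
    cells = 1 << bits
    # Map each coordinate onto a grid index in [0, cells - 1].
    y = min(int((lat + 90.0) / 180.0 * cells), cells - 1)
    x = min(int((lon + 180.0) / 360.0 * cells), cells - 1)
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # longitude bits: even positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # latitude bits: odd positions
    return code


def decode(code, bits=25):
    """Return the centre (lat, lon) of the grid cell named by `code`."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    cells = 1 << bits
    lat = (y + 0.5) / cells * 180.0 - 90.0
    lon = (x + 0.5) / cells * 360.0 - 180.0
    return lat, lon


code = encode(51.5007, -0.1246)
print(code, decode(code))
```

Because the code is a plain integer it could be written as digits or printed as a barcode. Dropping its lowest bits gives the code of a larger enclosing cell, which is one way the granularity could be varied.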
A wild idea…!
Oct 04 2009