The Great Database in the Sky

I'm baffled.

I've just watched Mårten Mickos from MySQL give a 10-minute talk on what he terms the "Great Database in the Sky", almost exactly describing our community's vision of a "web of data" while remaining completely ignorant of the semantic web.

To start, he characterised Google as giving unstructured people access to unstructured data whereas MySQL gives structured people access to structured data, meaning that MySQL is targeted towards developers who understand how to structure data "properly". A strange polarisation in my view, but I guess he's trying to put clear blue water between the Google approach and the traditional database approach. At Talis, we don't see this distinction at all and our core platform technology, Bigfoot, unifies structured and unstructured data.

He went on to describe his vision of a Skype for database access, combining my data, your data and public data into the next generation OLAP, running a trillion transactions per day. He used weather data as an example and asked: what if you could run a SQL statement across all the data sources in the world, something like SELECT CurrentWindDirection, CurrentWindSpeed FROM AllTheWorldsWeatherStations, MyOwnWeatherStation, MyFriendsWeatherStation?

It's a noble goal, but he's not the first to suggest it. It's also not a future vision, because you can do it today with SPARQL. It's at the heart of Bigfoot, and there are many other public services that can be used to learn and experiment. You can even query across HTML pages containing embedded structured data.
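
For the curious, here is roughly how that weather query might look in SPARQL, wrapped in a little Python that builds the query string. Treat it as a sketch under assumptions rather than anything Bigfoot-specific: the wx: vocabulary, the example.org graph URIs and the build_wind_query helper are all made up for illustration, and you would still need a SPARQL endpoint able to load those graphs to run the result.

```python
def build_wind_query(graph_uris):
    """Build a SPARQL SELECT that reads wind data from several RDF graphs.

    Each URI becomes a FROM clause, so the query runs over the merge of
    all the listed graphs. The wx: vocabulary is hypothetical.
    """
    from_clauses = "\n".join(f"FROM <{uri}>" for uri in graph_uris)
    return (
        "PREFIX wx: <http://example.org/weather#>\n"
        "SELECT ?station ?direction ?speed\n"
        f"{from_clauses}\n"
        "WHERE {\n"
        "  ?station wx:currentWindDirection ?direction ;\n"
        "           wx:currentWindSpeed ?speed .\n"
        "}"
    )


# Hypothetical data sources standing in for the three "tables"
# in the SQL statement from the talk.
query = build_wind_query([
    "http://example.org/all-the-worlds-weather-stations",
    "http://example.org/my-own-weather-station",
    "http://example.org/my-friends-weather-station",
])
print(query)
```

The point of the sketch is that each data source is just a URI: adding your friend's weather station means adding one more FROM clause, with no shared schema beyond the vocabulary the data is published in.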

He followed it up by saying if this were achievable then a whole new generation of web 2.0 applications could be possible. Nothing controversial there, we share the same vision! But we think it's closer than he does.

What else? Oh yes, he said "we may need a DNS of SQL servers" and that "routing may be an issue". Another point of agreement: that's why we built a directory of data collections and services, and built web services to route straight into that content.

Then, "how do you make data definitions understandable to others?". That's almost like a problem statement for RDF! And yet he didn't mention it in his list of technologies that might be candidates for the solution: RSS, Atom, Jabber, HTML, HTTP, XML, SQL and SMS.

He concluded his talk with the tagline "The data is the platform" and then took a question from the audience: "How is this different from the semantic web?".

This is where it became evident that there is a deep disconnect between the traditional database community and the semantic web community. Mårten's response was rather vague: that his vision wasn't as broad as the semantic web, and that the semweb includes unstructured data and so wasn't appropriate.

What a shame and what a failure of the semantic web community if the CEO of MySQL AB cannot see how his vision for an interconnected web of data is the same as ours! We must try harder and demonstrate at all levels the value of the semantic web approach to people like Mårten. SWEO and SWIG will help, but the convincing arguments will come from the practical applications of the semantic web being developed to solve real world problems.

Which is why I'm at Talis.

