Google's RDFa a Damp Squib
It's been an interesting week for embedding metadata in HTML. Yesterday I was exploring HTML5 microdata and today Google announced support for RDFa. At first this announcement seemed like a big deal - Google supporting the web of data in a big way, a real push into the world of open structured data. However, a closer look reveals that Google have basically missed the point of RDFa. The RDFa support is limited to the properties and classes defined on a hastily thrown-together site called data-vocabulary.org. There you will find classes for Person and Organization and properties for names and addresses, completely ignoring the millions of pieces of data already published using well-established terms from FOAF and the like. That means everyone has to rewrite all their data to use Google's schema if they want to be featured on Google's search engine. It's like saying that, to be listed in their search engine, you have to write your pages in Google's own version of HTML where all the tags have slightly different spellings!
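To make the rewriting burden concrete, here is a rough Python sketch of what a publisher with existing FOAF data would have to do. The two namespace URIs are the published ones for FOAF and data-vocabulary.org; the term-by-term mapping, the `rewrite` helper and the example.org data are my own assumptions for illustration, not anything Google has specified, and I have simply assumed that `foaf:mbox_sha1sum` has no counterpart on data-vocabulary.org.

```python
# Namespace URIs for the two vocabularies.
FOAF = "http://xmlns.com/foaf/0.1/"
GOOGLE = "http://rdf.data-vocabulary.org/#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical, partial mapping chosen for illustration only.
FOAF_TO_GOOGLE = {
    FOAF + "Person": GOOGLE + "Person",
    FOAF + "name": GOOGLE + "name",
    FOAF + "nick": GOOGLE + "nickname",
    FOAF + "homepage": GOOGLE + "url",
}


def rewrite(triples):
    """Rewrite FOAF terms to Google's; return (kept, dropped) triples."""
    kept, dropped = [], []
    for s, p, o in triples:
        # A FOAF property with no mapped equivalent cannot be expressed.
        if p.startswith(FOAF) and p not in FOAF_TO_GOOGLE:
            dropped.append((s, p, o))
            continue
        # Swap the predicate, and the object too when it is a mapped class.
        kept.append((s, FOAF_TO_GOOGLE.get(p, p), FOAF_TO_GOOGLE.get(o, o)))
    return kept, dropped


# Made-up example data: a small FOAF description of one person.
ME = "http://example.org/#me"
triples = [
    (ME, RDF_TYPE, FOAF + "Person"),
    (ME, FOAF + "name", "Alice Example"),
    (ME, FOAF + "nick", "alice"),
    (ME, FOAF + "mbox_sha1sum", "0000000000000000000000000000000000000000"),
]

kept, dropped = rewrite(triples)
for triple in kept:
    print("kept:   ", triple)
for triple in dropped:
    print("dropped:", triple)
```

Every statement has to be touched even though its meaning is unchanged, and anything outside the mapping simply falls off the end. With decentralized vocabularies none of this would be needed: a consumer would just read the terms the publisher already uses.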
The result is a hobbled implementation of RDFa. They've taken the worst part - the syntax - and thrown away the best - the decentralized vocabularies of terms. It's like using microformats without the one thing they do well: the simplicity. This is why I believe Google missed the point. They made the mistake of treating RDFa as an alternative to microformats, which completely ignores its true strength as a structured data format.
As I twittered earlier: it seems odd that Google, a company that thrived on the open messy web, seeks to ignore it and go for a controlled vocabulary. I'm hoping that this is just a toe in the water and more will come. But there's a part of me that thinks otherwise. Surely there's no way the smart people at Google didn't know about the existing vocabularies and data for people, places, reviews and businesses? We've all seen large companies claim support for key standards yet deliver partial or broken implementations, and some companies use that as a deliberate tactic to undermine the standard itself, to break interoperability or to make it impossibly hard to achieve. It's very easy for these situations to be explained away as a mistake, or as a work in progress, but we need to push, dig deeper and hold companies to their very public claims.