Feb 28 2007
It’s Not RDF versus Microformats
Yesterday’s post provoked a rather aggressive response from Tantek Çelik, leader of the Microformats community.
Since my post wasn’t about microformats, I’m a bit surprised at the tone of the response. I guess it must be the CSS analogy that drew Tantek’s attention. I don’t like drawing comparisons between mf and semweb since I see them as complementary technologies, not opposing ones. They’re trying to solve completely different problems. I’ve seen too much abusive behaviour from both sides in the past few years: we need a little respect around here, people. As I pointed out to the SWEO list last week:
I think it’s very unproductive to pitch RDF vs Microformats and neither “side” should be arrogant enough to think they have nothing to learn from the other. We need to be building bridges with the MF community.
When I visited San Francisco back in 2005 I spent a little bit of time with Tantek and some of the other microformat people over lunch and dinner. I went there as a microformat skeptic but I kept an open mind and I paid attention to what they were saying. I learnt some stuff about hidden metadata and not repeating yourself and I wanted to bring that back to the RDF community. A few fevered weeks later I’d written Embedded RDF, but I’d never be arrogant enough to call it a microformat; the closest I’d allow is that it’s microformat-like. Hijacking microformats with RDF would be incredibly disrespectful. I also happen to think the converse is true.
I’m presuming that Tantek commented on my blog to get some response and generate some debate, so I’ve pulled out most of his comment here.
Technology adoption is following the paths of least resistance: microformats is solving (has solved most of) the semantic web publishing and search problem with no apparent “chasm” as observed with other noted efforts.
I think Geoffrey Moore’s analysis applies to the adoption of disruptive technologies. Microformats aren’t disruptive, being more of a sustaining innovation and an incremental improvement over previous HTML practices. But I still think microformats are struggling to move from the early adopters to the early majority since there is still no real benefit to be had from adopting them. For most people they’re solving a problem that doesn’t need to be solved because they aren’t experiencing pain when trying to do the things microformats are supposed to make easier. It’s akin to having a business card scanner: great if you have loads of business cards and you’re an obsessive salesman, but the majority of us do just as well keeping the little bit of card on our desks or perhaps transcribing the email address to our mail program. Most people don’t need microformats because it’s not all that difficult to note down the date of an event or the address of a company.
More difficult (learning new languages), uglier (namespaces syntactical vinegar), and simultaneously fragile (duplication of *data* in hidden files or hidden comments or hidden tags which violate the DRY principle) approaches are likely to see less and less adoption over time, in comparison to simpler, higher reliability/fidelity approaches (re-using/“salting” semantic information already being published in HTML on the web).
I don’t dispute that learning a new language is more difficult. That’s somewhat of a tautology anyway. I had to learn HTML once, and I distinctly remember thinking I must learn XML but it seemed so complex I put it off for ages (this was before namespaces). I’ve learnt CSS but I definitely can’t say I’ve mastered it.
I also don’t dispute the ugly argument and I’ve pointed out the same many times, sometimes to the frustration of my fellow semwebbers! I’ve also been heard to say that adoption of XML was the single greatest mistake in the development of RDF. There, I’ve said it again. I still believe that to be true.
I have a problem with the notion of fragility as characterised in Tantek’s comment. His assumption is that RDF involves duplication of data. I don’t see how that can be true when HTML can’t express the data that can be expressed in RDF. If your world is limited to business cards, social events and reviews then I can see where you might favour expression in HTML. Humans want to read those things too. But the world is bigger than social networking and there are other types of first-class data that people want to exchange with one another. I’d prefer to use the right tool for the job where possible.
GRDDL+XMDP+microformatted web pages (read: Semantic (X)HTML transformed) is the most likely path to providing an Uppercase Semantic Web view of the existing rapidly growing lowercase semantic (HTML) web of data for those wishing to use those tools and technologies. Simultaneously, the growth of open source libraries which provide direct access to the intrinsically semantic microformatted web is providing an alternative to using a transformed intermediate abstract model or representation.
Well, I don’t disagree with this and it’s great to have an endorsement of GRDDL as the best way to connect the HTML web of data to the RDF one. I hope, with the mention of XMDP, that this is an implied commitment to encouraging the use of profile URIs on microformatted pages.
The growth of semantics in the existing HTML web (rather than a parallel web) and the increasing diversity of tools for accessing those semantics via a variety of models is rapidly advancing the state of the art for all semantic web approaches, now, not in 3-9 years.
Again this is true, although “state of the art” doesn’t square with being mainstream. Nobody wants two webs, but many of us want the freedom to publish whatever data we have into the existing web without a centralised process.
In return, I’d like to offer Tantek some constructive suggestions (respectful, I hope). Firstly, the microformats community process needs some attention. Contrarian views appear to be suppressed and the environment is very intimidating for newcomers. I have direct experience of a similar community and it pushes away many of the people who could make a valuable contribution. I have to ask: if your views are welcome in our community, why aren’t ours welcome in yours?
Secondly, the specifications need work. In my opinion, in their current form most of them are not accessible to the majority of content authors or web developers. Who, for example, is the hCalendar specification aimed at? It appears to assume the reader is familiar with RFC2445, a particularly human-unfriendly format. What about the people who want to simply add hCalendar to their home page or their company’s site template? I think that’s a larger audience and the specification should reflect that, describing how the actual event information is incorporated into the HTML without the intermediate step of having to map to iCalendar first. If the community process were more tolerant I might feel empowered to help make those changes.
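To make the "event information in the HTML" point concrete, here is a minimal sketch of what I mean. The page fragment, the event details and the `HCalendarSketch` and `extract_event` names are all made up for illustration; only the class names (`vevent`, `summary`, `dtstart`, `location`) come from hCalendar. It is nowhere near a conforming parser: it ignores nesting and most of the specification, and simply reads the first occurrence of each class name.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: an event marked up with hCalendar class names.
SAMPLE_HTML = """
<div class="vevent">
  <span class="summary">Semantic Web meetup</span> on
  <abbr class="dtstart" title="2007-03-15">15 March</abbr> at
  <span class="location">San Francisco</span>
</div>
"""


class HCalendarSketch(HTMLParser):
    """Reads the first summary, dtstart and location found in the markup."""

    FIELDS = ("summary", "dtstart", "location")

    def __init__(self):
        super().__init__()
        self.event = {}
        self._current = None  # field whose text content is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        for field in self.FIELDS:
            if field in classes and field not in self.event:
                if tag == "abbr" and attrs.get("title"):
                    # Machine-readable value lives in the title attribute.
                    self.event[field] = attrs["title"]
                else:
                    # Otherwise the value is the element's visible text.
                    self.event[field] = ""
                    self._current = field

    def handle_data(self, data):
        if self._current:
            self.event[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current field.
        self._current = None


def extract_event(html):
    parser = HCalendarSketch()
    parser.feed(html)
    return {name: value.strip() for name, value in parser.event.items()}


print(extract_event(SAMPLE_HTML))
```

A content author only needs to see the HTML fragment at the top; the parsing half is there to show that the class names alone carry enough information for a tool to pick the event out.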
9 Responses to “It’s Not RDF versus Microformats”

Ian,
Thanks very much for your thorough write up and constructive suggestions in response to my comment. I’m going to be rereading this post a few times and adding a bunch of things to my microformats to-do list as a result. I’m looking forward to more “aggressive collaboration” on such efforts, and I stand by my predictions that a wide variety of semantic web approaches will rapidly blossom in the near future.
Thanks,
Tantek
I have recently added geo data to many of my photo-gallery pages. Here’s a fairly precise statement of my opinions:
Embedded-RDF and GRDDL (both of which I use) allow me to provide a route to RDF triples to say “this page is geo-located here”. Fine. As a web-developer, the code requirement is easily fulfilled and invisible to humans surfing.
Microformats (which I also use) allow me to annotate in the page-body such that browser plugins may suggest actions to take based on the data. The code requirement is not too onerous (as long as I don’t have conflicting CSS classes for any other reason) but it has page-design ramifications: I can’t see how to make the data be a (lat,long) pair but the display be a placename, for example.
So already we see differences in the nature and goal of the two sides of the technology. I think the microformats approach is going to be fighting an uphill battle: requiring people to have Firefox and use Operator, then hoping IE starts doing it, etc… that’s a hard way to gain traction. The alternative is to invent a new class of software, “things that process eRDF and GRDDL”; do you remember when the feed aggregator landed on your desktop as a must-have-running class of utility in its own right?
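On the (lat,long)-versus-placename point above, one possible approach is the abbr pattern: put the coordinates in the `title` attribute and the placename in the visible text. The sketch below assumes a `geo` value written as `latitude;longitude` in the title of an `abbr` element; the sample markup, coordinates and the `GeoSketch` class are invented for illustration and should be checked against the geo specification before being relied on.

```python
from html.parser import HTMLParser

# Hypothetical photo caption: coordinates in the title, placename as display text.
SAMPLE = (
    '<p>Taken near '
    '<abbr class="geo" title="51.5007;-0.1246">Westminster</abbr>.</p>'
)


class GeoSketch(HTMLParser):
    """Collects (lat, lon, label) from abbr elements carrying class "geo"."""

    def __init__(self):
        super().__init__()
        self.points = []
        self._in_geo = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        title = attrs.get("title") or ""
        if tag == "abbr" and "geo" in classes and ";" in title:
            # Assumed convention: title holds "latitude;longitude".
            lat, _, lon = title.partition(";")
            self.points.append(
                {"lat": float(lat), "lon": float(lon), "label": ""}
            )
            self._in_geo = True

    def handle_data(self, data):
        if self._in_geo:
            self.points[-1]["label"] += data

    def handle_endtag(self, tag):
        if tag == "abbr":
            self._in_geo = False


parser = GeoSketch()
parser.feed(SAMPLE)
print(parser.points)
```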
It’s more than ironic that you make this point. Doesn’t this apply to RDF as well? There are already some good examples of how microformatting your content extends its reach.
Consider 3 reviews, one plain text, one marked up with hReview, and one marked up with RDFa. Which one would show up at http://kitchen.technorati.com/search/ ?
Taylor – yes you’re right up to a point. I wrote in my original post:
Tim’s “must-have-running class of utility” sounds like a good challenge to me.
Get coding Ian!
That ‘must-have-running utility’ may well be something like http://dbin.org/ except I’ve never got it to work on amd64. There may also be others, or web-based, for example.