Feb 28 2007

It’s Not RDF versus Microformats

Published by Ian Davis under Uncategorized

Yesterday’s post provoked a rather aggressive response from Tantek Çelik, leader of the Microformats community.

Since my post wasn’t about microformats, I’m a bit surprised at the tone of the response. I guess it must be the CSS analogy that drew Tantek’s attention. I don’t like drawing comparisons between microformats and the semantic web, since I see them as complementary technologies, not opposing ones. They’re trying to solve completely different problems. I’ve seen too much abusive behaviour from both sides in the past few years: we need a little respect around here, people. As I pointed out to the SWEO list last week:

I think it’s very unproductive to pitch RDF vs Microformats and neither “side” should be arrogant enough to think they have nothing to learn from the other. We need to be building bridges with the MF community.

When I visited San Francisco back in 2005 I spent a little bit of time with Tantek and some of the other microformat people over lunch and dinner. I went there as a microformat skeptic but I kept an open mind and I paid attention to what they were saying. I learnt some stuff about hidden metadata and not repeating yourself and I wanted to bring that back to the RDF community. A few fevered weeks later I’d written Embedded RDF, but I’d never be arrogant enough to call it a microformat; the closest I’d allow is that it’s microformat-like. Hijacking microformats with RDF would be incredibly disrespectful. I also happen to think the converse is true.

I’m presuming that Tantek commented on my blog to get some response and generate some debate, so I’ve pulled out most of his comment here.

Technology adoption is following the paths of least resistance: microformats is solving (has solved most of) the semantic web publishing and search problem with no apparent “chasm” as observed with other noted efforts.

I think Geoffrey Moore’s analysis applies to the adoption of disruptive technologies. Microformats aren’t disruptive, being more of a sustaining innovation and an incremental improvement over previous HTML practices. But I still think microformats are struggling to move from the early adopters to the early majority since there is still no real benefit to be had from adopting them. For most people they’re solving a problem that doesn’t need to be solved because they aren’t experiencing pain when trying to do the things microformats are supposed to make easier. It’s akin to having a business card scanner: great if you have loads of business cards and you’re an obsessive salesman, but the majority of us do just as well keeping the little bit of card on our desks or perhaps transcribing the email address to our mail program. Most people don’t need microformats because it’s not all that difficult to note down the date of an event or the address of a company.

More difficult (learning new languages), uglier (namespaces syntactical vinnegar), and simultaneously fragile (duplication of *data* in hidden files or hidden comments or hidden tags which violate the DRY principle) approaches are likely to see less and less adoption over time, in comparison to simpler, higher reliability/fidelity approaches (re-using/”salting” semantic information already being published in HTML on the web).

I don’t dispute that learning a new language is more difficult. That’s somewhat of a tautology anyway. I had to learn HTML once, and I distinctly remember thinking I must learn XML but it seemed so complex I put it off for ages (this was before namespaces). I’ve learnt CSS but I definitely can’t say I’ve mastered it.

I also don’t dispute the ugly argument and I’ve pointed out the same many times, sometimes to the frustration of my fellow semwebbers! I’ve also been heard to say that adoption of XML was the single greatest mistake in the development of RDF. There, I’ve said it again. I still believe that to be true.

I have a problem with the notion of fragility as characterised in Tantek’s comment. His assumption is that RDF involves duplication of data. I don’t see how that can be true when HTML can’t express the data that can be expressed in RDF. If your world is limited to business cards, social events and reviews then I can see where you might favour expression in HTML. Humans want to read those things too. But the world is bigger than social networking and there are other types of first-class data that people want to exchange with one another. I’d prefer to use the right tool for the job where possible.

GRDDL+XMDP+microformatted web pages (read: Semantic (X)HTML transformed) is the most likely path to providing an Uppercase Semantic Web view of the existing rapidly growing lowercase semantic (HTML) web of data for those wishing to use those tools and technologies. Simultaneously, the growth of open source libraries which provide direct access to the intrinsically semantic microformatted web is providing an alternative to using a transformed intermediate abstract model or representation.

Well, I don’t disagree with this and it’s great to have an endorsement of GRDDL as the best way to connect the HTML web of data to the RDF one. I hope, with the mention of XMDP, that this is an implied commitment to encouraging the use of profile URIs on microformatted pages.
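To illustrate what a profile URI buys a consumer, here is a minimal Python sketch that reads the HTML 4 `profile` attribute from a page's `head` element, so a tool can tell which vocabularies the page claims to use before trying to interpret its class names. The class name `ProfileFinder`, the helper `find_profiles` and the example.org URI are assumptions made up for the example; it does not implement GRDDL or XMDP processing.

```python
from html.parser import HTMLParser


class ProfileFinder(HTMLParser):
    """Collects the URIs listed in <head profile="...">."""

    def __init__(self):
        super().__init__()
        self.profiles = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            # The profile attribute is a space-separated list of URIs.
            value = dict(attrs).get("profile") or ""
            self.profiles.extend(value.split())


def find_profiles(html):
    """Return the profile URIs declared by an HTML document, if any."""
    finder = ProfileFinder()
    finder.feed(html)
    return finder.profiles


# Hypothetical page; the profile URI is a placeholder, not a real profile.
PAGE = (
    '<html><head profile="http://example.org/profiles/hcalendar">'
    "<title>Events</title></head><body></body></html>"
)
print(find_profiles(PAGE))
```

A page without a declared profile simply yields an empty list, which is the situation a consumer faces on most microformatted pages today: it has to guess from class names alone.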

The growth of semantics in the existing HTML web (rather than a parallel web) and the increasing diversity of tools for accessing those semantics via a variety of models is rapidly advancing the state of the art for all semantic web approaches, now, not in 3-9 years.

Again this is true, although “state of the art” doesn’t square with being mainstream. Nobody wants two webs, but many of us want the freedom to publish whatever data we have into the existing web without a centralised process.

In return, I’d like to offer Tantek some constructive suggestions (respectful I hope). Firstly, the microformats community process needs some attention. Contrarian views appear to be suppressed and the environment is very intimidating for newcomers. I have direct experience of a similar community and it pushes away many of the people who could make a valuable contribution. I have to ask: if your views are welcome in our community, why aren’t ours welcome in yours?

Secondly, the specifications need work. In my opinion, in their current form most of them are not accessible to the majority of content authors or web developers. Who, for example, is the hCalendar specification aimed at? It appears to assume the reader is familiar with RFC 2445, a particularly human-unfriendly format. What about the people who want to simply add hCalendar to their home page or their company’s site template? I think that’s a larger audience and the specification should reflect that, describing how the actual event information is incorporated into the HTML without the intermediate step of having to map to iCalendar first. If the community process were more tolerant I might feel empowered to help make those changes.
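As a rough idea of the kind of HTML-first description meant here, the sketch below marks up a single event with the hCalendar class names `vevent`, `summary`, `dtstart` and `location`, then pulls the values back out with Python's standard `html.parser`. The event details, the `HCalendarParser` class and the `extract_events` helper are invented for illustration, and the parser only handles the simple cases shown (plain text properties and the `abbr` title pattern for dates); it is not a complete hCalendar implementation.

```python
from html.parser import HTMLParser

# Hypothetical home-page snippet; the event itself is made up.
SAMPLE_HTML = """
<div class="vevent">
  <span class="summary">Semantic Web meetup</span> on
  <abbr class="dtstart" title="2007-03-15">15 March</abbr> at
  <span class="location">San Francisco</span>
</div>
"""

PROPERTIES = {"summary", "dtstart", "dtend", "location"}
# Elements with no end tag; skipped so nesting depth stays balanced.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HCalendarParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.events = []
        self._event = None    # properties of the event being read
        self._depth = 0       # nesting depth inside the vevent element
        self._capture = None  # (property, depth) whose text is being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if self._event is None:
            if "vevent" in classes:
                self._event, self._depth = {}, 1
            return
        self._depth += 1
        for prop in PROPERTIES & classes:
            if tag == "abbr" and attrs.get("title"):
                # Machine-readable value lives in the title attribute.
                self._event[prop] = attrs["title"]
            else:
                self._event[prop] = ""
                self._capture = (prop, self._depth)

    def handle_data(self, data):
        if self._event is not None and self._capture:
            self._event[self._capture[0]] += data

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._event is None:
            return
        if self._capture and self._capture[1] == self._depth:
            prop = self._capture[0]
            self._event[prop] = self._event[prop].strip()
            self._capture = None
        self._depth -= 1
        if self._depth == 0:  # closing tag of the vevent element
            self.events.append(self._event)
            self._event = None


def extract_events(html):
    """Return one dict of hCalendar properties per vevent found."""
    parser = HCalendarParser()
    parser.feed(html)
    return parser.events


print(extract_events(SAMPLE_HTML))
```

The point of the example is that an author only needs the class names and the `abbr` date pattern to publish an event; nothing in it requires reading the iCalendar specification first.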


Nov 17 2006

Unreal Conversations

Published by Ian Davis under Uncategorized

If you haven’t seen Pete Lacey’s Socratic dialogue on the evolution of SOAP then please go and read it straight away. An excerpt to whet your appetite:

Dev: Okay, where’s the spec on this?

SG: Oh, there is no spec. This is just what Microsoft seems to be doing. Looked like a good idea, so now all the cool kids are doing it. However, there is this new thing. I think you’re gonna like it. It’s called the Web Services Interoperability Group, or the WS-I. What they’re doing is trying to remove a lot of the ambiguity in the SOAP and WSDL specs. I know how you like specs.

Dev: So, in other words, the specs were so bad you need a standards body to standardize the standards. Lord. Well, will this solve my interoperability problems?

SG: Oh, yeah. So long as you use a WS-I compliant SOAP stack, avoid using 8/10ths of XML Schema, don’t use any unusual data types, and don’t count on working with WebSphere and Apache Axis.

There must be something in the air because Duncan Cragg has also written some fun and informative articles in the same style for a new series called the REST dialogues: Getting Data and Setting Data. If you want to get a better view of what it means to be resource-oriented then this series looks to be the business.


Feb 24 2006

Even Cobol

Published by Ian Davis under Uncategorized

Sometimes you need to make extreme statements to make a subtle point clearer:

It doesn’t matter how easy it is in [awesome VisualStudio/RubyOnRails/Python/IntellijIDEA]. It has to be easy in COBOL.

Robert Sayre on the permathread that is REST vs WS-*


Dec 16 2005

SOAP Destined to A Life of Obscurity

Published by Ian Davis under Uncategorized

This piece from Dare Obasanjo, hot on the heels of the UDDI public registry closure, adds weight to my suspicion that SOAP is finally being sidelined into a niche activity.

When I worked on the XML team, I used to interact regularly with the Indigo folks. At the time, I got the impression that they had two clear goals (i) build the world’s best Web services framework built on SOAP & WS-* and (ii) unify the diverse distributed computing offerings produced by Microsoft. As I spent time on my new job I realized that the first goal of Indigo folks didn’t jibe with the reality of how we built services. Despite how much various evangelists and marketing folks have tried to make it seem otherwise, SOAP based Web services aren’t the only Web service on the planet. Technically they aren’t even the most popular. If anything the most popular Web services is RSS which for all intents and purposes is a RESTful Web service. Today, across our division we have services that talk SOAP, RSS, JSON, XML-RPC and even WebDAV. The probability of all of these services being replaced by SOAP-based services is 0.
