Mar 08 2007
Down With Negativity
Is it only me that thinks Mike Arrington is focussing more and more on the negative sides of the apps he reviews? Hey Mike, tell us some of the good stuff occasionally! We need some positive vibes!
Comments Off
Mar 06 2007
Last week, at the code4lib conference, our very own Rob Styles presented a lightning talk on open data, a topic very close to our heart at Talis. He uses a great metaphor for the state of our industry, something that I can closely identify with, having a ten year old and a four year old in the same house! It’s only 5 minutes, so you’ve no excuse for not watching it.
Here’s a direct link to YouTube if the embedded video player doesn’t show up.
Mar 06 2007
It looks like the hot new thing is to make a phone, with both Microsoft and Google getting in on the act now.
Comments Off
Feb 28 2007
Yesterday’s post provoked a rather aggressive response from Tantek Çelik, leader of the Microformats community.
Since my post wasn’t about microformats, I’m a bit surprised at the tone of the response. I guess it must be the CSS analogy that drew Tantek’s attention. I don’t like drawing comparisons between mf and semweb since I see them as complementary technologies, not opposing ones. They’re trying to solve completely different problems. I’ve seen too much abusive behaviour from both sides in the past few years: we need a little respect around here, people. As I pointed out to the SWEO list last week:
I think it’s very unproductive to pitch RDF vs Microformats and neither “side” should be arrogant enough to think they have nothing to learn from the other. We need to be building bridges with the MF community.
When I visited San Francisco back in 2005 I spent a little bit of time with Tantek and some of the other microformat people over lunch and dinner. I went there as a microformat skeptic but I kept an open mind and I paid attention to what they were saying. I learnt some stuff about hidden metadata and not repeating yourself and I wanted to bring that back to the RDF community. A few fevered weeks later I’d written Embedded RDF but I’d never be arrogant enough to call it a microformat; the closest I’d allow is that it’s microformat-like. Hijacking microformats with RDF would be incredibly disrespectful. I also happen to think the converse is true.
I’m presuming that Tantek commented on my blog to get some response and generate some debate, so I’ve pulled out most of his comment here.
Technology adoption is following the paths of least resistance: microformats is solving (has solved most of) the semantic web publishing and search problem with no apparent “chasm” as observed with other noted efforts.
I think Geoffrey Moore’s analysis applies to the adoption of disruptive technologies. Microformats aren’t disruptive, being more of a sustaining innovation and an incremental improvement over previous HTML practices. But I still think microformats are struggling to move from the early adopters to the early majority since there is still no real benefit to be had from adopting them. For most people they’re solving a problem that doesn’t need to be solved because they aren’t experiencing pain when trying to do the things microformats are supposed to make easier. It’s akin to having a business card scanner: great if you have loads of business cards and you’re an obsessive salesman, but the majority of us do just as well keeping the little bit of card on our desks or perhaps transcribing the email address to our mail program. Most people don’t need microformats because it’s not all that difficult to note down the date of an event or the address of a company.
More difficult (learning new languages), uglier (namespaces syntactical vinegar), and simultaneously fragile (duplication of *data* in hidden files or hidden comments or hidden tags which violate the DRY principle) approaches are likely to see less and less adoption over time, in comparison to simpler, higher reliability/fidelity approaches (re-using/“salting” semantic information already being published in HTML on the web).
I don’t dispute that learning a new language is more difficult. That’s somewhat of a tautology anyway. I had to learn HTML once, and I distinctly remember thinking I must learn XML but it seemed so complex I put it off for ages (this was before namespaces). I’ve learnt CSS but I definitely can’t say I’ve mastered it.
I also don’t dispute the ugly argument and I’ve pointed out the same many times, sometimes to the frustration of my fellow semwebbers! I’ve also been heard to say that adoption of XML was the single greatest mistake in the development of RDF. There, I’ve said it again. I still believe that to be true.
I have a problem with the notion of fragility as characterised in Tantek’s comment. His assumption is that RDF involves duplication of data. I don’t see how that can be true when HTML can’t express the data that can be expressed in RDF. If your world is limited to business cards, social events and reviews then I can see where you might favour expression in HTML. Humans want to read those things too. But the world is bigger than social networking and there are other types of first-class data that people want to exchange with one another. I’d prefer to use the right tool for the job where possible.
GRDDL+XMDP+microformatted web pages (read: Semantic (X)HTML transformed) is the most likely path to providing an Uppercase Semantic Web view of the existing rapidly growing lowercase semantic (HTML) web of data for those wishing to use those tools and technologies. Simultaneously, the growth of open source libraries which provide direct access to the intrinsically semantic microformatted web is providing an alternative to using a transformed intermediate abstract model or representation.
Well, I don’t disagree with this and it’s great to have an endorsement of GRDDL as the best way to connect the HTML web of data to the RDF one. I hope, with the mention of XMDP, that this is an implied commitment to encouraging the use of profile URIs on microformatted pages.
The growth of semantics in the existing HTML web (rather than a parallel web) and the increasing diversity of tools for accessing those semantics via a variety of models is rapidly advancing the state of the art for all semantic web approaches, now, not in 3-9 years.
Again this is true, although “state of the art” doesn’t square with being mainstream. Nobody wants two webs, but many of us want the freedom to publish whatever data we have into the existing web without a centralised process.
In return, I’d like to offer Tantek some constructive suggestions (respectful I hope). Firstly, the microformats community process needs some attention. Contrarian views appear to be suppressed and the environment is very intimidating for newcomers. I have direct experience of a similar community and it pushes away many of the people who could make a valuable contribution. I have to ask: if your views are welcome in our community, why aren’t ours welcome in yours?
Secondly, the specifications need work. In my opinion, in their current form most of them are not accessible to the majority of content authors or web developers. Who, for example, is the hCalendar specification aimed at? It appears to assume the reader is familiar with RFC 2445, a particularly human-unfriendly format. What about the people who want to simply add hCalendar to their home page or their company’s site template? I think that’s a larger audience and the specification should reflect that, describing how the actual event information is incorporated into the HTML without the intermediate step of having to map to iCalendar first. If the community process were more tolerant I might feel empowered to help make those changes.
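To make that last point concrete, here is a rough sketch of what "event information incorporated into the HTML" can look like from a content author's side, with no detour through iCalendar. The page fragment, the event details and the `HCalendarSketch` class are all made up for illustration; only the class names `vevent`, `summary`, `dtstart` and `location` and the `abbr` title convention come from hCalendar. It is nowhere near a complete parser.

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up with hCalendar class names.
PAGE = """
<div class="vevent">
  <span class="summary">Example Meetup</span> on
  <abbr class="dtstart" title="2007-03-08">8 March</abbr> at
  <span class="location">Example Hall</span>
</div>
"""


class HCalendarSketch(HTMLParser):
    """Collects a few hCalendar properties; an illustration, not a full parser."""

    PROPERTIES = {"summary", "dtstart", "location"}

    def __init__(self):
        super().__init__()
        self.event = {}
        self._current = None  # property whose text content we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        prop = next(iter(classes & self.PROPERTIES), None)
        if prop is None:
            return
        if tag == "abbr" and attrs.get("title"):
            # The machine-readable value lives in the title attribute,
            # while the human-readable text stays visible on the page.
            self.event[prop] = attrs["title"]
        else:
            self._current = prop

    def handle_data(self, data):
        if self._current:
            self.event[self._current] = data.strip()
            self._current = None


parser = HCalendarSketch()
parser.feed(PAGE)
event = parser.event
print(event)
```

The author only ever touches ordinary HTML elements and class attributes, which is the level I think the specification should be pitched at.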
Feb 27 2007
I was reading Elliotte Rusty Harold’s predictions for the XML world in 2007 and spotted this (which Danny has also pointed out):
2007 is the make-or-break year for the Semantic Web. The specs are done. The tools are in place, and there’s still not a whiff of a killer app anywhere to be seen. The Achilles heel of the Semantic Web may well be the complete disinterest of most authors in producing anything remotely approximating metadata for their pages. Search engines have learned to ignore any user-created metadata because honest publishers don’t bother with it and dishonest spammers abuse it. Screen readers don’t even bother with the limited semantics already in HTML, trying instead to figure out what the page looks like.
Is it really make or break for the Semantic Web? Elliotte goes on to say some nice things about GRDDL which I do agree with. But the contention that this is the Semantic Web’s last chance doesn’t ring true for me. Technology, especially standards track work, takes years to cross the chasm from early adopters (the technology enthusiasts and visionaries) to the early majority (the pragmatists). And when I say years, I mean years. Take CSS for example. I’d characterise CSS as having crossed the chasm and it’s being used by the early majority and making inroads into the late majority. I don’t think anyone would seriously argue that CSS is not here to stay.
According to this semi-official history of CSS the first proposal was in 1994, about 13 years ago. The first version that was recognisably the CSS we use today was CSS1, issued by the W3C in December 1996. This was followed by CSS2 in 1998, the year that also saw the founding of the Web Standards Project. CSS 2.1 is still under development, along with portions of CSS3.
Let’s compare that with the key Semantic Web specifications: RDF and OWL. RDF emerged from earlier work by Guha called MCF and, with a heavy dose of XML courtesy of Tim Bray, it was issued as a recommendation by the W3C in February 1999. It was followed almost exactly 5 years later by a set of cleaner specifications that tidied up some loose ends and removed some cruft from the earlier specification. The OWL recommendations were issued at about the same time.
So why did CSS take so long to gain traction? 13 years from inception, 10/11 years from first accepted specification? To be honest it didn’t really solve a new problem. For most people it just solved a problem with an existing solution in a new way. HTML already allowed people to style their pages, to align elements and to lay out their documents. It did it rather well for most people and for those who thought in terms of a few HTML pages CSS just seemed like a new way to do the same old thing. However, for those who needed to apply consistent formatting across a very large number of pages; or for those who wanted to be able to offer different styles for different users or media; or for those who wanted to share their designs then CSS was the clear winner.
However, if it were not for the efforts of the WaSP raising awareness of the benefits of CSS, we’d probably still be in <FONT> hell. Who can say they haven’t cursed at the “CSS Tax” of hacks and workarounds or shouted in anger when trying to achieve simple effects such as consistent font sizes or sidebars that extend to the bottom of the page? It took the patient evangelism by WaSP 9 years for us to be able to say confidently that CSS has crossed the chasm.
There are some strong analogies with the Semantic Web here. Like CSS, to most people RDF seems to offer nothing but a new way to solve an old problem. However, for those that have needs beyond a single document or a single data silo, RDF offers something genuinely different. Sharing data, combining data from different sources, evolving schemas – all these things are strengths of the RDF model and it’s only now that people are seeking to break down the walls and share data at Web scale.
So is this the year of make or break? Hardly! We could do a gross comparison with CSS and equate CSS1 (1996) with the first RDF recommendation (1999). That suggests there’s another 3 years before we can expect the Semantic Web to cross the chasm. However, I think the real comparison has to take into account the evangelism and activism around the technology. It’s taken 9 years of WaSP cajoling vendors and pestering designers to get CSS to where it is today. The analogue here is SWEO, the new Semantic Web Education and Outreach group. This group (I’m a lurking member) has only just started the battle that took WaSP 9 years to win!
I’m hoping that the time frame for the Semantic Web crossing the chasm is somewhere between the two estimates; between 3 and 9 years. I’m hoping that it’s going to be at the lower end of that scale, say 5 years, but that still means we have a long multi-year struggle to evangelise and persuade. It’s going to be worth it!
Feb 23 2007
So, where to go from here? Well, this is certainly the last JSF project I will be working on, and eventually I hope to phase it out of this current one. JSF is web development for Java programmers – those who don’t know much about web applications and even less about HTML, CSS and certainly DDA.
Andrew works with me at Talis
I haven’t yet convinced everyone at Talis that frameworks built to hide the mechanics of the Web are a bad idea but it looks like that’s one less to win over.
Comments Off
Feb 19 2007
Interesting. Norm Walsh is the first person I’ve seen actively convert from hash URIs to slash URIs since the httpRange-14 decision. Most people I know have been going the other way (me included).
Unable to come up with a workaround I liked, I decided it was time for a big bang: I decided to change a whole lot of URIs. Instead of using a hashed URI to identify me (and everyone and everything else), I’d use a slashed one.
Is this the start of a trend…?
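For anyone wondering why the choice matters, here is a minimal sketch of the mechanical difference. The example.org URIs and the `request_target` helper are invented for illustration. With a hash URI the fragment never reaches the server, so the server only ever sees the document URI. With a slash URI the full identifier is requested, which is why, under the httpRange-14 resolution, a server is expected to answer with a 303 redirect when the URI names something that isn't a document.

```python
from urllib.parse import urldefrag

# Hypothetical identifiers for the same person, in the two styles.
hash_uri = "http://example.org/people#norm"
slash_uri = "http://example.org/people/norm"


def request_target(uri):
    """Return the URI a client actually requests: the fragment is dropped."""
    return urldefrag(uri).url


print(request_target(hash_uri))   # http://example.org/people
print(request_target(slash_uri))  # http://example.org/people/norm
```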
Feb 14 2007
I’m excited because I just got back from a great long weekend away walking the ancient Wiltshire landscape and found a message in my inbox confirming that my XTech proposal had been accepted. My talk and paper are going to be on RSS Remixing – using RSS 1.0 to “augment” feeds and search results.
Here’s the abstract I submitted:
In this session I’ll demonstrate and explain a new ultra-simple protocol for augmenting search results with enrichments and related content. Similarly to OpenSearch the protocol is RSS based. But instead of sending the search terms to each registered search provider we send the results and ask the providers to inspect them and add anything else they know about the items. I’ll show how this protocol allows rich search applications to be built very simply by remixing results from several different data sources.
As Adam Bosworth pointed out, using RSS and its kin as a data transfer format makes a lot of sense when you consider the number of clients that can consume it. It is particularly useful for ordered sets of items such as the output of search engines. We used RSS to design a very simple protocol for combining results from different search engines into a single RSS feed that can be consumed by any feed client or application. In fact, because the final result is itself RSS it can in turn be used to augment results from other searches.
One untapped feature of RSS 1.0 is the potential of merging multiple RSS feeds together. The underlying RDF nature of RSS 1.0 provides this capability for free and with some work it could be adapted to work with other flavours of RSS and Atom. Using a range of techniques from simple pattern matching to SPARQL querying of the full feed we can look up related items and with a few simple conventions mix the results into the original feed.
One example I’ll use will show how an RSS feed of book details can be used to build a simple book search application. Other search services can match on key information such as ISBN, authors and titles to augment the book information with jacket images, reviews, author biographies and relevant links, all drawn from different data stores.
There are, perhaps, some similarities with Yahoo’s Pipes but the main distinction is the embarrassingly parallel nature of our approach plus the focus on augmenting results rather than filtering or aggregating. We used this technique to build Cenote at Talis, which is really a tiny PHP script and some XSLT leveraging the RSS remixing services we’ve implemented in the underlying platform.
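The shape of the idea in the abstract can be sketched in a few lines. This is not the protocol itself and not how Cenote is written: the items are plain dictionaries standing in for RSS 1.0 items keyed by URI, and the two providers, their data and every example.org URL are made up. What it tries to show is the inversion (send the results, not the query), the merge by item URI, and why the approach is embarrassingly parallel: no provider needs to know about any other.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search results, keyed by item URI as they would be in RSS 1.0.
results = {
    "http://example.org/books/1": {"title": "Example Book One", "isbn": "0000000001"},
    "http://example.org/books/2": {"title": "Example Book Two", "isbn": "0000000002"},
}


def jacket_provider(items):
    """Hypothetical provider: adds jacket images for the ISBNs it knows."""
    known = {"0000000001": "http://example.org/jackets/1.jpg"}
    return {uri: {"jacket": known[item["isbn"]]}
            for uri, item in items.items() if item.get("isbn") in known}


def review_provider(items):
    """Hypothetical provider: adds review links for the ISBNs it knows."""
    known = {"0000000002": "http://example.org/reviews/2"}
    return {uri: {"review": known[item["isbn"]]}
            for uri, item in items.items() if item.get("isbn") in known}


def remix(items, providers):
    """Send the same results to every provider in parallel, then merge by URI."""
    merged = {uri: dict(props) for uri, props in items.items()}
    with ThreadPoolExecutor() as pool:
        for enrichment in pool.map(lambda provider: provider(items), providers):
            for uri, extra in enrichment.items():
                if uri in merged:
                    for key, value in extra.items():
                        # Augment only: never overwrite what is already there.
                        merged[uri].setdefault(key, value)
    return merged


remixed = remix(results, [jacket_provider, review_provider])
print(remixed)
```

Because the output has the same shape as the input, `remixed` could itself be handed to another round of providers, which is the "remix of a remix" property mentioned in the abstract.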
Comments Off
Feb 01 2007
My friend and colleague, Paul Miller, has been a stalwart of podcasting at Talis. This week’s episode is particularly semwebby, being a conversation with John Davies, Head of Next Generation Web Research at BT. John and Paul talk about some of the semweb projects BT is engaging in plus some of the wider misconceptions of the semantic web. I think we’ll be doing more podcasts around semweb and open data in the future. If you’re interested in being involved, have some strong opinions or want to talk about a project you’re involved with then we’d like to talk to you. Email me or Paul at podcasts@talis.com.
Comments Off
Jan 30 2007
From Nick Gall’s position paper on an upcoming W3C workshop on a Web of Services for Enterprise Computing:
Actually, the W3C XML Protocol Working Group uses a much more accurate name for WS-*-style Web Services: XML Protocol Services. If WS-* “Web Services” had originally been “XML Middleware Services” (XMS-*), I doubt the W3C would have ever gotten directly involved with standardizing such an architecture. It would have left such work to the middleware vendors, and at best coordinated with them in their use of HTTP, XML, etc. Instead the W3C would have focused on protocols and formats such as RSS, Atom, Microformats, and now GData that are the best examples of how to enable one software agent to interact with another (aka A2A integration).
It is my position that the W3C should extricate itself from further direct work on SOAP, WSDL, or any other WS-* specifications and redirect its resources into evangelizing and standardizing identifiers, formats, and protocols that exemplify Web architectural principles. This includes educating enterprise application architects how to design “applications” that are “native” web applications.
Comments Off