Jul 21 2009

The Linked Data Brand

Published by Ian Davis at 1:09 pm under Opinion

Paul Miller has kicked off a twitstorm with his simple question: does linked data require RDF? My contention is that Linked Data does absolutely require RDF. This is not a technical issue, and it’s not one of zealots or pragmatists: it’s a marketing and branding issue.

The term Linked Data was coined to brand a specific class of practices: namely assigning HTTP URIs to arbitrary things and making those URIs respond with RDF relating the things to other things.

Here very few of the ‘things’ are documents; instead they are people, places, objects and concepts.

That deliberately excludes many other practices of publishing data on the web such as atom feeds, spreadsheets, APIs and even many existing RDF use cases.
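
As a rough illustration of the practice described above (HTTP URIs for things, answered with RDF that relates them to other things), here is a minimal Python sketch. The example.org and example.net URIs and the helper names are invented for illustration; only the FOAF property URIs are real vocabulary terms. The triples are hard-coded rather than fetched, so this shows the shape of the data, not how any particular server responds.

```python
from urllib.parse import urlparse

# Hypothetical description of a person, written as (subject, predicate, object)
# triples. The example.org / example.net URIs are made up for illustration.
TRIPLES = [
    ("http://example.org/id/alice",
     "http://xmlns.com/foaf/0.1/name",
     '"Alice"'),
    ("http://example.org/id/alice",
     "http://xmlns.com/foaf/0.1/based_near",
     "http://example.org/id/birmingham"),
    ("http://example.org/id/alice",
     "http://xmlns.com/foaf/0.1/knows",
     "http://example.net/people/bob"),
]


def is_http_uri(term):
    """True if the term looks like an HTTP(S) URI rather than a literal."""
    parts = urlparse(term)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def linked_things(triples, subject):
    """Return the HTTP URIs of the other things that `subject` is related to."""
    return sorted(
        {o for s, _p, o in triples if s == subject and is_http_uri(o) and o != subject}
    )


if __name__ == "__main__":
    # The person (not a document) is named by an HTTP URI and linked to a place
    # and to another person, each of which has its own HTTP URI.
    for uri in linked_things(TRIPLES, "http://example.org/id/alice"):
        print(uri)
```

The point of the sketch is only that both ends of each link are HTTP URIs naming things, which is what lets a consumer follow a link to find out more.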

The purpose of giving things a brand is to engender recognition, familiarity and trust. When you open a can of Pepsi you know exactly what you are going to get. You know you will get a great user experience whatever Apple product you buy. When you buy Lego you can rely on all the pieces fitting together.

The Linked Data brand makes similar promises of quality and consistency. When you consume Linked Data you know it will be RDF so your tools will work correctly. You also know that the data will be using HTTP URIs to refer to real-world things so you can determine what the data is about. You can trust that you’re not suddenly going to be given some XML in a proprietary schema or CSV with text headings you will have to guess the meaning of.

The Semantic Web community has been notorious for its poor marketing over the past decade. Now, just when it seems the community has found the right balance between technology and mass appeal, it feels like people are trying to rip away that success for their own purposes. That is deliberately emotive language because brands are all about emotion.

I don’t want to see the Linked Data brand weakened because it destroys trust. That’s why I pushed back on Twitter. As all involved know I am a huge advocate of making more data available on the web for reuse. It makes me glad whenever I see people invest their time in publishing data in any format, but my heart sings when I see more Linked Data.

There are many situations where other approaches are better than Linked Data: for example, I would rather have a MIDI file than the RDF version. In many circumstances I would be glad of a spreadsheet – simple and convenient.

But we should not confuse these forms of data publishing with Linked Data. That would sow confusion and be counterproductive. The coming web of data will be a rich and varied space full of content and data in every format imaginable. A large part of that we will call Linked Data and when you encounter it you will be justified in expecting RDF and HTTP.

I welcome anyone who wants to share data on the web in any way. But play fair and use the Linked Data brand only when what you publish follows the Linked Data rules.

11 Responses to “The Linked Data Brand”

  1. Michael Hausenblas on 21 Jul 2009 at 3:37 pm

    +1

    A new and good argument AFAICT.

    Cheers,
    Michael

  2. Paul Miller on 21 Jul 2009 at 4:42 pm

    Hi Ian – aren’t you meant to be relaxing on a beach?

    I agree that ‘Linked Data’ is a brand; I just think it’s a broader brand than you appear to. I have no problem with ‘Linked Data done with RDF’ or whatever, but plenty of problem with constraining a perfectly useful term unnecessarily by loading it with additional baggage/meaning.

    As Ed Summers pointed out a few days ago, the phrase in Tim’s document that got me so annoyed in the first place would appear to be a recent addition; http://twitter.com/edsu/status/2740552720

    Whoever ends up being more ‘right,’ the scope and breadth of this conversation would suggest that the definition of Linked Data is an awful lot less ‘accepted’ than you might have thought. By working through the issues here and elsewhere, maybe we’ll arrive at something that people actually do accept, understand, and embrace. And yes, that might be a definition closer to your understanding than mine. Just because I’m concerned doesn’t mean I’m right.

    To suggest, as you do, that “The Linked Data brand makes similar promises of quality and consistency” strikes me as potentially misleading, too… If we (for a moment) accept your premise that Linked Data does require RDF… that doesn’t instantly open up a wonderful world of quality, consistent, interoperating data. How could it?

    Encouraging the meaningful use of RDF is a good thing to do, and (as I keep saying) it is probably the ‘best’ way to do Linked Data today. Simply mandating RDF into the definition by fiat gives no one anything… except the false premise that all will suddenly be well if you’re ‘in’ and a sudden loss of the ground beneath your well-meaning feet if you’re not.

  3. [...] then I read Andy’s post, in which he links to various people including Ian Davis in the Linked Data Brand. Right up front, Ian [...]

  4. Ian Davis on 21 Jul 2009 at 7:13 pm

    Paul, even though that phrase has been added to the definition (it doesn’t actually mandate RDF btw) you can see even in the earliest version the entire document talks only about RDF as the data carrier (alongside HTML).

    I’d be more inclined to your view if there were a large array of big non-RDF datasets that were interlinked. But there aren’t as far as I can see. The momentum of 3 years and hence the trusted brand is with RDF.

    There may be non-RDF datasets out there, but do they crosslink the data for particular concepts? How do I disambiguate Birmingham between OpenStreetMap, Vision of Britain, the Getty thesaurus of place names, TIGER/Line, EDGAR filings and Wikitravel? Where is the mechanism and the community doing that?

  5. Eric Hellman on 21 Jul 2009 at 7:49 pm

    People will steal the generic term “linked data” as soon as it becomes useful to do so and before they’ve bothered to find out what a bunch of technologists think it should mean. So let’s get over it already and plan out our position. We need a specific, unambiguous riposte at the ready so we can say “you may call it ‘linked data’ but it’s not ______ Linked Data”. Possible things to put in the blank: “4-rules”, “Berners-Lee”, “RDF”, “semantic web”, “w3c”.

  6. William Mougayar on 21 Jul 2009 at 10:11 pm

    Re: “The Semantic Web community has been notorious for its poor marketing over the past decade.”
    In my opinion, the kind of marketing this community needs is not around splitting implementation definitions for Linked Data, but rather on showcasing more real examples of it working, whether it’s with or without RDF.
    Echoing what Paul Miller opined, the overriding element is LINKED DATA, as it’s the lever that allows you to connect both ways. Even Tim Berners-Lee said as recently as 2 weeks ago:
    “There are other cases where the easiest thing for somebody to do is to just put data up in whatever form it’s available. Comma separated values (CSV) files are remarkably popular. They’re exported sometimes from spreadsheets. It’s remarkable how much information is in spreadsheets. Or sometimes pulled out of a database and then put up on the web. It’s not as good, not as useful to the community, as if Linked Data had been put up there and linked. But the first step of actually putting the data out there is the one that nobody else can do.”
    Source: http://www.readwriteweb.com/archives/interview_with_tim_berners-lee_part_1.php
    So, I see this as an evolutionary approach. If a CSV is an interim approach and it’s provided by SPARQL, then so what? I’m more interested in seeing what it’s actually doing, what problem it’s solving, what benefits it is realizing, and once you have that initial success, you can go back and take the more expansive route of converting to RDF, as it’s the ideal.
    Unfortunately, I’m seeing lots of attempts to boil the ocean from Day 1, but that’s not feasible. We need to showcase more modest realizations of Linked Data within confined spaces that make a strong case for further expansion.

  7. Kingsley Idehen on 22 Jul 2009 at 5:24 am

    Ian,

    The problem is that you are talking about the virtues of an EAV/CR model that carries the label “RDF” and is centered on HTTP URIs for record identifiers, record attribute identifiers, and record identifiers or literals (typed or untyped) as record attribute values. Paul, on the other hand, is primarily talking about the data format RDF (typically RDF/XML). Of course, there is a broader audience that takes the “Linked Data” phrase literally without ever comprehending how HTTP URIs facilitate a more granular form of linking that is maybe best described as “hyperdata linking” within the broader context of hyperlinking or hypermedia.

    If we don’t find accurate (not simpler) language for describing the essence of the Linked Data meme, the general confusion will persist. And of course the more traditional and predictable marketing types will attempt to usurp and redefine the brand; that’s just a natural stage of commercial evolution.

    We continue to pay unnecessary messaging costs by straying away from terms like “data objects” and metadata. You cannot describe the “Linked Data” meme succinctly to broad audiences (e.g. distributed objects, DBMS aficionados, and most pre-web developers) without the aforementioned terms.

    Remember, all innovation goes through 3 critical stages:

    1. Madness or Heresy — nobody believes you and your sanity is questioned at every turn

    2. Simple, Obvious, and Mine — people start to get it, and this is when marketeers start to redefine and claim a meme for strategic reasons (*we are right here re. Linked Data meme*)

    3. Religion — the broader Web is already here, and we are literally minutes away from the same thing re. the “Linked Data” adjunct to the existing Web (we just need to get articulation of the meme’s essence right).

    Kingsley

  8. Adam Cooper on 22 Jul 2009 at 2:00 pm

    I have a lot of sympathy for your position, Ian, although I think we have to recognise the inevitable linked-data-the-social-construct, if a little begrudgingly. I’m with Eric’s “plan out…”. Evangelists should try to be clear and consistent, though, and not play into the hands of the marketeers if we want an easier ride to something that works rather than just a convenient label for a fashion.

    I am reminded of the Creative Commons trademark policy (http://creativecommons.org/policies). If a trusted brand is sought, it probably needs this level of protection.

  9. Mike Amundsen on 22 Jul 2009 at 3:34 pm

    My problem with this issue is that “linked data” is a common phrase. It may have seemed a good idea to adopt it as a brand, but doing so puts you in the virtual position of standing on your porch and yelling at Web kids to “get off your lawn” every time someone uses it in a context not quite to your liking.

    I agree w/ Eric Hellman that a better approach would be to modify the phrase to tie it more tightly and specifically to the features that make “Linked Data” unique and valuable. That comments here still want to debate those features leads me to think the branding work is not yet complete.

  10. Greg Boutin on 23 Jul 2009 at 1:23 am

    My comment, posted here due to length and for branding reasons ;)

    http://www.semanticsincorporated.com/2009/07/if-linked-data-is-a-brand-it-has-big-problems-to-address.html

  11. [...] articles through blogs lately regarding Linked Data and the Semantic Web. Ross Bates, Paul Miller, Ian Davis, and Semantics Incorporated have all explored the ideas of Linked Data and Web 3.0/Semantic Web. [...]