Dec 15 2006

Set the Controls for the Heart of the Web

Published by Ian Davis under Uncategorized

Jim Hendler continues his exploration of the dark side of the semantic web with a must-read editorial for IEEE Intelligent Systems, the well-respected AI journal:

A key realization that Berners-Lee had with respect to the design of RDF is having unique names for different terms, with a social convention for precisely differentiating them, could in and of itself be an important addition to the Web. If you and I decide that we will use the term “http://www.cs.rpi.edu/~hendler/elephant” to designate some particular entity, then it really doesn’t matter what the other blind men think it is, they won’t be confused when they use the natural language term “elephant” which is not even close, lexigraphically, to the longer term you and I are using. And if they choose to use their own URI, “http://www.other.blind.guys.org/elephant” it won’t get confused with ours.

Too right! I paraphrase this in my RDF tutorial as: Don’t use my names for your ideas, unless you mean to refer to the exact same thing. Then please do! This Wittgenstein avoidance technique was a stroke of genius: the person who creates the name gets to define it, and the definitions can be in terms of other names. Anyone wanting to talk about the same thing can check the definition; if they agree with it they should feel free to use that name, and if not they can just create their own. Because names can be defined in terms of one another, every new name can potentially be related to any other. This can even be done for private interpretations of concepts that one person considers close enough for their own purposes. If I want to say “cricket” and “chess” are both “games” then I can do so for my own purposes, even if other people consider one to be a “sport” and the other a “pastime”.
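
As a rough illustration of the cricket and chess example, here is a minimal sketch in plain Python. The two namespaces (example.org and example.com) and the class names are made up for the example; only the rdf:type and rdfs:subClassOf URIs are real. It shows two parties keeping their own names while one of them relates the other's names to their own.

```python
# Hypothetical namespaces: each party mints names under a URI prefix they control.
MY = "http://example.org/ian/terms#"
THEIR = "http://example.com/other/terms#"

# Real vocabulary terms from RDF and RDF Schema.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Statements are (subject, predicate, object) triples of names.
triples = {
    # My private view: both are games.
    (MY + "cricket", RDF_TYPE, MY + "Game"),
    (MY + "chess", RDF_TYPE, MY + "Game"),
    # Someone else's view, under their own names, so nothing clashes.
    (THEIR + "cricket", RDF_TYPE, THEIR + "Sport"),
    (THEIR + "chess", RDF_TYPE, THEIR + "Pastime"),
    # My own bridge: I define their names in terms of mine.
    (THEIR + "Sport", RDFS_SUBCLASS, MY + "Game"),
    (THEIR + "Pastime", RDFS_SUBCLASS, MY + "Game"),
}


def types_of(subject, statements):
    """Return the declared types of a subject plus any superclasses of them."""
    found = {o for s, p, o in statements if s == subject and p == RDF_TYPE}
    frontier = list(found)
    while frontier:
        cls = frontier.pop()
        for s, p, o in statements:
            if s == cls and p == RDFS_SUBCLASS and o not in found:
                found.add(o)
                frontier.append(o)
    return found


their_chess_types = types_of(THEIR + "chess", triples)
my_cricket_types = types_of(MY + "cricket", triples)
```

With the bridging statements in place, their chess counts as one of my games for my purposes, while their own data is left untouched.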

The second key innovation that TBL made, which Jim doesn’t refer to, is the selection of URI syntax for the names. Arguably this is the more important innovation, since without it the unique naming technique cannot work in practice. URIs have the important property of dereferenceability, which means that they can be used to fetch information about the thing the name represents. We see this every day when we type URIs starting with http:// into our web browsers. For names that someone has created to represent concepts like cricket, it’s useful to send the definition when the URI is accessed. That way the user can decide whether they want to use that name for their own conversations. The use of URIs for naming is what makes the Semantic Web the Semantic Web.
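
To make dereferencing concrete, here is a small sketch using only the Python standard library. The name URI is hypothetical, and asking for RDF/XML via the Accept header is an assumption about what a publisher might serve rather than anything the names themselves require.

```python
from urllib.parse import urldefrag
from urllib.request import Request, urlopen


def definition_request(name_uri):
    """Build an HTTP request for the document that defines a name."""
    # The fragment (#cricket) is never sent to the server, so we ask for
    # the document the name lives in.
    document_uri = urldefrag(name_uri).url
    # Assumption: the publisher can serve the definition as RDF/XML.
    return Request(document_uri, headers={"Accept": "application/rdf+xml"})


def fetch_definition(name_uri, timeout=10):
    """Dereference a name and return whatever the publisher sends back."""
    with urlopen(definition_request(name_uri), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical name; building the request does not touch the network.
request = definition_request("http://example.org/ian/terms#cricket")
```

Calling fetch_definition on a real name would perform the actual lookup, after which a reader (or a program) can decide whether the definition matches what they mean.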


Dec 14 2006

AI it aint

Published by Ian Davis under Uncategorized

It’s great to see someone like Jim Hendler get it:

…you use a small amount of Sem Web (think Foaf or Skos) to add a bit of organizational knowledge (and to webize with URIs) to tagging sites, microformats, and etc. It is the realization that the REST approach to the world is a wonderful way to use RDF and it is enpowered by the emerging standards of SPARQL, GRDDL, RDF/A and the like.

And a final flourish…

And to my AI buddies holed up in your Ivy covered towers, it’s true, I have sold out to the Dark Side — get over it!

I find all the deep OWL reasoning talk at conferences like ISWC very tiring. I mean, I find it fascinating, but it’s unrealistic to expect that it’s going to be useful on any kind of web scale any time in the next 20 years. What does work today, and is truly unique to the semantic web, is the universal data model that RDF provides and the decentralisation of that through URIs. Yes, you can build an AI system on top of that, but it was hard enough when the researchers had free rein over the data representation and execution context. I’m not sure why some think it’s going to be any easier on the Web.

What will make a difference is the volume of data available that could be accessible to the eventual AI applications. But I never got the impression that interpreting the data was the hard bit about AI.

RDF puts the web of data within our grasp and a light touch with some weak semantics will help organise this better for applications and humans to deal with. But, as I wrote a couple of days ago, the role of the human is inseparable from the web.

Don’t get me wrong, I’m a believer in strong AI and I expect a machine will one day be capable of independent thought, but I don’t think it will happen in my lifetime. Unless, that is, the Singularity occurs in the next 20 years and then I’ll have all the time in the world ;-)


Nov 09 2005

ISWC2005 Notes: Day 3

Published by Ian Davis under Uncategorized

It’s day three for me. The invited speaker this morning is Dr Alfred Spector, CTO of IBM Software. His presentation is entitled Semantic Acceleration or “The Practical Web”.

Starts off by admitting that he’s not an expert in the semantics domain but he has a very strong point of view. He always wished he was in AI when he was a systems person. Got opportunity to run IBM’s research division full of AI researchers which was great. He focussed on systems infrastructure to assist distribution of information for these researchers.

“Innovation is the intersection of invention and insight, leading to the creation of social and economic value” – US National Innovation Initiative

Still sees incredible opportunities to optimise all of society’s processes. Look at automobile technology and how it is better in every dimension compared with 20 years ago.

Information semantics will drive greatly increased value in virtually every domain. Graph of value levelling out without semantics compared with a higher plateau if semantics are introduced. How do we get metadata on every business artifact? Very difficult to do by hand – find ways to automatically generate it.

Structured information: semantics captured in schema. Unstructured information: semantics inherent in usage and context.

Where will the semantics come from? Some will be manually created; some web content generated from existing databases; however most web and enterprise data contains only latent structure, e.g. email. Manual markup hard, perhaps impossible, to scale.

Approach via text analytics – adding structure to unstructured information. However, the analytics world is fragmented and surprisingly unstructured. The right analysis will likely be a best-of-breed combination of many techniques.

IBM are pursuing a combination hypothesis: if intimately integrated, various KM technologies will provide higher quality results. Combine data mining, machine learning, IR, string and graph algorithms, text analysis and NLP, UI/human factors, privacy and security. Need tens or hundreds of thousands of text analysers working together. Problem is how you get them to cooperate – this is an architectural challenge.
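
One way to picture the cooperation problem is a shared document that every analyser reads from and writes annotations to. The sketch below is my own toy illustration of that idea in Python, not UIMA's API; the class names, the two analysers and the sample sentence are all invented.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    kind: str   # e.g. "token" or "name"
    start: int  # character offsets into the document text
    end: int


@dataclass
class Document:
    text: str
    annotations: list = field(default_factory=list)

    def covered(self, kind):
        """Return the text spans covered by annotations of one kind."""
        return [self.text[a.start:a.end] for a in self.annotations if a.kind == kind]


def tokenizer(doc):
    """First analyser: mark every run of word characters as a token."""
    for match in re.finditer(r"\w+", doc.text):
        doc.annotations.append(Annotation("token", match.start(), match.end()))


def capitalised_name_finder(doc):
    """Second analyser: builds on the tokens left behind by the first."""
    tokens = [a for a in doc.annotations if a.kind == "token"]
    for token in tokens:
        word = doc.text[token.start:token.end]
        # Crude heuristic: a capitalised word that does not open the text.
        if word[0].isupper() and token.start > 0:
            doc.annotations.append(Annotation("name", token.start, token.end))


def run_pipeline(text, analysers):
    """Run each analyser in order over one shared document."""
    doc = Document(text)
    for analyse in analysers:
        analyse(doc)
    return doc


doc = run_pipeline("The speaker works at IBM", [tokenizer, capitalised_name_finder])
names = doc.covered("name")
```

The analysers never call each other; they cooperate only through the annotations on the shared document, which is what lets independently written components be combined.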

UIMA is an implementation of this combination technique begun in 2001. Using text, video and speech analysis techniques with advanced concept/semantic search and knowledge representation and reasoning. Architecture informed by TIPSTER, Catalyst, Atlas, GATE, TAF, Talent and WebFountain.

UIMA software has Java and C++ framework implementations. Support for co-located and service-oriented deployments. It’s intended as an open, plug-n-play integrating framework to build an ecosystem of analysis and application developers. Will be open sourced soon. IBM making innovation grants available to catalyze efforts in this area.

IBM bought iPhrase, which was using the UIMA system, which seems an… errr… interesting strategy for creating an ecosystem!


Jul 21 2004

Yahoogroups Image Tests

Published by Ian Davis under Uncategorized

Is it just me or are the images that yahoogroups uses to verify you are a human just stupidly difficult? It’s taken me four attempts to join a group just now because I couldn’t make sense of the images despite my eyesight being 20/20. Are the spambots really at this level of sophistication? Given my recent experience I’d put them at the same pattern-recognising level as a child of 10. Just goes to show that all the innovation in AI is being done in the spam and porn industries.


May 08 2002

Mind and Machine

Published by Ian Davis under Uncategorized

Mind and Machine. A fantastic site about the inner workings of the brain with very nice visuals.


May 02 2002

EvoChess2

Published by Ian Davis under Uncategorized

EvoChess2 :

Every member of EvoChess2 can start an own evolution to breed a
variety of chess programs – called individuals. The best will survive
and produce offspring, thus inheriting the successful behavior encoded
in their genes. To speed up the global optimization progress,
established genetic code will be exchanged between EvoChess2 members.


Apr 16 2002

AnswerBus

Published by Ian Davis under Uncategorized

Another answerbot, except this one seems to be better than most at answering any of my questions.

what is the atomic weight of mercury?
The atomic number of mercury is 80; its atomic weight is 200.59.

who was the last pharoah of egypt?
Description: Information and Facts about Queen Cleopatra, last Pharoah of Egypt

what is the height of the statue of liberty
At the time the Statue of Liberty was dedicated, she was the tallest structure in New York, reaching to a total height of 305 feet.

Note: This entry was reconstructed from this page of testimonials since the original somehow got lost.


Dec 29 1999

Aglets Update

Published by Ian Davis under Uncategorized

Back in August I wrote about the Aglet mobile agent project that was under threat of closure by IBM. Todd Papaioannou has written to update us: the Aglet source will be released to the community early next year! This should make a major difference to the mobile agent landscape.


Nov 30 1999

Interview With AIBO Creator

Published by Ian Davis under Uncategorized

In this interview, Dr. Toshitada Doi, the creator of the Sony Robot Dog, describes his vision for the future of robotic pets:

Ten years from now, every household could have two or three entertainment robots like Aibo, says Doi, a number that would put the sale of these e-bots on the same scale as TVs. Dr. Doi believes that someday, people will buy as many entertainment robots as they do PCs.

Sony isn’t the only manufacturer exploring the potential of robots as household friends. Apparently Matsushita is developing a range of robotic fish! Not quite in the same league as a robot dog though.


Nov 30 1999

Aglets Under Threat

Published by Ian Davis under Uncategorized

The Aglet mobile agent project is under threat from IBM. Development on the project has been slowing down for several months and there are rumours that IBM intend to kill the project off. Rather than see the project die completely, Todd Papaioannou has started a petition to release the source of the Aglets framework to the community. Aglets was one of the first Java-based mobile agent technologies and has a lot of support from many developers. It suffered a setback when Danny Lange, one of the key developers, left to join a rival team. It would be good to see this project opened up since so many of its rivals are still proprietary, closed-source efforts.

