Greg Linden, whom Sam and I had the good fortune to meet at the Web 2.0 Summit this year, has posted a link to some excellent class notes on data mining. They cover lots of interesting topics in clustering and relevancy analysis, usefully condensed and summarised. I highly recommend Greg’s blog if you’re at all interested in large-scale search and recommendation systems.
-
Ian Davis: British; married with kids; technical architect; CTO of Talis; co-author of RSS 1.0; creator of FOAF icons; Semantic Web hacker.

My URI:
http://iandavis.com/id/me
Email Me:
nospam@iandavis.com
Twitter:
http://twitter.com/iand

I’m generally interested in the collaborative opportunities that the web gives us, beyond simple networking (i.e. MySpace). While I do feel that people are becoming more cautious about publishing their true opinions (since those opinions are never truly erased, future employers can look them up), it’s still at the core of the open nature of the internet that there be genuine reasons for people to reach out to each other. For example, microhistory is an area where the convergence of the web and the increased number of people online can produce a massive volume of historical data that was previously unavailable. This data could be mined at will for genealogical or even commercially viable information. The key is for the information to remain free and open for everyone to see. My example is http://www.storyofmyhome.com, which gathers personal histories of homes and neighborhoods from the individuals who lived there in the past. In the long run, sites like this will make it possible for people to discover connections that they never knew existed, and that might have disappeared were it not for the site.