LiSA LightWeight Syndication API

Version: 1
Author: Ian Davis (nospam@iandavis.com)
Status: Draft, for review by community

Introduction and Motivation

LiSA is an attempt to abstract away the details of the various syndication formats, such as RSS, that now proliferate on the web. Its premise is that there is a core model used by all the formats, namely a channel that contains a number of items, each with a title, link and description.

Even though the various formats share a common core, their implementations differ enough to make developers think twice about supporting each version. The LiSA system is designed to normalise each syndication format to a simple stream of event notifications. The LiSA programmer need only write code to handle those events; the parser does the hard work of detecting each format and raising the appropriate events. There are events notifying the start and end of each document, channel and item. There are also events for additional metadata such as language or publication date.

This specification is a draft, intended for comment and review by the syndication community.

LiSA was inspired by, and owes thanks to, SAX.

API Specification

Interfaces

All LiSA interfaces are assumed to be synchronous: the parse methods must not return until parsing is complete, and readers must wait for an event-handler callback to return before reporting the next event.

SyndicationReader - responsible for the parsing of documents

ContentHandler - implemented by the client, responsible for receiving events about the syndicated content

ErrorHandler - implemented by the client, responsible for receiving events about parse errors

SyndicationReader Interface

Parsing

This specification defines only one method for parsing documents. Individual implementations may define other parse methods, such as parsing from a file or URL. However, all implementations MUST support the following method.

void parse(string xml) - start parsing the document contained within the string argument

Handlers

ContentHandler getContentHandler() - gets the currently registered content handler object

void setContentHandler(ContentHandler handler) - sets the content handler that the parser should use

ErrorHandler getErrorHandler() - gets the currently registered error handler object

void setErrorHandler(ErrorHandler handler) - sets the error handler that the parser should use
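As a rough illustration of the parsing and handler methods above, here is a minimal Python sketch. The class names SyndicationReader and NullReader, and the choice of Python itself, are assumptions made for this example; the specification does not prescribe a language binding. NullReader is a placeholder that reports an empty document so that the handler plumbing can be exercised.

```python
from abc import ABC, abstractmethod


class SyndicationReader(ABC):
    """Hypothetical Python rendering of the SyndicationReader interface."""

    def __init__(self):
        self._content_handler = None
        self._error_handler = None

    @abstractmethod
    def parse(self, xml):
        """Parse the document held in the string `xml`.

        Must not return until parsing is complete (interfaces are synchronous).
        """

    def getContentHandler(self):
        return self._content_handler

    def setContentHandler(self, handler):
        self._content_handler = handler

    def getErrorHandler(self):
        return self._error_handler

    def setErrorHandler(self, handler):
        self._error_handler = handler


class NullReader(SyndicationReader):
    """Placeholder parser: reports an empty document and nothing else."""

    def parse(self, xml):
        handler = self.getContentHandler()
        if handler is not None:
            # Both arguments may be empty according to the specification.
            handler.startDocument("", "")
            handler.endDocument()
```

A real parser would replace NullReader.parse with format detection and event generation; the accessor methods would stay the same.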

Extensibility

Extensibility is achieved via two mechanisms: Features and Properties. Features are settings that alter the behaviour of the parser. They are simple boolean values that enable the feature to be switched on and off. Properties are parser settings that provide information to the parser client. Property names are URLs of documents about the property.

boolean getFeature(string featureName) - retrieves the current setting of a feature

void setFeature(string featureName, boolean featureValue) - sets the specified feature on or off

object getProperty(string propertyName) - retrieves the value of the specified property

void setProperty(string propertyName, object propertyValue) - sets the value of the specified property

There are no predefined features in this version of the specification. Two properties are defined in this specification, both of which MUST be supported by conforming parsers:

http://blog.iandavis.com/lisa/properties/lisaVersion - The version of the LiSA specification supported by this parser

http://blog.iandavis.com/lisa/properties/parserVersion - The version of the parser
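The feature and property methods could be backed by two dictionaries, as in the sketch below. The class name ReaderSettings, the parser version string "0.1-example", and the decision to raise KeyError for unknown names are all assumptions of this example; the specification does not say how unknown names should be handled. The value "1" for lisaVersion follows the version number stated at the top of this document.

```python
LISA_VERSION = "http://blog.iandavis.com/lisa/properties/lisaVersion"
PARSER_VERSION = "http://blog.iandavis.com/lisa/properties/parserVersion"


class ReaderSettings:
    """Hypothetical storage for a parser's features and properties."""

    def __init__(self, features=None):
        # Version 1 of the specification predefines no features, so any
        # entries passed in here are implementation-specific.
        self._features = dict(features or {})
        # The two properties every conforming parser must support.
        self._properties = {LISA_VERSION: "1", PARSER_VERSION: "0.1-example"}

    def getFeature(self, featureName):
        # Assumption: an unknown feature name raises KeyError.
        return self._features[featureName]

    def setFeature(self, featureName, featureValue):
        if featureName not in self._features:
            raise KeyError(featureName)
        self._features[featureName] = bool(featureValue)

    def getProperty(self, propertyName):
        # Assumption: an unknown property name raises KeyError.
        return self._properties[propertyName]

    def setProperty(self, propertyName, propertyValue):
        self._properties[propertyName] = propertyValue
```

A client could then check compatibility with something like settings.getProperty(LISA_VERSION) before registering its handlers.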

ContentHandler Interface

Events

void startDocument(string format, string version) - Receive notification of the start of a document. Format is set to be the case-insensitive common name for the syndication format, such as rss or newsml. The version is the commonly accepted version number for the format, such as "0.91" or "1.0". Both the format and version arguments may be empty. If an ErrorHandler is registered the parser MAY raise a warning before this event if either argument is empty.

void endDocument() - Receive notification of the end of a document. The parser MUST NOT raise any further events after this event.

void startChannel(string title, url link, string description) - Receive notification of the start of a channel. The parser will invoke this event once it has parsed the channel's title, description and link even if these are encountered after any extension properties. The parser MUST buffer all metadataValue events until after it has raised the startChannel event. If an ErrorHandler is registered the parser MUST raise an error event before the startChannel event if the title argument is empty. The parser MUST raise an error event before the startChannel event if the link argument is empty.

void endChannel() - Receive notification of the end of a channel. The parser MUST have raised all relevant metadataValue events for the channel before raising the endChannel event.

void startItem(string title, url link, string description) - Receive notification of the start of a channel item. The parser will invoke this event once it has parsed the item's title, description and link even if these are encountered after any extension properties. The parser MUST buffer all metadataValue events until after it has raised the startItem event. If an ErrorHandler is registered the parser MUST raise a warning event before the startItem event if the title argument is empty. The parser MUST raise a warning event before the startItem event if the link argument is empty. It is not an error for the description argument to be empty, but the parser MAY raise a warning event in that case, which MUST be raised before the startItem event.

void endItem() - Receive notification of the end of a channel item. The parser MUST have raised all relevant metadataValue events for the item before raising the endItem event.

void metadataValue(string namespaceUri, string localName, string qName, string propertyValue) - Receive notification of a metadata property for a channel or item. The parser MUST raise this event for any elements or attributes applying to the channel or item that contain only character data. The parser MUST also raise this event for any attributes of metadata group elements. The parser MUST NOT raise this event for elements containing mixed content, child elements or attributes. This event allows consumers to access simple extended properties of channels and items such as publisher names, dates or categories.

void startMetadataGroup(string namespaceUri, string localName, string qName) - Receive notification of a group of related metadata properties for a channel or item. The parser MUST raise this event for any elements applying to the channel or item that contain attributes or child elements. The parser MUST NOT raise this event for elements containing mixed content or character data.

void endMetadataGroup(string namespaceUri, string localName, string qName) - Receive notification of the end of a group of related metadata properties for a channel or item
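To show what client code written against these events might look like, here is a minimal sketch of a content handler that collects channels and items into plain dictionaries. The class name FeedCollector and the dictionary layout are assumptions of this example. Metadata group boundaries are ignored, so values reported inside a group are stored flat alongside the other metadata of the enclosing channel or item.

```python
class FeedCollector:
    """Hypothetical ContentHandler that builds a simple in-memory feed model."""

    def __init__(self):
        self.format = ""
        self.version = ""
        self.channels = []
        self._channel = None  # channel currently being reported
        self._current = None  # channel or item currently receiving metadata

    def startDocument(self, format, version):
        self.format, self.version = format, version

    def endDocument(self):
        pass

    def startChannel(self, title, link, description):
        self._channel = {"title": title, "link": link,
                         "description": description,
                         "metadata": [], "items": []}
        self.channels.append(self._channel)
        self._current = self._channel

    def endChannel(self):
        self._channel = self._current = None

    def startItem(self, title, link, description):
        item = {"title": title, "link": link,
                "description": description, "metadata": []}
        self._channel["items"].append(item)
        self._current = item

    def endItem(self):
        # Later metadata belongs to the channel again.
        self._current = self._channel

    def metadataValue(self, namespaceUri, localName, qName, propertyValue):
        self._current["metadata"].append((namespaceUri, localName, propertyValue))

    def startMetadataGroup(self, namespaceUri, localName, qName):
        pass  # group boundaries are ignored in this sketch

    def endMetadataGroup(self, namespaceUri, localName, qName):
        pass
```

Because the parser buffers metadataValue events until startChannel or startItem has been raised, the handler never has to cope with metadata arriving before it knows which channel or item it belongs to.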

Event Ordering

A conforming parser MUST ensure that events are generated in the following order:

startDocument - once per document

startChannel - once per channel, possibly multiple times per document

metadataValue - zero or more times for each channel

startMetadataGroup - zero or more times for each group of metadata elements

metadataValue - zero or more times for each simple element or attribute within the metadata group

endMetadataGroup - zero or more times for each group of metadata elements

startItem - zero or more times

metadataValue - zero or more times for each simple element (no attributes, contain only character data)

startMetadataGroup - zero or more times for each group of metadata elements

metadataValue - zero or more times for each simple element or attribute within the metadata group

endMetadataGroup - zero or more times for each group of metadata elements

endItem - zero or more times

endChannel - once per channel, possibly multiple times per document

endDocument - once per document
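The nesting implied by this ordering can be checked mechanically. The sketch below is one possible test helper, not part of the specification: the function name check_order is an assumption, and it only verifies that events are properly nested (document, then channel, then item, with metadata groups directly inside a channel or item). It does not check stricter readings of the list above, such as all channel-level metadata appearing before the first item.

```python
# Which enclosing scope each event may appear in.
PARENTS = {
    "startChannel": {"document"},
    "startItem": {"channel"},
    "startMetadataGroup": {"channel", "item"},
    "metadataValue": {"channel", "item", "group"},
}
# Scope opened or closed by startX / endX events.
SCOPES = {"Document": "document", "Channel": "channel",
          "Item": "item", "MetadataGroup": "group"}


def check_order(event_names):
    """Return True if the sequence of event names is properly nested."""
    stack = []
    finished = False
    for name in event_names:
        if finished:
            return False  # nothing may follow endDocument
        if name == "startDocument":
            if stack:
                return False
            stack.append("document")
        elif name == "metadataValue":
            if not stack or stack[-1] not in PARENTS[name]:
                return False
        elif name.startswith("start"):
            if not stack or stack[-1] not in PARENTS.get(name, ()):
                return False
            stack.append(SCOPES[name[len("start"):]])
        elif name.startswith("end"):
            scope = SCOPES.get(name[len("end"):])
            if not stack or stack[-1] != scope:
                return False
            stack.pop()
            finished = scope == "document"
        else:
            return False  # unknown event name
    return finished
```

A parser's test suite could record the names of the events it raises for a sample feed and assert that check_order accepts them.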

ErrorHandler Interface

void error(exception e) - Receive notification of a recoverable error

void fatalError(exception e) - Receive notification of a non-recoverable error

void warning(exception e) - Receive notification of a warning
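A client-side error handler might simply record recoverable problems and abort on fatal ones. The sketch below assumes Python exceptions stand in for the "exception" type; the class names LisaParseError and CollectingErrorHandler are hypothetical, and raising from fatalError is a design choice of this example rather than a requirement of the specification.

```python
class LisaParseError(Exception):
    """Hypothetical exception type passed to the error handler."""


class CollectingErrorHandler:
    """Records warnings and recoverable errors; aborts on fatal errors."""

    def __init__(self):
        self.warnings = []
        self.errors = []

    def warning(self, e):
        self.warnings.append(e)

    def error(self, e):
        self.errors.append(e)

    def fatalError(self, e):
        # Assumption: a non-recoverable error stops parsing by propagating
        # the exception to whoever called parse().
        raise e
```

With a handler like this registered, the "No title" and "No link" warnings in the RSS 2.0 example below would end up in the warnings list in the order they were raised.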

Examples

RSS 0.91 Example

<?xml version="1.0" encoding="iso-8859-1"?> <!-- generator="Movable Type/2.21" -->
<rss version="0.91">
  <channel>
    <title>Internet Alchemy</title>
    <link>http://blog.iandavis.com/</link>
    <description>About Internet Alchemy</description>
    <language>en</language>

    <item>
      <title>Swisscom To Launch WiFi Network</title>
      <description>It looks like SwissCom are rolling out public access WiFi across
      Switzerland later this year. I wonder what kind of...</description>
      <link>http://blog.iandavis.com/2002/10/swisscomToLaunchWiFiNetwork.html</link>
    </item>

    <item>
      <title>Practical RDF Book Preview</title>
      <description>Shelley Powers is planning to offer a preview of her new RDF book
      online for technical review by the community....</description>
      <link>http://blog.iandavis.com/2002/10/practicalRDFBookPreview.html</link>
    </item>
  </channel>
</rss>

Expected events:

  1. startDocument("rss", "0.91")
  2. startChannel("Internet Alchemy", "http://blog.iandavis.com/", "About Internet Alchemy")
  3. metadataValue("", "language", "language", "en")
  4. startItem("Swisscom To Launch WiFi Network", "http://blog.iandavis.com/2002/10/swisscomToLaunchWiFiNetwork.html", "It looks like SwissCom are rolling out public access WiFi across Switzerland later this year. I wonder what kind of...")
  5. endItem()
  6. startItem("Practical RDF Book Preview", "http://blog.iandavis.com/2002/10/practicalRDFBookPreview.html", "Shelley Powers is planning to offer a preview of her new RDF book online for technical review by the community....")
  7. endItem()
  8. endChannel()
  9. endDocument()
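The event sequence above could be produced by a parser along the following lines. This is a minimal sketch, not a conforming implementation: it uses Python's standard xml.etree.ElementTree module, assumes the input is un-namespaced RSS 0.9x or 2.0, skips namespaced elements and metadata groups, and raises no warnings or errors. The function names parse_rss, core_values, emit_metadata and clean are hypothetical. Collapsing whitespace in text values is also an assumption, made because the expected events show the line-wrapped descriptions on a single line.

```python
import xml.etree.ElementTree as ET

CORE = ("title", "link", "description")  # argument order of startChannel/startItem


def clean(text):
    """Collapse runs of whitespace; None becomes the empty string."""
    return " ".join((text or "").split())


def core_values(element):
    """Return [title, link, description] for a channel or item ("" if absent)."""
    return [clean(element.findtext(tag)) for tag in CORE]


def emit_metadata(element, handler):
    """Raise metadataValue for simple, un-namespaced extension elements."""
    for child in element:
        if child.tag in CORE or child.tag == "item" or child.tag.startswith("{"):
            continue
        if len(child) == 0 and not child.attrib:  # character data only
            handler.metadataValue("", child.tag, child.tag, clean(child.text))


def parse_rss(xml, handler):
    """Walk an RSS document and report it to `handler` as LiSA events."""
    root = ET.fromstring(xml)
    handler.startDocument("rss", root.get("version", ""))
    for channel in root.findall("channel"):
        # The whole channel is already in memory, so title, link and
        # description are available before any metadata is reported; this
        # plays the role of the buffering the specification requires.
        handler.startChannel(*core_values(channel))
        emit_metadata(channel, handler)
        for item in channel.findall("item"):
            handler.startItem(*core_values(item))
            emit_metadata(item, handler)
            handler.endItem()
        handler.endChannel()
    handler.endDocument()
```

Building a tree first trades memory for simplicity. A streaming parser would instead have to queue metadataValue events itself until it has seen the title, link and description.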

RSS 1.0 Example

<?xml version="1.0" encoding="iso-8859-1"?>

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:admin="http://webns.net/mvcb/"
xmlns="http://purl.org/rss/1.0/">

<channel rdf:about="http://blog.iandavis.com/">
<title>Internet Alchemy</title>
<link>http://blog.iandavis.com/</link>
<description>About Internet Alchemy</description>
<dc:language>en</dc:language>
<admin:generatorAgent rdf:resource="http://www.movabletype.org/?v=2.21" />

<items>
<rdf:Seq>
<rdf:li rdf:resource="http://blog.iandavis.com/2002/10/swisscomToLaunchWiFiNetwork.html" />
<rdf:li rdf:resource="http://blog.iandavis.com/2002/10/practicalRDFBookPreview.html" />
</rdf:Seq>
</items>

</channel>


<item rdf:about="http://blog.iandavis.com/2002/10/swisscomToLaunchWiFiNetwork.html">
<title>Swisscom To Launch WiFi Network</title>
<description>It looks like SwissCom are rolling out public access WiFi across Switzerland
later this year. I wonder what kind of...</description>
<link>http://blog.iandavis.com/2002/10/swisscomToLaunchWiFiNetwork.html</link>
<dc:subject>Mobile Computing</dc:subject>
<dc:creator>iand</dc:creator>
<dc:date>2002-10-16T20:13:47+00:00</dc:date>
</item>

<item rdf:about="http://blog.iandavis.com/2002/10/practicalRDFBookPreview.html">
<title>Practical RDF Book Preview</title>
<description>Shelley Powers is planning to offer a preview of her new RDF book
online for technical review by the community....</description>
<link>http://blog.iandavis.com/2002/10/practicalRDFBookPreview.html</link>
<dc:subject>RDF</dc:subject>
<dc:creator>iand</dc:creator>
<dc:date>2002-10-14T16:29:59+00:00</dc:date>
</item>


</rdf:RDF>

Expected events:

  1. startDocument("rss", "1.0")
  2. startChannel("Internet Alchemy", "http://blog.iandavis.com/", "About Internet Alchemy")
  3. metadataValue("http://purl.org/dc/elements/1.1/", "language", "dc:language", "en")
  4. startMetadataGroup("http://webns.net/mvcb/", "generatorAgent", "admin:generatorAgent")
  5. metadataValue("http://www.w3.org/1999/02/22-rdf-syntax-ns#", "resource", "rdf:resource", "http://www.movabletype.org/?v=2.21")
  6. endMetadataGroup("http://webns.net/mvcb/", "generatorAgent", "admin:generatorAgent")
  7. startItem("Swisscom To Launch WiFi Network", "http://blog.iandavis.com/2002/10/swisscomToLaunchWiFiNetwork.html", "It looks like SwissCom are rolling out public access WiFi across Switzerland later this year. I wonder what kind of...")
  8. metadataValue("http://purl.org/dc/elements/1.1/", "subject", "dc:subject", "Mobile Computing")
  9. metadataValue("http://purl.org/dc/elements/1.1/", "creator", "dc:creator", "iand")
  10. metadataValue("http://purl.org/dc/elements/1.1/", "date", "dc:date", "2002-10-16T20:13:47+00:00")
  11. endItem()
  12. startItem("Practical RDF Book Preview", "http://blog.iandavis.com/2002/10/practicalRDFBookPreview.html", "Shelley Powers is planning to offer a preview of her new RDF book online for technical review by the community....")
  13. metadataValue("http://purl.org/dc/elements/1.1/", "subject", "dc:subject", "RDF")
  14. metadataValue("http://purl.org/dc/elements/1.1/", "creator", "dc:creator", "iand")
  15. metadataValue("http://purl.org/dc/elements/1.1/", "date", "dc:date", "2002-10-14T16:29:59+00:00")
  16. endItem()
  17. endChannel()
  18. endDocument()
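The namespaceUri, localName and qName arguments in these events follow from the namespace declarations on the root element. As a small illustration, the helper below derives the three values from a qualified name and a prefix map. The function name split_qname and the dictionary representation of the declarations are assumptions of this sketch. It treats un-prefixed names as belonging to the default namespace if one is declared, which is right for elements but not for attributes, since un-prefixed attributes are in no namespace.

```python
def split_qname(qName, namespaces):
    """Return (namespaceUri, localName, qName) for an element name.

    `namespaces` maps prefixes to namespace URIs; the default namespace,
    if any, is stored under the empty-string key.
    """
    prefix, sep, local = qName.partition(":")
    if not sep:
        # No prefix: use the default namespace, or "" if none is declared.
        return namespaces.get("", ""), qName, qName
    return namespaces[prefix], local, qName


# Prefix map matching the declarations in the RSS 1.0 example above.
RSS10_NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "sy": "http://purl.org/rss/1.0/modules/syndication/",
    "admin": "http://webns.net/mvcb/",
    "": "http://purl.org/rss/1.0/",
}
```

For instance, split_qname("dc:language", RSS10_NAMESPACES) yields the first three arguments of event 3 above.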

RSS 2.0 Example

<?xml version="1.0"?>
<!-- RSS generated by Radio UserLand v8.0.5 on 10/18/2002; 4:00:01 AM Pacific -->
<rss version="2.0" xmlns:blogChannel="http://backend.userland.com/blogChannelModule">
        <channel>
                <title>Scripting News</title>
                <link>http://www.scripting.com/</link>
                <description>A weblog about scripting and stuff like that.</description>
                <language>en-us</language>
                <blogChannel:blogRoll>http://radio.weblogs.com/0001015/userland/scriptingNewsLeftLinks.opml</blogChannel:blogRoll>
                <blogChannel:mySubscriptions>http://radio.weblogs.com/0001015/gems/mySubscriptions.opml</blogChannel:mySubscriptions>
                <blogChannel:blink>http://diveintomark.org/</blogChannel:blink>
                <copyright>Copyright 1997-2002 Dave Winer</copyright>
                <lastBuildDate>Fri, 18 Oct 2002 11:00:01 GMT</lastBuildDate>
                <docs>http://backend.userland.com/rss</docs>
                <generator>Radio UserLand v8.0.5</generator>
                <category domain="Syndic8">1765</category>
                <managingEditor>dave@userland.com</managingEditor>
                <webMaster>dave@userland.com</webMaster>
                <ttl>15</ttl>
                <item>
                        <description>&lt;a href=&quot;http://philringnalda.com/archives/002356.php
                        &quot;&gt;Phil Ringnalda sees&lt;/a&gt; the need for another RSS element
                        that says &quot;this channel is finished, please unsubscribe now.&quot; His example
                        isYahoo's beta test of financial feeds, which is now over. Another example are the feeds
                        for discussion threads. They wind down quickly, but everyone stays subscribed unless
                        they go 404. </description>
                        <pubDate>Fri, 18 Oct 2002 10:42:38 GMT</pubDate>
                        <guid>http://scriptingnews.userland.com/backissues/2002/10/18#When:3:42:38AM</guid>
                        </item>
                <item>
                        <description>&lt;a href=&quot;http://news.bbc.co.uk/1/hi/business/2338325.stm
                        &quot;&gt;BBC&lt;/a&gt;: &quot;Microsoft has reported far higher than expected
                        profits, helping to restore investor confidence in the battered technology sector.&quot;</description>
                        <pubDate>Fri, 18 Oct 2002 10:13:15 GMT</pubDate>
                        <guid>http://scriptingnews.userland.com/backissues/2002/10/18#When:3:13:15AM</guid>
                        </item>
                </channel>
        </rss>

Expected events:

  1. startDocument("rss", "2.0")
  2. startChannel("Scripting News", "http://www.scripting.com/", "A weblog about scripting and stuff like that.")
  3. metadataValue("", "language", "language", "en-us")
  4. metadataValue("http://backend.userland.com/blogChannelModule", "blogRoll", "blogChannel:blogRoll", "http://radio.weblogs.com/0001015/userland/scriptingNewsLeftLinks.opml")
  5. metadataValue("http://backend.userland.com/blogChannelModule", "mySubscriptions", "blogChannel:mySubscriptions", "http://radio.weblogs.com/0001015/gems/mySubscriptions.opml")
  6. metadataValue("http://backend.userland.com/blogChannelModule", "blink", "blogChannel:blink", "http://diveintomark.org/")
  7. metadataValue("", "copyright", "copyright", "Copyright 1997-2002 Dave Winer")
  8. metadataValue("", "lastBuildDate", "lastBuildDate", "Fri, 18 Oct 2002 11:00:01 GMT")
  9. metadataValue("", "docs", "docs", "http://backend.userland.com/rss")
  10. metadataValue("", "generator", "generator", "Radio UserLand v8.0.5")
  11. metadataValue("", "managingEditor", "managingEditor", "dave@userland.com")
  12. metadataValue("", "webMaster", "webMaster", "dave@userland.com")
  13. metadataValue("", "ttl", "ttl", "15")
  14. startMetadataGroup("", "category", "category")
  15. metadataValue("", "domain", "domain", "Syndic8")
  16. metadataValue("", "category", "category", "1765")
  17. endMetadataGroup("", "category", "category")
  18. warning("No title")
  19. warning("No link")
  20. startItem("", "", "<a href="http://philringnalda.com/archives/002356.php">Phil Ringnalda sees</a> the need for another RSS element that says "this channel is finished, please unsubscribe now." His example isYahoo's beta test of financial feeds, which is now over. Another example are the feeds for discussion threads. They wind down quickly, but everyone stays subscribed unless they go 404.")
  21. metadataValue("", "pubDate", "pubDate", "Fri, 18 Oct 2002 10:42:38 GMT")
  22. metadataValue("", "guid", "guid", "http://scriptingnews.userland.com/backissues/2002/10/18#When:3:42:38AM")
  23. endItem()
  24. warning("No title")
  25. warning("No link")
  26. startItem("", "", "<a href="http://news.bbc.co.uk/1/hi/business/2338325.stm">BBC</a>: "Microsoft has reported far higher than expected profits, helping to restore investor confidence in the battered technology sector."")
  27. metadataValue("", "pubDate", "pubDate", "Fri, 18 Oct 2002 10:13:15 GMT")
  28. metadataValue("", "guid", "guid", "http://scriptingnews.userland.com/backissues/2002/10/18#When:3:13:15AM")
  29. endItem()
  30. endChannel()
  31. endDocument()

Permalink: http://blog.iandavis.com/2002/10/lisa-lightweight-syndication-api/
