Discussion Document: Revising the ORE Profile of Atom

Michael Nelson, Robert Sanderson, Herbert Van de Sompel

(August 1 2008)

Table of Contents

1. Summary

2. Motivation

3. Proposal

4. Examples

5. General Discussion

6. Atom entry/content and entry/link@rel="alternate"

7. Triples Discussion

8. Metadata Pertaining to the Resource Map

9. Proxies

1. Summary

This document describes a possible revision of the serialization of Resource Maps in Atom. The core characteristics of the revision are:

  1. Convey ORE semantics in Atom as add-ons/extensions to regular Atom Feeds by introducing explicit ORE relationships, rather than by assigning ORE-specific meaning to pre-defined Atom relationship values as the current 0.9 serialization does.
  2. Express an ORE Aggregation at the level of an Atom Entry, not an Atom Feed; there are no ORE-specific semantics at the Feed level.

 

We understand this proposal comes very late in the ORE process, which is expected to deliver a 1.0 specification by the end of September 2008. Your feedback on this proposal is absolutely crucial. Please use the ORE Google Group to share your insights.

2. Motivation

The described revision is motivated by the following observations:

 

2.1 Regarding 1.1, above:

 

Best practice in the Atom community regarding the use of Atom for specific applications is to define metadata extensions and new relationship types. For example, YouTube's GData feeds define their own relationship type:

<link rel="http://gdata.youtube.com/schemas/2007#video.related" type="application/atom+xml" href="http://gdata.youtube.com/feeds/api/videos/0PKDJrIMJFs/related"/>

The current ORE Profile of Atom has not taken this approach. Instead, it uses existing, generic Atom relationships to represent specific ORE relationships.

 

Also, the Liverpool/HP ORE experimentation project (foresite) reported problems in determining which information available for an Aggregation to map to native Atom elements, and which to map to embedded rdf:Description elements.  This is especially true when the Atom element can only occur once, yet the predicate in RDF can occur multiple times.

 

  As a result of the current approach, ORE Atom serializations are a special-purpose sub-class of Atom Feeds, and ORE semantics cannot be "tagged onto" Feeds that (also) serve a non-ORE purpose.

 

2.2 Regarding 1.2, above:

 

The functionality of the Atom Publishing Protocol (http://www.rfc-editor.org/rfc/rfc5023.txt), deemed of essential importance for leveraging ORE Aggregations, is geared towards the Atom Entry level.

Also, existing functionality to re-use blog contributions across applications/venues (e.g. Share This, Bookmark) is available at the Entry level, not the Feed level.

 

When Atom is used to express assets comparable to the Aggregations of ORE, those assets are commonly modeled at the Atom Entry level, not the Atom Feed level.

 

The Liverpool/HP ORE experimentation project (foresite) reported problems related to the length of Atom Feeds that represent large Aggregations. These problems were to a significant extent caused by the mandatory Atom elements that have to be included for each Atom Entry (each of which conveys a single Aggregated Resource in the current 0.9 serialization).

 

The ORE Google Group discussions revealed the need to convey all Aggregations available from a repository in a variety of Feeds available from the repository: for example, subject-based Feeds, most-recent Feeds, monthly Feeds, my personal Feed, etc. In the current ORE Profile of Atom, creating such repository-level Feeds requires a hierarchy of Feeds: a Feed per Aggregation from the repository, and repository-level Feeds with Entries that point at other Feeds (the Aggregations).

3. Proposal

The essence of the proposed alternative Atom serialization is as follows:

     /entry/link@rel="http://www.openarchives.org/ore/terms/aggregates"

construct.

 

The motivation for the aforementioned mapping of elements is as follows: Presume the Aggregation is a journal article, and the Atom Entry Document (Resource Map) describes the journal article:

4. Examples

A minimal example is shown below. The ORE-specific parts are highlighted in red.

 

<?xml version="1.0" encoding="UTF-8" ?>
<atom:entry
    xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:title>Observed Web Robot Behavior on Decaying Web Subsites</atom:title>
  <atom:updated>2007-09-22T07:11:09Z</atom:updated>
  <atom:author>
    <atom:name>Michael Nelson</atom:name>
    <atom:uri>http://www.cs.odu.edu/~mln/</atom:uri>
  </atom:author>
  <atom:author>
    <atom:name>Joan Smith</atom:name>
    <atom:uri>http://www.joanasmith.com/</atom:uri>
  </atom:author>
  <atom:author>
    <atom:name>Frank McCown</atom:name>
    <atom:uri>http://www.cs.odu.edu/~fmccown/</atom:uri>
  </atom:author>
  <atom:link rel="alternate" type="text/html"
      href="http://www.dlib.org/dlib/february06/smith/02smith.html" />
  <atom:id>http://www.dlib.org/dlib/february06/smith/aggregation</atom:id>
  <atom:link rel="self"
      type="application/atom+xml"
      href="http://www.dlib.org/dlib/february06/smith/aggregation.atom" />
  <atom:category scheme="http://www.openarchives.org/ore/terms/"
      term="http://www.openarchives.org/ore/terms/Aggregation"
      label="Aggregation" />
  <atom:link rel="http://www.openarchives.org/ore/terms/aggregates" type="text/html"
      href="http://www.dlib.org/dlib/february06/smith/02smith.html" />
  <atom:link rel="http://www.openarchives.org/ore/terms/aggregates" type="text/html"
      href="http://www.dlib.org/dlib/february06/smith/pg1-13.html" />
  <atom:link rel="http://www.openarchives.org/ore/terms/aggregates" type="application/pdf"
      href="http://www.dlib.org/dlib/february06/smith/pg1-13.pdf" />
  <atom:link rel="http://www.openarchives.org/ore/terms/aggregates" type="image/png"
      href="http://www.dlib.org/dlib/february06/smith/MLN_Google.png" />
  <atom:link rel="http://www.openarchives.org/ore/terms/aggregates" type="application/xml"
      href="http://www.crossref.org/openurl?url_ver=Z39.88-2004&amp;rft_id=info:doi/10.1045/february2006-smith&amp;noredirect=true" />
  <atom:source>
    <atom:author>
      <atom:name>Dlib-Magazine</atom:name>
      <atom:uri>http://www.dlib.org</atom:uri>
    </atom:author>
  </atom:source>
</atom:entry>
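
To give a feel for how a consumer might read such an Entry, the following is a minimal sketch in Python using only the standard library. The function name aggregated_resources and the abbreviated sample Entry are ours and purely illustrative; nothing in this proposal prescribes a particular parsing approach.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"


def aggregated_resources(entry_xml):
    """Return (href, type) pairs for every ore:aggregates link in an Atom Entry.

    Hypothetical helper: assumes the document root is the atom:entry element.
    """
    entry = ET.fromstring(entry_xml)
    pairs = []
    for link in entry.findall("{%s}link" % ATOM_NS):
        # Only links carrying the explicit ORE relationship count;
        # rel="alternate" and rel="self" keep their ordinary Atom meaning.
        if link.get("rel") == ORE_AGGREGATES:
            pairs.append((link.get("href"), link.get("type")))
    return pairs


# Abbreviated version of the example Entry (illustration only).
SAMPLE_ENTRY = """<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:id>http://www.dlib.org/dlib/february06/smith/aggregation</atom:id>
  <atom:link rel="alternate" type="text/html"
      href="http://www.dlib.org/dlib/february06/smith/02smith.html" />
  <atom:link rel="http://www.openarchives.org/ore/terms/aggregates" type="text/html"
      href="http://www.dlib.org/dlib/february06/smith/02smith.html" />
  <atom:link rel="http://www.openarchives.org/ore/terms/aggregates" type="application/pdf"
      href="http://www.dlib.org/dlib/february06/smith/pg1-13.pdf" />
</atom:entry>"""

for href, media_type in aggregated_resources(SAMPLE_ENTRY):
    print(media_type, href)
```

Because the ORE relationship is explicit, a non-ORE Atom client can ignore these links entirely, while an ORE-aware client needs no special interpretation of rel="alternate".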

 

The same example serialized using the current 0.9 syntax is shown below. Note that /entry/title is a mandatory Atom element, and that /entry/author is inherited from the /feed level if not provided at the /entry level. Also, in the 0.9 serialization, /entry/id contains Proxy URIs.

 

<?xml version="1.0" encoding="UTF-8" ?>
<atom:feed xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:id>http://www.dlib.org/dlib/february06/smith/aggregation</atom:id>
  <atom:author>
    <atom:name>Michael Nelson</atom:name>
    <atom:uri>http://www.cs.odu.edu/~mln/</atom:uri>
  </atom:author>
  <atom:author>
    <atom:name>Joan Smith</atom:name>
    <atom:uri>http://www.joanasmith.com/</atom:uri>
  </atom:author>
  <atom:author>
    <atom:name>Frank McCown</atom:name>
    <atom:uri>http://www.cs.odu.edu/~fmccown/</atom:uri>
  </atom:author>
  <atom:title>Observed Web Robot Behavior on Decaying Web Subsites</atom:title>
  <atom:category scheme="http://www.openarchives.org/ore/terms/"
      term="http://www.openarchives.org/ore/terms/Aggregation"
      label="Aggregation" />
  <atom:link rel="self" type="application/atom+xml"
      href="http://www.dlib.org/dlib/february06/smith/aggregation.atom" />
  <atom:generator uri="http://www.dlib.org">D-Lib Magazine</atom:generator>
  <atom:updated>2007-09-22T07:11:09Z</atom:updated>
  <atom:entry>
    <atom:id>http://oreproxy.org/r?what=http://www.dlib.org/dlib/february06/smith/02smith.html&amp;where=http://www.dlib.org/dlib/february06/smith/aggregation</atom:id>
    <atom:title>Observed Web Robot Behavior on Decaying Web Subsites</atom:title>
    <atom:updated>2007-08-17T21:12:44Z</atom:updated>
    <atom:link rel="alternate" type="text/html"
        href="http://www.dlib.org/dlib/february06/smith/02smith.html" />
  </atom:entry>
  <atom:entry>
    <atom:id>http://oreproxy.org/r?what=http://www.dlib.org/dlib/february06/smith/pg1-13.html&amp;where=http://www.dlib.org/dlib/february06/smith/aggregation</atom:id>
    <atom:title/>
    <atom:updated>2007-08-17T21:12:44Z</atom:updated>
    <atom:link rel="alternate" type="text/html"
        href="http://www.dlib.org/dlib/february06/smith/pg1-13.html" />
  </atom:entry>
  <atom:entry>
    <atom:id>http://oreproxy.org/r?what=http://www.dlib.org/dlib/february06/smith/pg1-13.pdf&amp;where=http://www.dlib.org/dlib/february06/smith/aggregation</atom:id>
    <atom:title/>
    <atom:updated>2007-08-17T21:12:44Z</atom:updated>
    <atom:link rel="alternate" type="application/pdf"
        href="http://www.dlib.org/dlib/february06/smith/pg1-13.pdf" />
  </atom:entry>
  <atom:entry>
    <atom:id>http://oreproxy.org/r?what=http://www.dlib.org/dlib/february06/smith/MLN_Google.png&amp;where=http://www.dlib.org/dlib/february06/smith/aggregation</atom:id>
    <atom:title/>
    <atom:updated>2007-09-22T07:11:09Z</atom:updated>
    <atom:link rel="alternate" type="image/png"
        href="http://www.dlib.org/dlib/february06/smith/MLN_Google.png" />
  </atom:entry>
  <atom:entry>
    <atom:id>http://oreproxy.org/r?what=http://www.crossref.org/openurl?url_ver=Z39.88-2004%26rft_id=info:doi/10.1045/february2006-smith%26noredirect=true&amp;where=http://www.dlib.org/dlib/february06/smith/aggregation</atom:id>
    <atom:author/>
    <atom:title/>
    <atom:updated>2007-08-17T21:12:44Z</atom:updated>
    <atom:link rel="alternate" type="application/xml"
        href="http://www.crossref.org/openurl?url_ver=Z39.88-2004&amp;rft_id=info:doi/10.1045/february2006-smith&amp;noredirect=true" />
  </atom:entry>
</atom:feed>
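
Comparing the two serializations suggests a mechanical correspondence: Feed-level elements of the 0.9 form become Entry-level elements, and each 0.9 Entry collapses into a single link carrying the ORE aggregates relationship. The Python sketch below illustrates that reading. The function feed_to_entry and the CARRIED_OVER list are our own illustrative choices, not a normative mapping; the sketch does not generate the rel="alternate" link or the /entry/source element, and it discards the Proxy URIs held in the old /entry/id values.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"

# Feed-level elements assumed to carry over unchanged (our assumption).
# atom:generator is left out because it is a Feed-only element in Atom.
CARRIED_OVER = ("id", "title", "updated", "author", "category", "link")


def feed_to_entry(feed_xml):
    """Turn a 0.9-style Feed into a single Entry in the proposed style."""
    feed = ET.fromstring(feed_xml)
    entry = ET.Element(ATOM + "entry")
    carried_tags = [ATOM + name for name in CARRIED_OVER]

    # Copy the Feed-level description of the Aggregation to the Entry level.
    for child in feed:
        if child.tag in carried_tags:
            entry.append(child)

    # Each 0.9 Entry conveyed one Aggregated Resource via rel="alternate";
    # here it becomes one link with the explicit ORE relationship.
    for old_entry in feed.findall(ATOM + "entry"):
        for link in old_entry.findall(ATOM + "link"):
            if link.get("rel") == "alternate":
                attrs = {"rel": ORE_AGGREGATES, "href": link.get("href")}
                if link.get("type") is not None:
                    attrs["type"] = link.get("type")
                ET.SubElement(entry, ATOM + "link", attrs)
    return entry
```

The per-resource atom:id, atom:title, and atom:updated elements that Atom requires in every 0.9 Entry simply disappear, which is where the size reduction discussed under Motivation comes from.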

5. General Discussion

6. Atom entry/content and entry/link@rel="alternate"

In Atom, the use of either /entry/content or /entry/link@rel="alternate" is mandatory. Since, in this proposal, these elements have no ORE-specific semantics, the following approach is proposed to populate these elements:

  1. If a splash page exists, it is a perfect candidate for the @href value of /entry/link@rel="alternate".
  2. If no splash page exists, do one of the following:
    1. Use the URI of a chosen Aggregated Resource as the @href value of /entry/link@rel="alternate".
    2. Use the URI of the Entry document itself transformed by an "ORE-aware" XSL stylesheet as the @href value of /entry/link@rel="alternate"; one can imagine a third party service similar to oreproxy.org.
    3. Use /entry/content with meaningful content, i.e. use the content element to convey a simple HTML list of links to the Aggregated Resources. This would not require minting a new URI.
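
The last option can be sketched in a few lines of Python. The helper name content_with_link_list is hypothetical, and the choice of type="html" (with the markup carried as escaped text) is one of the content types Atom permits; type="xhtml" would be an equally reasonable choice.

```python
import xml.etree.ElementTree as ET
from html import escape

ATOM_NS = "http://www.w3.org/2005/Atom"


def content_with_link_list(uris):
    """Build an atom:content element holding an HTML list of links.

    Hypothetical helper: one <li> per Aggregated Resource URI.
    """
    items = "".join(
        # Escape each URI so that characters such as '&' are valid in HTML.
        '<li><a href="{0}">{0}</a></li>'.format(escape(uri, quote=True))
        for uri in uris
    )
    content = ET.Element("{%s}content" % ATOM_NS, {"type": "html"})
    # With type="html" the markup is carried as text; ElementTree escapes it
    # again when the element is serialized.
    content.text = "<ul>" + items + "</ul>"
    return content


content = content_with_link_list([
    "http://www.dlib.org/dlib/february06/smith/02smith.html",
    "http://www.dlib.org/dlib/february06/smith/pg1-13.pdf",
])
print(ET.tostring(content, encoding="unicode"))
```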

7. Triples Discussion

The proposal does not depend on a particular approach to conveying additional metadata pertaining to the Aggregation, the Resource Map, and Aggregated Resources. Several approaches are possible, including the rdf:Description approach used in the 0.9 specification. In this regard, the AtomTriples I-D (http://www.ietf.org/internet-drafts/draft-nottingham-atomtriples-00.txt) was recently published, and follow-up discussions are ongoing on the AtomPub list. We are engaged in these discussions to ensure that the eventual solution meets our requirements. We very much favor an approach that could be generally adopted by the Atom community over an ORE-specific one. Our current proposal, inspired by the I-D and by feedback from Peter Keane, is as follows:

 

<entry>
  <title>Atom-Powered Robots Run Amok</title>
  <link href="http://example.org/2003/12/13/atom03"/>
  <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
  <updated>2003-12-13T18:30:02Z</updated>
  <summary>Some text.</summary>
  <at:md subject="http://example.org/2003/12/13/atom03">
    ... metadata about http://example.org/2003/12/13/atom03 here ...
  </at:md>
</entry>

 

A quite different approach would be not to address conveying additional triples in Atom, and use Atom only to convey a list of links to Aggregated Resources. All additional information could be provided in an RDF/XML Resource Map linked to from the Atom Entry using the ore:isDescribedBy relationship. This approach would be in sync with the simple manifest approach that was discussed repeatedly in the course of the ORE effort. It would also further reduce the size of Atom-based representations of Aggregations.
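
A minimal sketch of what such a manifest-style Entry could look like when generated in Python follows. The function manifest_entry and the Resource Map URI used in the usage lines are hypothetical. We also assume that ore:isDescribedBy expands under the same http://www.openarchives.org/ore/terms/ namespace as the aggregates relationship. The sketch omits elements such as /entry/author and the rel="alternate" link that a complete Atom Entry would also carry.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
ORE_TERMS = "http://www.openarchives.org/ore/terms/"


def manifest_entry(aggregation_uri, title, updated, aggregated_uris, rdf_map_uri):
    """Build a bare Entry: a list of aggregates links plus one pointer
    to an RDF/XML Resource Map that holds all additional triples."""
    entry = ET.Element(ATOM + "entry")
    ET.SubElement(entry, ATOM + "id").text = aggregation_uri
    ET.SubElement(entry, ATOM + "title").text = title
    ET.SubElement(entry, ATOM + "updated").text = updated

    # One link per Aggregated Resource; no further metadata in Atom.
    for uri in aggregated_uris:
        ET.SubElement(entry, ATOM + "link",
                      {"rel": ORE_TERMS + "aggregates", "href": uri})

    # Assumed expansion of ore:isDescribedBy (see lead-in).
    ET.SubElement(entry, ATOM + "link", {
        "rel": ORE_TERMS + "isDescribedBy",
        "type": "application/rdf+xml",
        "href": rdf_map_uri,
    })
    return entry


entry = manifest_entry(
    "http://www.dlib.org/dlib/february06/smith/aggregation",
    "Observed Web Robot Behavior on Decaying Web Subsites",
    "2007-09-22T07:11:09Z",
    ["http://www.dlib.org/dlib/february06/smith/02smith.html",
     "http://www.dlib.org/dlib/february06/smith/pg1-13.pdf"],
    # Hypothetical URI for the RDF/XML Resource Map.
    "http://www.dlib.org/dlib/february06/smith/aggregation.rdf",
)
print(ET.tostring(entry, encoding="unicode"))
```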

8. Metadata Pertaining to the Resource Map

There is an open issue regarding how to deal with metadata pertaining to the Resource Map. As discussed under Proposal, dcterms:created, dcterms:modified, and dc:rights readily map to /entry/published, /entry/updated, and /entry/rights, respectively. Since /entry/author pertains to the Aggregation (see also under Proposal), the question remains how to convey the dcterms:creator property of the ORE Model that pertains to the Resource Map.

 

Various approaches are possible, but we propose to use /entry/source/author to convey dcterms:creator information. We feel this choice can be defended as follows:

9. Proxies

In this proposal, the use of Proxies is optional, as it is in the RDF/XML serialization and in the ORE Model. This avoids the overloading of /entry/id that occurs in the current serialization.

 

Using the approach described under Triples Discussion, expressing a Proxy URI would be achieved as follows:

 

<atom:link
   rel="http://www.openarchives.org/ore/terms/aggregates"
   type="text/html"
   href="http://www.dlib.org/dlib/february06/smith/02smith.html" />

<at:md
   subject="http://oreproxy.org/r?what=http://www.dlib.org/dlib/february06/smith/02smith.html&amp;where=http://www.dlib.org/dlib/february06/smith/aggregation">
  <ore:proxyFor>http://www.dlib.org/dlib/february06/smith/02smith.html</ore:proxyFor>
  <ore:proxyIn>http://www.dlib.org/dlib/february06/smith/aggregation</ore:proxyIn>
</at:md>
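
The Proxy URIs used in the examples of this document follow a simple what/where pattern. The Python sketch below reproduces that pattern; it is inferred only from those examples (including the %26 encoding of ampersands seen in the CrossRef case) and says nothing about how the oreproxy.org service itself behaves. The function name proxy_uri is ours.

```python
OREPROXY_BASE = "http://oreproxy.org/r"


def proxy_uri(aggregated_resource_uri, aggregation_uri):
    """Build a Proxy URI of the form seen in this document's examples.

    Assumption: an '&' inside the Aggregated Resource URI is written as %26
    so that it is not mistaken for the separator before 'where='.
    """
    what = aggregated_resource_uri.replace("&", "%26")
    return "%s?what=%s&where=%s" % (OREPROXY_BASE, what, aggregation_uri)


print(proxy_uri(
    "http://www.dlib.org/dlib/february06/smith/02smith.html",
    "http://www.dlib.org/dlib/february06/smith/aggregation",
))
```

When the resulting URI is placed in an XML attribute such as at:md/@subject, the remaining '&' must additionally be written as &amp;, as in the example above.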