Open Archives Initiative Object Reuse and Exchange
DO NOT USE THIS SPECIFICATION, see instead the CURRENT ORE SPECIFICATIONS.
This document was part of an alpha release and has been superseded.
Crawlers or harvesters must discover Resource Maps (ReMs) before the aggregations described by them can be understood. ReMs can be discovered in any number of ways and this document discusses some of the recommended discovery mechanisms. Other discovery mechanisms may evolve over time and vary based on the practices of particular communities. This user guide is one of several documents comprising the OAI-ORE specification and user guide.
1. Introduction
     1.1 Notational Conventions
2. Batch Discovery
     2.1 OAI-PMH
     2.2 SiteMaps
     2.3 Syndication Feeds
     2.4 Combining OAI-PMH with Other Approaches
3. Resource Embedding
     3.1 HTML Link Element
     3.2 HTML A and IMG Elements
     3.3 Non-HTML Resources
     3.4 Exposing in HTML Pages
4. Response Embedding
     4.1 HTTP Link Header
5. References
A. Acknowledgments
B. Change Log
Resource Map (ReM) discovery is a precondition of use. There is no single best method for discovering ReMs. This document covers a variety of suggested ReM discovery mechanisms, grouped into three categories: Batch Discovery, Resource Embedding, and Response Embedding. Examples are explored for each category. Additional categories and examples are expected to evolve over time.
Although this document speaks of discovering ReMs, it is RECOMMENDED that agents link to the URI of the Aggregation (URI-A in [ORE Model]) since it is possible for Aggregations to have multiple Resource Maps, all of which MUST be discoverable from URI-A. See [ORE HTTP] regarding strategies for choosing URI-A and URI-R.
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [IETF RFC 2119].
Batch discovery exists so agents can discover ReMs en masse. Note that ReMs are not limited to describing aggregations on the server where the ReMs reside. Although ReMs can be serialized in a number of formats, the initial serialization is in the Atom Syndication Format [RFC4287]. Thus, each section below includes a table that maps the identification and datestamp concepts of the transport protocol or format onto those of the Resource Map Profile of Atom [ReMProfileofAtom].
        It is possible to define a new metadataPrefix in the Open Archives
        Initiative Protocol for Metadata Harvesting (OAI-PMH)[OAI-PMH]
        that contains ReMs.  For example, this OAI-PMH request:
        
        http://www.foo.edu/oai?verb=GetRecord&identifier=oai:foo.edu:object1&metadataPrefix=oai_rem_atom
        
Would yield this response:
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
         http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2007-02-08T08:55:46Z</responseDate>
  <request verb="GetRecord" identifier="oai:foo.edu:object1"
           metadataPrefix="oai_rem_atom">http://foo.edu/oai2</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:foo.edu:object1</identifier>
        <datestamp>2007-01-06</datestamp>
      </header>
      <metadata>
        <!-- Insert ReM here -->
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
| Identification | OAI-PMH record/header/identifier MUST NOT equal either ReM Atom /feed/id or /feed/link[@rel="self"]/@href |
|---|---|
| Datestamp | OAI-PMH record/header/datestamp MUST be equal to ReM Atom /feed/updated |
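The mapping in the table above could be checked mechanically. The following is a minimal sketch, not part of the specification: the function name `check_oai_pmh_mapping` is hypothetical, and it assumes that "equal" means literal string equality, so a day-granularity OAI-PMH datestamp would not match a full Atom timestamp.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "atom": "http://www.w3.org/2005/Atom",
}


def check_oai_pmh_mapping(oai_xml, rem_xml):
    """Return a list of violations of the mapping table (empty if none).

    Hypothetical helper: compares values as plain strings (an assumption).
    """
    header = ET.fromstring(oai_xml).find("oai:GetRecord/oai:record/oai:header", NS)
    feed = ET.fromstring(rem_xml)  # root element is the ReM's atom:feed

    identifier = header.findtext("oai:identifier", namespaces=NS)
    datestamp = header.findtext("oai:datestamp", namespaces=NS)
    rem_id = feed.findtext("atom:id", namespaces=NS)
    self_link = feed.find("atom:link[@rel='self']", NS)
    rem_self = self_link.get("href") if self_link is not None else None

    problems = []
    # Identification: the OAI-PMH identifier must differ from both ReM values.
    if identifier in (rem_id, rem_self):
        problems.append("identifier must differ from the ReM id and self link")
    # Datestamp: the OAI-PMH datestamp must equal the ReM's updated value.
    if datestamp != feed.findtext("atom:updated", namespaces=NS):
        problems.append("datestamp must equal the ReM updated value")
    return problems
```

A harvester could run such a check after extracting the ReM from the metadata element and flag records whose list of problems is not empty.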
It is possible to construct a SiteMap [SiteMap] that consists of just ReMs, or possibly includes ReMs in its list of regular resources. For example, dereferencing this SiteMap URI:
        http://www.foo.edu/sitemap-rem.xml
        
Would yield this response:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.foo.edu/objects/object1.atom#aggregation</loc>
      <lastmod>2007-01-06</lastmod>
   </url>
   <url>
      <loc>http://www.foo.edu/objects/object2.atom#aggregation</loc>
      <lastmod>2007-08-11</lastmod>
      <changefreq>weekly</changefreq>
   </url>
   <url>
      <loc>http://www.foo.edu/objects/object3.atom#aggregation</loc>
      <lastmod>2007-03-15T18:30:02Z</lastmod>
      <priority>0.3</priority>
   </url>
...
</urlset>
Note that SiteMaps have a URI path hierarchy limitation on the resources they can describe. For example, this SiteMap:
        http://www.foo.edu/a/b/sitemap-rem.xml
        
Can list the Aggregations:
        http://www.foo.edu/a/b/bar2.atom#aggregation
        
and
        http://www.foo.edu/a/b/c/bar3.atom#aggregation
        
But not:
        http://www.foo.edu/bar1.atom#aggregation
        
| Identification | SiteMap /urlset/url/loc MUST equal /feed/link[@rel="self"]/@href or /feed/id for the corresponding ReM |
|---|---|
| Datestamp | When present, SiteMap /urlset/url/lastmod MUST be equal to ReM Atom /feed/updated |
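The path hierarchy limitation described above can be expressed as a small check. This is a sketch under stated assumptions: the function name `sitemap_can_list` is hypothetical, and it treats "same scheme and host, at or below the SiteMap's directory" as the rule, which matches the three example URIs given in this section.

```python
import posixpath
from urllib.parse import urlsplit


def sitemap_can_list(sitemap_url, resource_url):
    """True if resource_url lies at or below the directory holding the SiteMap."""
    sm, res = urlsplit(sitemap_url), urlsplit(resource_url)
    # A SiteMap is assumed to describe only resources on its own scheme and host.
    if (sm.scheme, sm.netloc) != (res.scheme, res.netloc):
        return False
    base_dir = posixpath.dirname(sm.path)  # "/a/b" for "/a/b/sitemap-rem.xml"
    prefix = base_dir if base_dir.endswith("/") else base_dir + "/"
    # urlsplit already separates the "#aggregation" fragment from the path.
    return res.path.startswith(prefix)
```

With the SiteMap at http://www.foo.edu/a/b/sitemap-rem.xml, the check accepts bar2.atom and c/bar3.atom under /a/b/ and rejects /bar1.atom at the server root.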
Even though the preliminary serialization of ReMs is in the Atom Syndication Format, nothing prevents the use of syndication formats such as Atom or RSS [RSS] for ReM discovery. However, care must be taken to keep the Resource Map conceptually separate from the syndication file that lists Resource Maps. In particular, the id of an Atom entry listing the URI of a Resource Map MUST be neither the URI of the Resource Map nor the Atom feed id of the Resource Map. Furthermore, the Atom feed used for discovery must be explicitly distinguished from the Atom feed that is the ReM. For example, this Atom Feed:
        http://www.foo.edu/all-rems.atom
        
When dereferenced would yield:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>ReMs at www.foo.edu</title>
  <link href="http://www.foo.edu/" />
  <link href="http://www.foo.edu/all-rems.atom" rel="self"/>
  <updated>2007-08-15T18:30:02Z</updated>
  <author>
    <name>John Doe</name>
    <email>johndoe@foo.edu</email>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>
  <entry>
    <title>ReM For Object1</title>
    <link href="http://www.foo.org/objects/object1.atom#aggregation"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2007-01-06T00:00:00Z</updated>
  </entry>
  <entry>
    <title>ReM For Object2</title>
    <link href="http://www.foo.org/objects/object2.atom#aggregation"/>
    <id>urn:uuid:9a2cc699-ccba-9e8b-132e-91da394e9a5c</id>
    <updated>2007-08-11T00:00:00Z</updated>
  </entry>
  <entry>
    <title>ReM For Object3</title>
    <link href="http://www.foo.org/objects/object3.atom#aggregation"/>
    <id>urn:uuid:5225c895-cab8-8ebb-baaa-90da9d4efa6b</id>
    <updated>2007-03-15T18:30:02Z</updated>
  </entry>
</feed>
| Identification | Syndication Atom /feed/entry/id MUST NOT equal ReM Atom /feed/id; Syndication Atom /feed/entry/link/@href MUST equal ReM Atom /feed/link[@rel="self"]/@href or /feed/id |
|---|---|
| Datestamp | Syndication Atom /feed/entry/updated MUST equal ReM Atom /feed/updated |
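An agent consuming a discovery feed like the one above needs only the entry id, the link target, and the updated value of each entry. The following is a minimal sketch of that extraction. The function name `list_rem_links` is hypothetical, and it assumes each entry carries a single link element.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def list_rem_links(discovery_feed_xml):
    """Return (entry id, link href, updated) for each entry in a discovery feed."""
    feed = ET.fromstring(discovery_feed_xml)
    results = []
    for entry in feed.findall("atom:entry", ATOM):
        link = entry.find("atom:link", ATOM)  # assumption: one link per entry
        results.append((
            entry.findtext("atom:id", namespaces=ATOM),
            link.get("href") if link is not None else None,
            entry.findtext("atom:updated", namespaces=ATOM),
        ))
    return results
```

The returned link values are the URIs to dereference. The entry ids identify only the syndication entries and, per the table above, must not be confused with the ids of the ReMs themselves.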
The same ReMs could be exposed via RSS 2.0. For example, this RSS feed:
        http://www.foo.edu/all-rems.rss
        
When dereferenced would yield:
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>ReMs at www.foo.edu</title>
    <link>http://www.foo.edu/</link>
    <description>All of the Resource Maps for resources at www.foo.edu</description>
  
    <item>
      <title>ReM for Object 1</title>
      <link>http://www.foo.org/objects/object1.atom#aggregation</link>
      <description>ReM for Object 1</description>
      <pubDate>Sat, 06 Jan 2007 00:00:00 GMT</pubDate>
    </item>
  
    <item>
      <title>ReM for Object 2</title>
      <link>http://www.foo.org/objects/object2.atom#aggregation</link>
      <description>ReM for Object 2</description>
      <pubDate>Sat, 11 Aug 2007 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>ReM for Object 3</title>
      <link>http://www.foo.org/objects/object3.atom#aggregation</link>
      <description>ReM for Object 3</description>
      <pubDate>Thu, 15 Mar 2007 18:30:02 GMT</pubDate>
    </item>
   
  </channel>
</rss>
| Identification | RSS 2.0 /rss/channel/item/link MUST equal ReM Atom /feed/link[@rel="self"]/@href or /feed/id |
|---|---|
| Datestamp | RSS 2.0 /rss/channel/item/pubDate MUST equal ReM Atom /feed/updated (after conversion from RFC-822 format to ISO 8601 format) |
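The date conversion mentioned in the table can be done with the Python standard library. This is a small sketch. The function name `pubdate_to_atom_updated` is hypothetical, and it assumes the result should be normalized to UTC with a trailing "Z", as in the Atom examples in this document.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def pubdate_to_atom_updated(pub_date):
    """Convert an RSS 2.0 pubDate (RFC 822 style) to an ISO 8601 UTC timestamp."""
    # parsedate_to_datetime understands "GMT" and numeric offsets such as "-0500".
    dt = parsedate_to_datetime(pub_date).astimezone(timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Comparing the converted value with the ReM's /feed/updated value then becomes a plain string comparison, provided the ReM also expresses its timestamp in UTC.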
        Resource Map Documents [ORE Model] can be included as metadata
        records in an OAI-PMH response.  However, the OAI-PMH constructs
        must be removed before the Resource Map Document can
        be used as such.  This has implications with respect
        to embedding the Resource Map in a resource (discussed below).  OAI-PMH repositories issue
        OAI-PMH responses of MIME type text/xml
        or application/xml.  These
        OAI-PMH responses must be processed into ReM responses
        (currently in Atom Syndication Format and of MIME type
        application/atom+xml).  We envision gateway
        services that take an OAI-PMH GetRecord request as an argument,
        such as:
        
        http://some.gateway.org/pmh2ore?=http://foo.edu/oai2?verb=GetRecord&metadataPrefix=oai_rem_atom&identifier=oai:foo.edu:object1
        
        OCLC has already developed one such service.  It takes an OAI-PMH
        GetRecord URI as an argument and strips out the OAI-PMH
        elements, leaving only the child element of the OAI-PMH's
        <metadata> element.  For example, this
        OAI-PMH GetRecord request:
        
        http://alcme.oclc.org/oaicat/OAIHandler?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:oaicat.oclc.org:2002/ocm11992160
        
        When submitted as an argument to the OCLC service, produces just the
        <oai_dc> element:
        
        http://purl.org/OAIUtil?getRecordURL=http://alcme.oclc.org/oaicat/OAIHandler?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:oaicat.oclc.org:2002/ocm11992160
        
        The values of the OAI-PMH <responseDate> 
        and <request> elements are retained as
        HTTP response headers.  The above example could also be combined
        with syndication formats.  For example, if a repository has its
        ReMs in OAI-PMH, it could export the ReMs in an Atom Feed for
        applications that are not OAI-PMH aware:
        
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>ReMs at www.foo.edu</title>
  <link href="http://www.foo.edu/" />
  <link href="http://www.foo.edu/all-rems.atom" rel="self"/>
  <updated>2007-08-15T18:30:02Z</updated>
  <author>
    <name>John Doe</name>
    <email>johndoe@foo.edu</email>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>
  <entry>
    <title>ReM For Object1</title>
    <link href="http://purl.org/OAIUtil?getRecordURL=http://foo.edu/oai2?verb=GetRecord&amp;metadataPrefix=oai_rem_atom&amp;identifier=oai:foo.edu:object1"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2007-01-06T00:00:00Z</updated>
  </entry>
  <entry>
    <title>ReM For Object2</title>
    <link href="http://purl.org/OAIUtil?getRecordURL=http://foo.edu/oai2?verb=GetRecord&amp;metadataPrefix=oai_rem_atom&amp;identifier=oai:foo.edu:object2"/>
    <id>urn:uuid:9a2cc699-ccba-9e8b-132e-91da394e9a5c</id>
    <updated>2007-08-11T00:00:00Z</updated>
  </entry>
  <entry>
    <title>ReM For Object3</title>
    <link href="http://purl.org/OAIUtil?getRecordURL=http://foo.edu/oai2?verb=GetRecord&amp;metadataPrefix=oai_rem_atom&amp;identifier=oai:foo.edu:object3"/>
    <id>urn:uuid:5225c895-cab8-8ebb-baaa-90da9d4efa6b</id>
    <updated>2007-03-15T18:30:02Z</updated>
  </entry>
</feed>
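The core step of such a gateway, removing the OAI-PMH envelope and keeping only the child of the metadata element, can be sketched as follows. This is an illustration, not the OCLC implementation: the function name `extract_metadata_payload` is hypothetical, and the sketch ignores the HTTP headers that the OCLC service retains.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def extract_metadata_payload(oai_response_xml):
    """Return the first child of <metadata> in a GetRecord response, as XML text."""
    root = ET.fromstring(oai_response_xml)
    metadata = root.find(f"{OAI}GetRecord/{OAI}record/{OAI}metadata")
    if metadata is None or len(metadata) == 0:
        raise ValueError("response carries no metadata payload")
    payload = metadata[0]
    payload.tail = None  # drop whitespace that followed the payload inside <metadata>
    # ElementTree may rewrite namespace prefixes (e.g. ns0:), which is
    # equivalent XML but not byte-identical to the original payload.
    return ET.tostring(payload, encoding="unicode")
```

A gateway built this way would then serve the returned document with the MIME type appropriate to the payload, application/atom+xml for a ReM in the Atom serialization.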
A common scenario for ReM discovery is for a human readable page in an aggregation to link to its corresponding URI-A. This is most commonly accomplished using the HTML link element [HTML]. Alternatively, HTML A and IMG elements may point to URI-A's, or the URI-A can be exposed as an opaque string for human agents to paste into ORE-aware utilities.
We also envision the future availability of browser utilities such as Mozilla plugins that detect the presence of corresponding ReMs when embedded in resources and help guide the user in the (re)use of the aggregated resources.
The HTML link element can be used to direct agents from the aggregated HTML file to a corresponding URI-A which identifies the aggregation that is aggregating the HTML file. While this is a common case, there are actually four different scenarios regarding members of an aggregation and knowledge about their corresponding Aggregations:
Note that the above scenarios are relative to a particular ReM. It is possible for aggregated resources to simultaneously have full knowledge about one Aggregation (typically authored by the creators of the resources) and zero knowledge about third-party Aggregations that aggregate the same resources. Below is an example of how an HTML page could link to its corresponding Aggregation. Assuming this HTML page and its associated JPEGs form the aggregation, and the JPEGs do not use HTTP headers to link to the corresponding ReM (see below), this is an example of a limited knowledge scenario, since only the HTML page links to the Aggregation.
<html>
  <head>
    <title>Hello World.</title>
    <link href="http://example.net/hw.atom" type="application/atom+xml" rel="resourcemap" >
  </head>
  <body>
    <img src="hello.jpeg">
    <img src="world.jpeg">
  </body>
</html>
In the above example, the HTML page links to only a single Aggregation. It could link to multiple Aggregations, in which case it is the responsibility of the agent to differentiate among them. Next we consider an example where an HTML page is aware that it is aggregated but does not know the location of its Aggregation. Instead, it links to a page that does know the location of the Aggregation. There could be any number of these redirections. It is up to the author or maintainer of the resources and Aggregations to choose which scenario best fits their usage profile.
<html>
  <head>
    <title>Chapter Twelve.</title>
    <link href="http://mybook.com/toc.html" type="text/html" rel="indirectresourcemap" >
  </head>
  <body>
    Welcome to chapter twelve...
  </body>
</html>
Since the HTML specification defines the values of rel attributes to be CDATA, we can use values of "resourcemap" and "indirectresourcemap" and still have valid XHTML.
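An ORE-aware agent could look for these link elements with an ordinary HTML parser. The following is a minimal sketch using the Python standard library. The class and function names are hypothetical, and the sketch assumes that rel may hold several space-separated values, of which "resourcemap" or "indirectresourcemap" is one.

```python
from html.parser import HTMLParser


class ResourceMapLinkFinder(HTMLParser):
    """Collect <link> elements whose rel names a (possibly indirect) resource map."""

    RELS = {"resourcemap", "indirectresourcemap"}

    def __init__(self):
        super().__init__()
        self.links = []  # (rel, href) tuples in document order

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        for rel in (attr.get("rel") or "").lower().split():
            if rel in self.RELS and attr.get("href"):
                self.links.append((rel, attr["href"]))


def find_resource_map_links(html_text):
    """Return all (rel, href) pairs that point toward an Aggregation."""
    finder = ResourceMapLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.links
```

For an "indirectresourcemap" result, the agent would fetch the linked page and repeat the search until it finds a "resourcemap" link or gives up.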
A similar but different scenario is when it is desirable to acknowledge relationships to other Aggregations [ORE Model]. In this scenario, we wish to cite not the Aggregation that describes the aggregation containing the current HTML page, but rather the Aggregation through which the resource we are linking to (with the A or IMG elements) was originally discovered. This is accomplished using a separate attribute on the A or IMG elements. The example below shows how an HTML page cites the Aggregations used to discover a PDF document about frogs and toads, as well as example images of each.
<html>
...
Here is a helpful reference for distinguishing
<a href="http://example.org/pics/f-t.pdf" resourcemap="http://example.org/amphibians.atom">frogs vs. toads</a>.
<p>
Here is a frog <img src="http://weluvfrogs.org/imgs/frog12.jpeg" resourcemap="http://frogs.org/frogs.atom">
and here is a toad <img src="http://toadsrule.org/toad.gif" resourcemap="http://toadsrule.org/toads.atom">.
...
</html>
        This approach uses the non-standard attribute resourcemap.  This can be used to provide
        hints to the ORE-aware user-agent, but is not guaranteed
        to be recognized, and is not valid XHTML.  The only way to
        unambiguously link to other Aggregations is to create a new Aggregation.
        See [ORE User Guide
        Resource Map] for how to do this.
        
        Another approach to specifying the appropriate Resource Map
        without introducing a non-standard HTML attribute would be
        to place the Resource Map URI in an existing HTML attribute.
        Below is an example of how the Resource
        Map URI could be placed in the class attribute, 
        which takes a space separated list of values.
        
<html>
...
Here is a helpful reference for distinguishing
<a href="http://example.org/pics/f-t.pdf" class="resourcemap=http://example.org/amphibians.atom">frogs vs. toads</a>.
<p>
Here is a frog <img src="http://weluvfrogs.org/imgs/frog12.jpeg" class="resourcemap=http://frogs.org/frogs.atom">
and here is a toad <img src="http://toadsrule.org/toad.gif" class="resourcemap=http://toadsrule.org/toads.atom">.
...
</html>
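An agent could support both conventions shown in this section, the non-standard resourcemap attribute and the "resourcemap=" token inside class, with a single pass over the A and IMG elements. This is a sketch only. The names `CitedResourceMapFinder` and `find_cited_resource_maps` are hypothetical, and neither convention is guaranteed to be recognized by other tools.

```python
from html.parser import HTMLParser

TOKEN_PREFIX = "resourcemap="


class CitedResourceMapFinder(HTMLParser):
    """Pair each linked resource with the Resource Map URIs cited for it."""

    def __init__(self):
        super().__init__()
        self.citations = []  # (resource URI, Resource Map URI) tuples

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "img"):
            return
        attr = dict(attrs)
        resource = attr.get("href") if tag == "a" else attr.get("src")
        # Convention 1: a dedicated, non-standard resourcemap attribute.
        if attr.get("resourcemap"):
            self.citations.append((resource, attr["resourcemap"]))
        # Convention 2: a "resourcemap=URI" token in the space-separated class list.
        for token in (attr.get("class") or "").split():
            if token.startswith(TOKEN_PREFIX):
                self.citations.append((resource, token[len(TOKEN_PREFIX):]))


def find_cited_resource_maps(html_text):
    finder = CitedResourceMapFinder()
    finder.feed(html_text)
    finder.close()
    return finder.citations
```

Because class holds a space-separated list, the token approach only works for Resource Map URIs that contain no whitespace.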
It may be possible to embed links to URI-A in non-HTML resources, such as PDF or images, but these methods are considered too preliminary to discuss at this time.
We propose exposing Aggregation URIs as opaque strings to facilitate future usage scenarios in which people copy and paste Aggregation URIs into applications such as blogs, forums or repository systems. This is commonly done with sites such as YouTube and Photobucket, and classified listings where strings are provided to the user to facilitate reuse (i.e., copy-n-paste) of the components in email, instant messaging systems, forums and HTML pages. We provide an example of how this could look, using an arXiv pre-print.
If we wish to have resources link to their corresponding Aggregations, but not all of the aggregated resources are HTML, and thus cannot use the HTML link element, we can embed the link of the Aggregation in the response. For the moment, this means putting the URI of the Aggregation in an HTTP response header.
The concept of a link HTTP response header existed in earlier versions of the HTTP protocol [RFC2068], but the lack of a compelling use case probably led to it being removed from the current HTTP specification. A recent Internet Draft proposes a method for converting HTML link element semantics into HTTP Link response headers [HTTP Header Linking]. Although this draft has yet to be promoted to an RFC, the approach is straightforward. If we wanted to promote the hello world example above from limited knowledge to full knowledge, the JPEGs could link to their corresponding Aggregation with the HTTP link response header. The example below shows an HTTP request and response with the Aggregation in a link header.
(request)       HEAD http://www.example.net/hello.jpeg HTTP/1.1
                Host: www.example.net
                Connection: close
(response)      HTTP/1.1 200 OK
                Date: Sat, 26 May 2007 22:43:10 GMT
                Server: Apache/2.2.0
                Last-Modified: Sat, 26 May 2007 19:32:04 GMT
                ETag: "c3596-816-92123500"
                Accept-Ranges: bytes
                Content-Length: 2070
                Link: <http://example.net/hw.atom>; type="application/atom+xml"; rel="resourcemap"
                Content-Type: image/jpeg
                Connection: close
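On the client side, the Aggregation URI has to be pulled out of the Link header value. The following is a simplified sketch. The function name `resource_maps_from_link_header` is hypothetical, and the regular expressions handle only the common `<URI>; name="value"` form rather than every syntax the Internet Draft may allow.

```python
import re

# One link-value: <URI> followed by zero or more ;name=value parameters.
_LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
_PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def resource_maps_from_link_header(header_value):
    """Return target URIs of links whose rel includes 'resourcemap'."""
    targets = []
    for uri, params in _LINK_RE.findall(header_value):
        attrs = {}
        for name, quoted, bare in _PARAM_RE.findall(params):
            attrs[name.lower()] = quoted or bare
        if "resourcemap" in attrs.get("rel", "").lower().split():
            targets.append(uri)
    return targets
```

An agent would issue a HEAD request as shown above, read the Link header from the response, and pass its value to this function to obtain the URIs to dereference.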
This document is the work of the Open Archives Initiative. Funding for Open Archives Initiative Object Reuse and Exchange is provided by the Andrew W. Mellon Foundation, Microsoft, and the National Science Foundation. Additional support is provided by the Coalition for Networked Information.
This document is based on meetings of the OAI-ORE Technical Committee (ORE-TC), with participation from the OAI-ORE Liaison Group (ORE-LG). Members of the ORE-TC are: Chris Bizer (Freie Universität Berlin), Les Carr (University of Southampton), Tim DiLauro (Johns Hopkins University), Leigh Dodds (Ingenta), David Fulker (UCAR), Tony Hammond (Nature Publishing Group), Pete Johnston (Eduserv Foundation), Richard Jones (Imperial College), Peter Murray (OhioLINK), Michael Nelson (Old Dominion University), Ray Plante (NCSA and National Virtual Observatory), Rob Sanderson (University of Liverpool), Simeon Warner (Cornell University), and Jeff Young (OCLC). Members of ORE-LG are: Leonardo Candela (DRIVER), Tim Cole (DLF Aquifer and UIUC Library), Julie Allinson (JISC), Jane Hunter (DEST), Savas Parastatidis (Microsoft), Sandy Payette (Fedora Commons), Thomas Place (DARE and University of Tilburg), Andy Powell (DCMI), and Robert Tansley (Google, Inc. and DSpace).
We also acknowledge comments from the OAI-ORE Advisory Committee (ORE-AC).

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.