Open Archives Initiative Object Reuse and Exchange
DO NOT USE THIS SPECIFICATION, see instead the CURRENT ORE SPECIFICATIONS.
This document was part of a beta release and has been superseded.
Open Archives Initiative Object Reuse and Exchange (OAI-ORE) defines standards for the description and exchange of aggregations of Web resources. This document describes implementation of OAI-ORE using HTTP [RFC2616], the most widely used protocol of the current World Wide Web. Mechanisms that support multiple Resource Maps in different serializations are described in detail. This user guide is one of several documents comprising the OAI-ORE specification and user guide.
1. Introduction
2. Cool URI implementation with some HTTP server support
2.1 Cool URIs for one Resource Map describing an Aggregation
2.2 Multiple Resource Maps with Cool URIs
3. Simple implementation without server support for content negotiation or redirection
3.1 Migration from a simple implementation to support multiple Resource Maps
4. Implementation with RDFa or microformats
4.1 RDFa or microformats with Cool URIs
4.2 RDFa or microformats without server support
5. Proxy URIs
5.1 Requirements for HTTP Proxy URIs
5.2 ORE Proxy URI resolver at http://oreproxy.org/r
6. References
A. Acknowledgements
B. Change Log
The use of HTTP URIs to identify ORE Aggregations and Resource Maps leverages the extensive infrastructure and tools of the current World Wide Web [Web Architecture]. HTTP is the best supported protocol of current web browsers, crawlers, search engines, feed aggregators, and many other tools. HTTP provides mechanisms that allow the Aggregation, which is a non-information resource in the sense of the Web Architecture, to yield or redirect to a Resource Map as required by the ORE Model [Data Model]. HTTP is thus the RECOMMENDED protocol and associated URI scheme for ORE Aggregations and Resource Maps.
There may be one or more Resource Maps that describe a particular
Aggregation. These will likely differ in their serialization format
and serialization specific metadata (e.g. creation time), and are thus
separate resources from a Web Architecture standpoint. Each Resource Map
should have a different URI (ReM-1, ReM-2, etc.)
and it is incorrect to make multiple Resource Maps available from a single
URI via content negotiation.
In application domains such as scholarly communication, there are already many aggregations of resources on the web. These are often described by HTML "Splash Pages" such as http://arxiv.org/abs/astro-ph/0601007, which provide a description of the aggregation and access to its components. Splash Pages and the URIs that identify them are NOT ORE Aggregations or Resource Maps. However, with RDFa and microformats it is possible to embed a Resource Map in a Splash Page and we discuss this case below. If there exists a Splash Page that does not contain an RDFa or microformat representation of a Resource Map then that page should not be available via content negotiation from the Aggregation.
This document is divided into sections which describe different HTTP
implementation scenarios. These scenarios differ in the server requirements
needed to support them, and in the URI structure that results.
Section 2 describes a clean and extensible implementation
strategy requiring some HTTP server support. This is the RECOMMENDED strategy.
Section 3 describes a limited but very simple
implementation strategy that requires no HTTP server support beyond the
ability to serve files.
Section 4 describes implementation with
RDFa or microformats either alone or in addition to other formats.
Finally, section 5 gives the recommended behaviour
of HTTP Proxy URIs and details of the ORE Proxy URI resolver at
http://oreproxy.org/r.
This implementation strategy is motivated by the desire to use Cool URIs and to allow easy extensibility to new or additional serializations. We first consider the simple case of one Resource Map available to describe an Aggregation, and the mechanisms used to tie the Aggregation and Resource Map resources together. Section 2.2 then extends this to the case of multiple Resource Maps describing the same Aggregation.
Consider the following example Aggregation and Resource Map URIs:
Aggregation:  A-1   = http://example.org/foo
Resource Map: ReM-1 = http://example.org/foo.atom
Both A-1 and ReM-1 SHOULD be resolvable.
The Resource Map, with URI ReM-1, is an information resource
and access SHOULD yield a representation of the Resource Map (in this case
an Atom serialization, see [Atom Resource Maps]).
The Aggregation, with URI A-1, is described by the Resource
Map available from ReM-1 and access to A-1
SHOULD lead a user or agent to the Resource Map. There are two good mechanisms
for doing this in HTTP -- content negotiation and redirection:
Content negotiation: when a client requests A-1, HTTP transparent content
negotiation can be used to return the Resource Map from A-1. The
mechanism is described in RFC 2295 (see also
Apache Content Negotiation for an example
implementation). The key elements of the process are that when a client
requests A-1, the server may instead respond with a Resource
Map. The response MUST include a Content-Location header that
indicates that the response is actually from URI ReM-1.
Redirection: the server responds to a request for A-1 with a
303 See Other redirect to ReM-1. This strategy
is described in the Linked Data Tutorial.
The URIs A-1 and ReM-1 do not have to be related
in the manner shown above although this is one common arrangement and is
supported by Apache. While the appropriate
choice for a given system will likely be influenced by other considerations,
it should not be forgotten that "good URIs do not change"
[URI Style] and that later expansion is often
required as systems evolve.
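The two mechanisms described above can be sketched in Python. This is a minimal illustration rather than a prescribed implementation: the function name `respond_to_aggregation` and its `mode` argument are hypothetical, the URIs are the example values of A-1 and ReM-1, and the body of the 200 response (the Atom Resource Map itself) is omitted.

```python
# Example URIs taken from the text above.
A_1 = "http://example.org/foo"          # Aggregation
REM_1 = "http://example.org/foo.atom"   # Resource Map (Atom serialization)


def respond_to_aggregation(request_uri, mode="redirect"):
    """Return (status, headers) for a GET request on the Aggregation URI A-1.

    mode="redirect": answer with 303 See Other pointing at ReM-1.
    mode="conneg":   answer with 200 and a Content-Location header naming
                     ReM-1; the Resource Map would be sent as the body.
    """
    if request_uri != A_1:
        return 404, {}
    if mode == "redirect":
        return 303, {"Location": REM_1}
    if mode == "conneg":
        return 200, {
            "Content-Location": REM_1,
            "Content-Type": "application/atom+xml",
        }
    raise ValueError("unknown mode: %r" % mode)
```

In either mode a client that starts from A-1 learns the URI ReM-1, either from the Location header or from the Content-Location header.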
If more than one Resource Map is available to describe an Aggregation,
perhaps an Atom serialization and an RDF/XML serialization, then each
Resource Map SHOULD be available from a different URI. Consider adding
ReM-2 to the example above:
Aggregation:  A-1   = http://example.org/foo
Resource Map: ReM-1 = http://example.org/foo.atom
              ReM-2 = http://example.org/foo.rdf
The Aggregation and each Resource Map has a good URI, and the
scheme is easily extensible to additional Resource Maps simply by
adding new Resource Maps with URI ReM-3 etc. It is
an implementation decision as to which Resource Map is considered the
default. The serialization most useful to a simple web browser is likely
the best choice and at present that is Atom if available. Either transparent
content negotiation or redirection may be used to handle client accesses
to the Aggregation URI.
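As an illustration of choosing among several Resource Maps, the sketch below selects a Resource Map URI from a request's Accept header and falls back to a default. The table `RESOURCE_MAPS`, the default choice and the function names are assumptions made for this example, and the Accept parsing is deliberately simplified (it ignores wildcard subtypes such as application/*).

```python
# Hypothetical table of Resource Maps for Aggregation A-1, keyed by media type.
RESOURCE_MAPS = {
    "application/atom+xml": "http://example.org/foo.atom",  # ReM-1
    "application/rdf+xml": "http://example.org/foo.rdf",    # ReM-2
}
DEFAULT_MEDIA_TYPE = "application/atom+xml"  # an implementation decision


def parse_accept(accept):
    """Return the media types of an Accept header, highest q-value first."""
    prefs = []
    for position, item in enumerate(accept.split(",")):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((-q, position, media_type))
    # Entries with q=0 are explicitly unacceptable and are dropped.
    return [media_type for neg_q, _, media_type in sorted(prefs) if neg_q < 0]


def choose_resource_map(accept=""):
    """Pick the Resource Map URI to return or redirect to for A-1."""
    for media_type in parse_accept(accept):
        if media_type in RESOURCE_MAPS:
            return RESOURCE_MAPS[media_type]
        if media_type == "*/*":
            break
    return RESOURCE_MAPS[DEFAULT_MEDIA_TYPE]
```

A request with no Accept header, or one that names no available serialization, receives the default Resource Map.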
To aid in discovery, it is RECOMMENDED that where there are multiple
Resource Maps available for an Aggregation and this is known when a
Resource Map is generated, the availability of other Resource Maps should
be indicated using the ore:isDescribedBy predicate. For example,
ReM-1 might include the triples (shown in N3 format):
ReM-1 ore:describes A-1.
A-1 ore:isDescribedBy ReM-2. #discovery of ReM-2 from ReM-1
A-1 ore:isDescribedBy ReM-3. #discovery of ReM-3 from ReM-1
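A Resource Map generator that knows about its sibling Resource Maps could emit these discovery statements mechanically. The following sketch is illustrative only: the function name `discovery_triples` is hypothetical, and the output assumes that the `ore:` prefix is declared elsewhere in the N3 document.

```python
def discovery_triples(rem_uri, aggregation_uri, all_rem_uris):
    """Return N3 statements for one Resource Map that point at its siblings."""
    lines = ["<%s> ore:describes <%s> ." % (rem_uri, aggregation_uri)]
    for other in all_rem_uris:
        # A Resource Map does not need a discovery link to itself.
        if other != rem_uri:
            lines.append(
                "<%s> ore:isDescribedBy <%s> ." % (aggregation_uri, other)
            )
    return lines
```

For example, calling it for ReM-1 with the list of ReM-1, ReM-2 and ReM-3 yields one ore:describes statement and two ore:isDescribedBy statements.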
Without support from a web server one cannot use the techniques above to arrange that an attempt to access the Aggregation yields or redirects to a Resource Map. A way around this limitation is to relate the URIs with a fragment identifier [RFC3986]. For example, the URIs might be:
Aggregation:  A-1   = http://example.org/foo.atom#aggregation
Resource Map: ReM-1 = http://example.org/foo.atom
Resolution of fragment identifiers is defined to be a client-side behavior so
any client seeing an HTTP URI with fragment identifier, e.g. uri#fragment,
will remove the #fragment and access uri. Only when a
response is obtained might the client try to identify the correct fragment. In
practice this means that either A-1 or ReM-1 above
will yield the Resource Map at http://example.org/foo.atom. The use
of a URI with fragment identifier to identify a non-information resource, such as
the Aggregation, is discussed further in the [Linked
Data Tutorial, Cool URIs].
Use of a fragment identifier permits precise differentiation between the Resource Map
and the Aggregation so that statements can be made about the appropriate resource.
It also satisfies the requirement that a Resource Map can be obtained both
via the Aggregation URI A-1 and directly from ReM-1.
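The client-side handling of the fragment identifier can be demonstrated with the Python standard library. This is a small sketch; the helper name `uri_to_request` is hypothetical, while `urllib.parse.urldefrag` is the standard function that separates a URI from its fragment.

```python
from urllib.parse import urldefrag


def uri_to_request(uri):
    """Split a URI into the part sent to the server and the fragment
    that the client keeps for itself."""
    result = urldefrag(uri)
    return result.url, result.fragment
```

Applied to the example Aggregation URI http://example.org/foo.atom#aggregation, the first element is the Resource Map URI http://example.org/foo.atom, which is what the server actually receives.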
The use of a fragment identifier for the Aggregation URI does not directly support the availability of multiple Resource Maps for a single Aggregation. Migration from this simple approach to a more complex solution with multiple serializations can be accomplished in two ways:
Change URIs to adopt the Cool URI strategy. There
is no need to change the URI for the original Resource Map
http://example.org/foo.atom. An additional Resource Map may be
added at a new URI, say an RDF/XML Resource Map at
http://example.org/foo.rdf, to give the following set of
URIs:
Aggregation:  A-1   = http://example.org/foo
Resource Map: ReM-1 = http://example.org/foo.atom
              ReM-2 = http://example.org/foo.rdf
With the new URI arrangement, clients attempting to access the old Aggregation
URI http://example.org/foo.atom#aggregation would still find a
Resource Map and a sufficiently smart client might be able to unravel the inconsistency
that there is no description of the resource http://example.org/foo.atom#aggregation.
However, the process may be made explicit by updating the Resource Maps to include
a statement that http://example.org/foo and
http://example.org/foo.atom#aggregation identify the same resource:
<http://example.org/foo> owl:sameAs <http://example.org/foo.atom#aggregation>.
Preserve existing URIs while adding other formats. This leads to a rather
ugly and non-standard set of URIs but is otherwise straightforward. If a
new RDF/XML Resource Map were added at http://example.org/foo.rdf
the set of URIs might be:
Aggregation:  A-1   = http://example.org/foo.atom#aggregation
Resource Map: ReM-1 = http://example.org/foo.atom
              ReM-2 = http://example.org/foo.rdf
It would be possible to extend the fragment identifier scheme described in combination with content negotiation to handle multiple serializations. However, this would go against standard web practices and is NOT RECOMMENDED. The Multiple Resource Maps with Cool URIs strategy is a much better approach.
RDFa and microformats provide means to include structured data, such as a Resource Map, within an XHTML or HTML page. A profile for use of RDFa to serialize Resource Maps is given in Resource Map Implementation in RDFa. With RDFa and microformats an (X)HTML "Splash Page" may also take on the dual role of a Resource Map serialization.
Within the ORE Model, the URIs of all Resource Maps
(ReM-1, ReM-2 etc.) MUST be distinct from
the URI of the Aggregation (A-1). Similarly the URI of a
Splash Page (S-1) MUST be distinct from the URI of the
Aggregation. It is RECOMMENDED that the URI of a Splash Page also be
distinct from the URI of the Resource Map if the Splash Page is itself
an Aggregated Resource. Suggested ways to do this are included in
sections 4.1 and 4.2 below.
In the case of a Cool URI implementation, the URI of the (X)HTML page containing the RDFa or microformat data (and hence of the Resource Map) is treated in the same way as any other Resource Map URI for a given Aggregation. If the HTML page contains the only Resource Map serialization then one might have URIs:
Aggregation:  A-1   = http://example.org/foo
Resource Map: ReM-1 = http://example.org/foo.html (includes RDFa Resource Map)
If there are multiple serializations then the default content-negotiated result or redirect should be to the HTML page. This will ensure that a web browser receives the most helpful version of the Resource Map in response to an attempt to access the Aggregation with no preference information. If Resource Maps were available in XHTML/RDFa, Atom and RDF/XML the URIs might be:
Aggregation:  A-1   = http://example.org/foo
Resource Map: ReM-1 = http://example.org/foo.html (includes RDFa Resource Map)
              ReM-2 = http://example.org/foo.atom
              ReM-3 = http://example.org/foo.rdf
If the (X)HTML or Splash Page is itself part of the Aggregation
then the Splash Page and Resource Map URIs should be different. In the example
set of URIs below, the fragment identifier #rem is used
to distinguish the Resource Map from the Splash Page:
Splash Page:  S-1   = http://example.org/foo.html
Aggregation:  A-1   = http://example.org/foo
Resource Map: ReM-1 = http://example.org/foo.html#rem (includes RDFa Resource Map)
              ReM-2 = http://example.org/foo.atom
              ReM-3 = http://example.org/foo.rdf
Alternatively, the server could be configured to support completely
separate URIs S-1 and ReM-1 that yield the
same XHTML+RDFa or XHTML+microformat document:
Splash Page:  S-1   = http://example.org/splash.html (access yields same XHTML+RDFa as foo.html)
Aggregation:  A-1   = http://example.org/foo
Resource Map: ReM-1 = http://example.org/foo.html (access yields same XHTML+RDFa as splash.html)
              ReM-2 = http://example.org/foo.atom
              ReM-3 = http://example.org/foo.rdf
In the case of a simple implementation without server support,
the (X)HTML page containing the RDFa or microformat Resource Map serialization
must have the Aggregation URI A-1:
Aggregation:  A-1   = http://example.org/foo.html#aggregation
Resource Map: ReM-1 = http://example.org/foo.html
The RDFa or microformat data must be written so that the URIs above are used
in statements. The Aggregation URI is http://example.org/foo.html#aggregation
and not the page URI http://example.org/foo.html.
If the (X)HTML or Splash Page is itself part of the Aggregation
then the Splash Page and Resource Map URIs should be different. In the example
set of URIs below, the fragment identifier #rem is used
to distinguish the Resource Map from the Splash Page:
Splash Page:  S-1   = http://example.org/foo.html
Aggregation:  A-1   = http://example.org/foo.html#aggregation
Resource Map: ReM-1 = http://example.org/foo.html#rem
The ORE Model [Data Model] introduces
Proxy URIs which establish
Aggregation-specific identities for Aggregated
Resources. From a modelling perspective, Proxy URIs need only be unique
to a specific Aggregation and to a specific Aggregated Resource, and
have these connections indicated with the appropriate predicates
(ore:proxyIn, ore:proxyFor).
It is permitted to have multiple Proxy URIs for the same Aggregated
Resource in the same Aggregation as described in different Resource Maps.
When implemented using HTTP, Proxy URIs SHOULD satisfy the additional
requirements given below so that clients
dereferencing a Proxy URI will be redirected to the Aggregated Resource while
also being informed of the Aggregation context. Conveying this information in
responses requires server support.
The ORE Proxy URI resolver provides a way to implement Proxy URIs without the need for local server support. Proxy URIs are constructed as queries to the resolver which contain both the target Aggregated Resource URI and Aggregation context URI.
Proxy URIs MUST be unique to a specific Aggregation (URI-A) and to a specific Aggregated Resource (URI-AR). They are thus able to "stand for" the Aggregated Resource in the context of the particular Aggregation. If an HTTP Proxy URI is used as a reference to an Aggregated Resource in the context of an Aggregation then it is desirable that dereferencing it with a standard web browser will return the Aggregated Resource itself (say a JPEG image or PDF document). In addition, dereference of the Proxy URI by an ORE aware client or agent should reveal the Aggregation context. In order to meet these two requirements, when dereferenced HTTP Proxy URIs MUST:
Redirect the client to the Aggregated Resource with HTTP status code "303
See Other" (other 3xx status codes do not have the correct semantics) and a
Location header:
Location: URI-AR
Indicate the Aggregation context in the HTTP response with the
Link header, which is typed with the aggregation relation:
Link: <URI-A>; rel="aggregation"
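The two requirements can be sketched as a function that builds the response a Proxy URI server would send. This is an illustrative sketch under assumptions: the function name `proxy_response` is hypothetical and no particular web framework is implied; only the status code and the Location and Link headers come from the requirements above.

```python
def proxy_response(aggregated_resource_uri, aggregation_uri):
    """Build the status line and headers for a dereferenced Proxy URI.

    aggregated_resource_uri is URI-AR and aggregation_uri is URI-A.
    """
    status_line = "303 See Other"
    headers = [
        # Requirement 1: redirect the client to the Aggregated Resource.
        ("Location", aggregated_resource_uri),
        # Requirement 2: reveal the Aggregation context.
        ("Link", '<%s>; rel="aggregation"' % aggregation_uri),
    ]
    return status_line, headers
```

The returned pair has the shape expected by a WSGI `start_response` callable, which is one way such a function might be wired into a server.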
The ORE Proxy URI resolver is one implementation that meets these requirements. The particular syntax described below could be reused for other Proxy URI resolvers with different base URIs. With this or other syntaxes, implementers should note the URI encoding issues mentioned below.
The ORE Proxy URI resolver at http://oreproxy.org/r is
provided as a service to the community. Use of the http://oreproxy.org/r
resolver requires only that Proxy URIs are constructed by following
the syntax rules described here. There is no need to register new Proxy
URIs or Resource Maps or Aggregations because all of the information
needed to implement the Proxy URI requirements
given above is included in the Proxy URI itself, namely the URIs of
the Aggregated Resource (URI-AR) and of the Aggregation context (URI-A).
The syntax for the Proxy URI is:
http://oreproxy.org/r?what=URI-AR&where=URI-A
and an example might be
http://oreproxy.org/r?what=http://example.org/aggregated_resource_456&where=http://example.org/aggregation_123
Proxy URIs are constructed according to the following rules:
The query parameters what and where MUST be given in the order shown.
The URIs of the Aggregated Resource (URI-AR) and of the Aggregation (URI-A) MUST be appropriately URI encoded as parts of the query component of the Proxy URI. All except the following characters should be percent encoded in URI-A and URI-AR when used in the Proxy URI (see URI syntax specification [RFC3986]):
query-non-escaped = ALPHA / DIGIT / "-" / "." / "_" / "~" / ":" / "@" / "/" / "?"
Note that this means that there MUST be double-escaping of
any % characters that are already used to indicate
percent encoded characters in URI-A or URI-AR. For example,
if URI-AR=http://example.org/aggregated%26resource and
URI-A=http://example.org/aggregation_123, the %
in %26 must be encoded as %25, giving:
http://oreproxy.org/r?what=http://example.org/aggregated%2526resource&where=http://example.org/aggregation_123
Note also that it is essential that the # character be correctly
escaped (as %23) if either URI-A or URI-AR contains a fragment
identifier component. If not, a browser would interpret the #
character as the end of the query string and not send the rest of the Proxy
URI to the resolver.
All applications except the application creating the Proxy URI and the resolver SHOULD treat the Proxy URI as opaque.
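The construction rules can be sketched with `urllib.parse.quote` from the Python standard library. The helper names `encode_for_query` and `build_proxy_uri` are hypothetical. The sketch relies on the fact that `quote` never escapes letters, digits, "-", ".", "_" and "~" (the last since Python 3.7), so only ":", "@", "/" and "?" need to be passed as additional safe characters.

```python
from urllib.parse import quote

RESOLVER = "http://oreproxy.org/r"
# Characters left unescaped in addition to ALPHA, DIGIT, "-", ".", "_", "~".
EXTRA_SAFE = ":@/?"


def encode_for_query(uri):
    """Percent-encode a URI for use as a value in the Proxy URI query.

    Existing "%" characters become "%25" (double-escaping) and "#"
    becomes "%23".
    """
    return quote(uri, safe=EXTRA_SAFE)


def build_proxy_uri(uri_ar, uri_a, resolver=RESOLVER):
    """Build a Proxy URI with 'what' before 'where', as the rules require."""
    return "%s?what=%s&where=%s" % (
        resolver,
        encode_for_query(uri_ar),
        encode_for_query(uri_a),
    )
```

With URI-AR=http://example.org/aggregated%26resource and URI-A=http://example.org/aggregation_123 this reproduces the double-escaped example given above.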
When a client dereferences a http://oreproxy.org/r Proxy URI it
will be redirected to the Aggregated Resource (URI-AR) and the Aggregation context
will be indicated in an HTTP Link header as described in the
Proxy URI requirements above. Clients that
cannot or do not interpret the Link header, such as an ordinary
web browser, will silently be redirected to the Aggregated Resource. ORE aware
clients will be able to deduce the Aggregation context.
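An ORE aware client might recover both pieces of information from the redirect response roughly as follows. This is a simplified sketch: the function name `aggregation_context` is hypothetical, the headers are assumed to be supplied as a dict with lower-cased names, and the regular expressions handle only straightforward Link header values rather than the full header grammar.

```python
import re

# One link value: <uri> followed by its parameters up to the next "<".
_LINK = re.compile(r'<([^>]*)>([^<]*)')
# The rel parameter, quoted or unquoted.
_REL = re.compile(r';\s*rel\s*=\s*(?:"([^"]*)"|([^;,\s]+))', re.IGNORECASE)


def aggregation_context(status, headers):
    """Return (aggregated_resource_uri, aggregation_uri) for a Proxy URI
    response; either element is None when the response does not supply it."""
    if status != 303:
        return None, None
    target = headers.get("location")
    aggregation = None
    for uri, params in _LINK.findall(headers.get("link", "")):
        match = _REL.search(params)
        rels = (match.group(1) or match.group(2) or "").split() if match else []
        if "aggregation" in rels:
            aggregation = uri
            break
    return target, aggregation
```

A client that ignores the Link header simply follows the Location value, which matches the behaviour of an ordinary web browser described above.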
This document is the work of the Open Archives Initiative. Funding for Open Archives Initiative Object Reuse and Exchange is provided by the Andrew W. Mellon Foundation, Microsoft, and the National Science Foundation. Additional support is provided by the Coalition for Networked Information.
This document is based on meetings of the OAI-ORE Technical Committee (ORE-TC), with participation from the OAI-ORE Liaison Group (ORE-LG). Members of the ORE-TC are: Chris Bizer (Freie Universität Berlin), Les Carr (University of Southampton), Tim DiLauro (Johns Hopkins University), Leigh Dodds (Ingenta), David Fulker (UCAR), Tony Hammond (Nature Publishing Group), Pete Johnston (Eduserv Foundation), Richard Jones (Imperial College), Peter Murray (OhioLINK), Michael Nelson (Old Dominion University), Ray Plante (NCSA and National Virtual Observatory), Rob Sanderson (University of Liverpool), Simeon Warner (Cornell University), and Jeff Young (OCLC). Members of ORE-LG are: Leonardo Candela (DRIVER), Tim Cole (DLF Aquifer and UIUC Library), Julie Allinson (JISC), Jane Hunter (DEST), Savas Parastatidis (Microsoft), Sandy Payette (Fedora Commons), Thomas Place (DARE and University of Tilburg), Andy Powell (DCMI), and Robert Tansley (Google, Inc. and DSpace).
We also acknowledge comments from the OAI-ORE Advisory Committee (ORE-AC).
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.