ORE User Guide

Abstract

Open Archives Initiative Object Reuse and Exchange (OAI-ORE) defines standards for the description and exchange of aggregations of Web resources. This document provides a brief overview of the abstract data model underlying these standards, serializations, implementation with HTTP, and discovery. This user guide is one of several documents comprising the OAI-ORE specification and user guide. It is intended as the starting point for first-time readers.

1. Introduction

The ORE Model makes it possible to associate an identity with aggregations of web resources and to describe their structure and semantics. It does this by introducing the Resource Map (ReM), which is a resource identified by a URI (say ReM-1) that encapsulates a set of RDF statements. These statements instantiate an aggregation as a resource with a URI, enumerate the constituents of the aggregation, the relationships among those constituents, and the Web context of the aggregation. The ORE Model can be serialized in a variety of formats which will be described, along with mappings of ORE Model concepts, in companion ORE documents. The primary serialization is Atom [ORE Atom User Guide, ORE Atom Profile]. Direct RDF serialization is described in [Representing Resource Maps Using RDF Syntaxes].

2. Foundations

2.1 Web Architecture

A full description of Web Architecture concepts is contained in [Web Architecture]. For the remainder of this document, the use of the following terms from the Web Architecture SHOULD be interpreted as briefly summarized below:

Resource - an item of interest.
URI - a uniform global identifier for a Resource [URI]. This document specializes this with the notion of a Protocol-Based URI, which is a URI that can be dereferenced via a common protocol to provide access to a Representation. The most common such protocol in the current implementation of the Web is HTTP [RFC2616].
Representation - a data stream corresponding to the state of a Resource at the time of a dereference of its protocol-based URI. The Web Architecture allows for multiple Representations of a Resource with access mediated by Content Negotiation.
Link - a directed connection between two Resources. In most common usage, a link is expressed via link or anchor tags (a hyperlink) in an HTML Representation of the originating Resource to the URI of another Resource. An extension of this, where links are typed relationships, is explained below.

The combination of these concepts forms what is commonly referred to as the Web Graph, with nodes that are URIs (which identify Resources), from which Representations are made available, and edges that are Links. An example of a Web graph is shown below. Note that this example shows that the Web graph is not necessarily connected - nodes O and P link to each other but not to other nodes in the graph.

Depiction of general web graph
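
As a rough illustration of the Web Graph described above, the following Python sketch stores links as an adjacency mapping and finds the nodes connected to a given node. The node names and links (other than O and P linking to each other) are assumptions made for this example, not the contents of the figure.

```python
# Hypothetical Web graph: each key is a URI (node) and each value is the
# set of URIs it links to (edges). Only the O/P pair comes from the text.
web_graph = {
    "A": {"B", "C"},
    "B": {"C"},
    "C": set(),
    "O": {"P"},
    "P": {"O"},
}


def reachable(graph, start):
    """Return every node connected to `start`, ignoring link direction."""
    # Build an undirected view so that an inbound link also counts.
    neighbours = {node: set(targets) for node, targets in graph.items()}
    for node, targets in graph.items():
        for target in targets:
            neighbours.setdefault(target, set()).add(node)
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


component_of_o = reachable(web_graph, "O")
component_of_a = reachable(web_graph, "A")
```

Here component_of_o contains only O and P, which reflects the observation that the Web graph need not be connected.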

2.2 Semantic Web and RDF

This specification also leverages Semantic Web concepts from RDF [RDF Concepts]. In RDF, Resources are described using sets of triples, each made up of three parts: a subject, a predicate and an object. The subject is a URI that identifies the described Resource; the object is either the URI of a second Resource or a literal that identifies values such as numbers and dates by means of a lexical representation; and the predicate is a URI that identifies a type of relationship. Each triple states that a relationship of the type indicated by the predicate (a URI) holds between the Resource identified by the subject (a URI) and the object (a URI or a literal).

A set of RDF triples is referred to as an RDF Graph because it can be represented as a node and directed-arc diagram, in which each triple is represented as a node-arc-node link. The nodes of an RDF Graph are the subjects and objects of the constituent triples. In an RDF Graph each node is connected to at least one other node in the graph.

Note: this is a slight simplification of the RDF model because it ignores the concept of "blank nodes". The ORE Model does not make use of "blank nodes" and they are not discussed further in this document.

An example of an RDF Graph is shown in the figure below. As shown, the subject and predicate of a triple are always URIs (each URI is indicated by the text in a yellow circle and shown with bracketed syntax <A> in the table) and the object may be a URI or a literal (a literal is shown as a blue rounded rectangle in the graph and in quotations in the table).

Example RDF graph and triples

Example triples in this document are shown in Notation3 [N3] format, e.g.

<URI-1>  rdf:type    <T-1>.
<URI-1>  dc:creator  "Joe Bloggs".

This means that the resource URI-1 has the type denoted by the URI T-1 and was created by Joe Bloggs.
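
For readers who prefer code, the same two triples can be sketched as plain Python tuples. This is a minimal illustration, not an RDF library; the helper names is_literal, nodes, and objects are assumptions introduced for this example.

```python
# Each triple is a (subject, predicate, object) tuple. URIs are written in
# angle brackets or as prefixed names; literals keep their double quotes,
# mirroring the N3 snippet above.
triples = {
    ("<URI-1>", "rdf:type", "<T-1>"),
    ("<URI-1>", "dc:creator", '"Joe Bloggs"'),
}


def is_literal(term):
    """Treat a term as a literal if it is wrapped in double quotes."""
    return term.startswith('"') and term.endswith('"')


def nodes(graph):
    """The nodes of an RDF Graph are the subjects and objects of its triples."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}


def objects(graph, subject, predicate):
    """Return every object related to `subject` by `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


creators = objects(triples, "<URI-1>", "dc:creator")
```

A set is used because an RDF Graph is a set of triples: stating the same triple twice adds nothing.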

A number of examples in this document are shown in N3. Because this document is now the primer, it would be better to recast all snippets as pictures with tables of triples where necessary (in the manner of the Data Model, but with these simpler cases).

Another tool from the Semantic Web, the RDF Vocabulary Description Language [RDFS], provides the mechanisms to define vocabularies for the types of these relationships. In combination with the RDF-defined relationship rdf:type, this vocabulary makes it possible to express types for Resources. The figure below shows an example of this. As shown, the objects of the triples with rdf:type predicates are URIs that denote classes or types.

Use of rdf:type

2.3 Named Graphs

Finally, this specification builds on the notion of a Named Graph [Named Graph], which extends RDF to allow the association of a name - a URI - with a set of triples - a graph. A number of aspects of Named Graphs are shown in the figure below.

The Named Graph is a Resource, identified by a URI. That URI can be the subject or object of triples. These triples can, for example, express a type for the Named Graph, or associate metadata properties (e.g., dc:creator) with the Named Graph. The figure shows a graph that represents three triples in which the URI of the Named Graph NG-A occurs as subject or object.
The Named Graph is NOT the RDF Graph itself. Instead it is a Resource with a Representation that encodes a set of triples that form the graph. The relationship between the Named Graph and the RDF Graph that its Representations encode is defined via a function rdfgraph. The semantics of a Named Graph are similar to RDF reification [RDF Semantics], in that they allow the assertion of relationships between other Resources and the set of triples. This provides the basis for signing, authority, and trust, which is relevant in this specification because Named Graphs are used to provide descriptions of citable intellectual objects.

Named Graph
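
The distinction between a Named Graph and the RDF Graph it encodes can be sketched in Python as follows. This is only an illustration under stated assumptions: the NamedGraph class, the example triples, and the literal values are hypothetical, and rdfgraph here is a plain function standing in for the function of the same name mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamedGraph:
    """A Resource with its own URI whose Representation encodes triples."""
    uri: str
    triples: frozenset


def rdfgraph(named_graph):
    """Map a Named Graph to the RDF Graph (set of triples) it encodes."""
    return named_graph.triples


# Hypothetical content of the graph named NG-A.
ng_a = NamedGraph(
    uri="<NG-A>",
    triples=frozenset({("<A>", "dc:title", '"An example resource"')}),
)

# Statements ABOUT the Named Graph use its URI as subject or object;
# they are not part of the graph that NG-A itself encodes.
about_ng_a = {
    ("<NG-A>", "dc:creator", "<http://example.org/joebloggs>"),
    ("<NG-A>", "rdf:type", "<T-1>"),
}
```

Keeping the URI and the triples as separate fields mirrors the point that the name identifies a Resource, while the graph is obtained from it through rdfgraph.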

2.4 Namespaces and Vocabularies

The ORE Model uses predicates from a number of vocabularies, including one specific to ORE which is described in the ORE Vocabulary specification. In these specifications we use the following namespace prefixes.

Prefix	Namespace URI	Description
`dc`	`http://purl.org/dc/elements/1.1/`	Dublin Core elements
`dcterms`	`http://purl.org/dc/terms/`	Dublin Core terms
`ore`	`http://www.openarchives.org/ore/terms/`	ORE vocabulary terms
`owl`	`http://www.w3.org/2002/07/owl#`	OWL vocabulary terms
`rdf`	`http://www.w3.org/1999/02/22-rdf-syntax-ns#`	RDF vocabulary terms
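
A prefixed name such as ore:aggregates is shorthand for the namespace URI followed by the local name. The following sketch shows that expansion using the prefixes from the table; the function name expand is an assumption made for this example.

```python
# Namespace prefixes copied from the table above.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "ore": "http://www.openarchives.org/ore/terms/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(prefixed_name):
    """Expand a prefixed name like 'rdf:type' into a full URI."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local


aggregates_uri = expand("ore:aggregates")
```

With this table, expand("ore:aggregates") gives http://www.openarchives.org/ore/terms/aggregates.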

3. Data Model

3.1 Aggregation

A Resource Map describes an Aggregation, which is a set of resources, and may also describe the types of those resources and the relationships among them. Resources in the Aggregation are called Aggregated Resources.

In order to be able to talk about the Aggregation on the web, it must have a URI (say A-1). The ORE Model requires that a Resource Map describe just one Aggregation. There may be multiple Resource Maps in different formats that describe the same Aggregation. So that applications and clients can reference the Aggregation in an actionable fashion, the URI A-1 must yield or lead to the Resource Map when dereferenced. This is likely to be achieved in one of two ways:

The URI of the Aggregation A-1 may be constructed by appending a fragment identifier #aggregation to the Resource Map URI ReM-1. For example, the Resource Map available from the URI http://sample.org/ReM-1 might describe the Aggregation http://sample.org/ReM-1#aggregation. This syntactic trick allows the creation of an Aggregation URI A-1 that correctly yields the corresponding Resource Map without the need for any infrastructure beyond a web server to return the Resource Map from URI ReM-1.
In applications where there is more control over the web infrastructure, or where it is desirable to serve Resource Maps in multiple formats, content negotiation or 303-style redirection may be used to link the Aggregation URI A-1 to the Resource Map. This is described in detail in FIXME_WHERE_IS_THIS_DESCRIBED. For example, the Aggregation URI A-1 might be http://sample.org/A-1, which yields http://sample.org/A-1.xml or http://sample.org/A-1.rdf depending on whether content negotiation selects the Atom or the RDF/XML serialization.
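
The fragment identifier approach can be sketched with Python's standard urllib.parse module. A client strips the fragment before making a request, so dereferencing the Aggregation URI retrieves the Resource Map. The two helper function names are assumptions made for this example.

```python
from urllib.parse import urldefrag


def make_aggregation_uri(rem_uri, fragment="aggregation"):
    """Build an Aggregation URI by appending a fragment to the ReM URI."""
    return f"{rem_uri}#{fragment}"


def resource_map_uri_for(agg_uri):
    """Return the URI actually requested when `agg_uri` is dereferenced."""
    # urldefrag splits a URI into the part before '#' and the fragment.
    base, _fragment = urldefrag(agg_uri)
    return base


a_1 = make_aggregation_uri("http://sample.org/ReM-1")
rem_1 = resource_map_uri_for(a_1)
```

Here a_1 is http://sample.org/ReM-1#aggregation and rem_1 is http://sample.org/ReM-1, so no server-side redirection is needed for this approach.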

3.2 Resource Map

A Resource Map is obtained as a representation of the resource identified by the protocol-based URI ReM-1. The following figure shows a complete Resource Map with statements indicated as arrows from subject resource to object resource or literal. The remainder of this section explains the components of this graph step-by-step.

A complete Resource Map

The figure must be updated to use the ore:similarTo predicate instead of owl:analogousTo.

The Resource Map is identified by ReM-1 and an HTTP GET on ReM-1 must yield a serialization of the Resource Map. Note also that ReM-1 appears as a node in the figure and is the subject of several triples. First, there must be triples providing the type of the Resource Map, the type of the Aggregation, and linking the Resource Map to the Aggregation that it describes:

# mandatory, ReM-1 is a Resource Map (shown as T-1)
<ReM-1>  rdf:type            ore:ResourceMap.

# mandatory, A-1 is an Aggregation (shown as T-2)
<A-1>    rdf:type            ore:Aggregation.

# mandatory, ReM-1 describes A-1
<ReM-1>  ore:describes       <A-1>.

Some metadata about the Resource Map is mandatory, and additional metadata may optionally be expressed:

# mandatory: authoring authority and modification time of ReM
<ReM-1>  dc:creator          <http://example.org/joebloggs>.
<ReM-1>  dcterms:modified    "2007-10-15T00:00:00Z".

# optional: rights pertaining to and original creation time of ReM
<ReM-1>  dc:rights           <http://creativecommons.org/licenses/publicdomain/>.
<ReM-1>  dcterms:created     "2007-10-15T00:00:00Z".
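
The mandatory statements listed so far lend themselves to a simple check. The sketch below is an illustration only: it works on tuples rather than a parsed serialization, and the function name missing_mandatory and its messages are assumptions, not part of the ORE specifications.

```python
# The mandatory triples from the examples above, as (s, p, o) tuples.
rem = {
    ("<ReM-1>", "rdf:type", "ore:ResourceMap"),
    ("<A-1>", "rdf:type", "ore:Aggregation"),
    ("<ReM-1>", "ore:describes", "<A-1>"),
    ("<ReM-1>", "dc:creator", "<http://example.org/joebloggs>"),
    ("<ReM-1>", "dcterms:modified", '"2007-10-15T00:00:00Z"'),
}


def missing_mandatory(triples, rem_uri):
    """Return a list of mandatory items that `triples` fails to state."""
    problems = []
    if (rem_uri, "rdf:type", "ore:ResourceMap") not in triples:
        problems.append("Resource Map type")
    described = {o for s, p, o in triples
                 if s == rem_uri and p == "ore:describes"}
    # A Resource Map describes just one Aggregation.
    if len(described) != 1:
        problems.append("exactly one ore:describes")
    for agg in sorted(described):
        if (agg, "rdf:type", "ore:Aggregation") not in triples:
            problems.append(f"Aggregation type for {agg}")
    for predicate in ("dc:creator", "dcterms:modified"):
        if not any(s == rem_uri and p == predicate for s, p, _ in triples):
            problems.append(predicate)
    return problems


complete_report = missing_mandatory(rem, "<ReM-1>")
```

An empty list means every mandatory statement was found; optional statements such as dc:rights and dcterms:created are deliberately not checked.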

If the Aggregation denotes an information object that has other identifiers then these are expressed using the ore:similarTo predicate:

<A-1>    ore:similarTo       <DOI-1>.

In the particular case where the ORE Aggregation is also identified by another URI, the owl:sameAs predicate may be used.

All of the Aggregated Resources are linked to the Aggregation with the ore:aggregates predicate:

<A-1>    ore:aggregates      <AR-1>.
<A-1>    ore:aggregates      <AR-2>.
<A-1>    ore:aggregates      <AR-3>.

Thus far, the Aggregation is just a bag of resources, AR-1, AR-2, and AR-3, unrelated except for their status as constituents of the Aggregation. A Resource Map may also describe the structure of the Aggregation by expressing internal relationships between the Aggregation and/or Aggregated Resources, for example:

# shown as R-1
<AR-2>   dcterms:hasFormat   <AR-3>.

Finally, the Resource Map may include two types of external relationships: 1) Semantic types may be associated with the Aggregation and/or the Aggregated Resources using the rdf:type predicate. 2) The context of the Aggregation among other resources may be expressed using predicates in any vocabulary, provided either the subject or the object is the Aggregation or an Aggregated Resource.

# A-1 has type T-4 (journal article perhaps) and is part of resource A
<A-1>    rdf:type            <T-4>.
<A-1>    dcterms:isPartOf    <A>.

# AR-1 references B (perhaps another article); AR-3 has type Text
<AR-1>   dcterms:references  <B>.
<AR-3>   rdf:type            <http://purl.org/dc/dcmitype/Text>.
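
The difference between membership, internal relationships, and external relationships can be sketched by sorting triples according to whether their subject and object fall inside the Aggregation. This is a simplified reading of the text above; the function name classify is an assumption, and hasFormat is written with the dcterms prefix because that property is defined in the DCMI Terms namespace.

```python
# Triples from the examples in this section, as (s, p, o) tuples.
rem = {
    ("<A-1>", "ore:aggregates", "<AR-1>"),
    ("<A-1>", "ore:aggregates", "<AR-2>"),
    ("<A-1>", "ore:aggregates", "<AR-3>"),
    ("<AR-2>", "dcterms:hasFormat", "<AR-3>"),
    ("<A-1>", "rdf:type", "<T-4>"),
    ("<A-1>", "dcterms:isPartOf", "<A>"),
    ("<AR-1>", "dcterms:references", "<B>"),
    ("<AR-3>", "rdf:type", "<http://purl.org/dc/dcmitype/Text>"),
}


def classify(triples, aggregation):
    """Split triples into members, internal and external relationships."""
    members = {o for s, p, o in triples
               if s == aggregation and p == "ore:aggregates"}
    inside = members | {aggregation}
    internal, external = set(), set()
    for s, p, o in triples:
        if p == "ore:aggregates":
            continue  # membership itself is reported via `members`
        if s in inside and o in inside:
            internal.add((s, p, o))
        elif s in inside or o in inside:
            external.add((s, p, o))
    return members, internal, external


members, internal, external = classify(rem, "<A-1>")
```

In this data the only internal relationship is the hasFormat statement between AR-2 and AR-3; the type and context statements are external because one end lies outside the Aggregation.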

3.3 Relationships to other Aggregations

When reusing Resource Maps and the Aggregations that they describe, it is important to remember the distinction between these two concepts. Statements about ReM-1 are statements about the Resource Map and not the Aggregation; statements about A-1 are statements about the intellectual object that is the Aggregation.

An Aggregated Resource may be aggregated in more than one Aggregation (say A-1 and A-2). The predicate ore:isAggregatedBy is the inverse of ore:aggregates and allows membership in another Aggregation to be expressed.

# Creator of ReM-1 knows AR-1 is aggregated by A-2 as well as A-1
<AR-1>   ore:isAggregatedBy  <A-2>.

It is expected that a Resource Map describing the Aggregation A-2 can be obtained when A-2 is dereferenced.

A second use of ore:isAggregatedBy is to indicate nesting, where one Aggregation is an Aggregated Resource in another Aggregation. Imagine that A-1 is a journal article which is part of a journal issue (Aggregation A-3). This context can be expressed in ReM-1 with the following triple.

# ReM-1 indicates that aggregation A-1 is aggregated by A-3
<A-1>    ore:isAggregatedBy  <A-3>.
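
Because ore:isAggregatedBy is the inverse of ore:aggregates, a consumer could derive one form of the statement from the other. The sketch below illustrates that idea on tuples; the function name with_inverses is an assumption, and whether an application should materialize such statements is a design choice that this guide does not prescribe.

```python
INVERSE = {
    "ore:aggregates": "ore:isAggregatedBy",
    "ore:isAggregatedBy": "ore:aggregates",
}


def with_inverses(triples):
    """Return `triples` plus the inverse of each membership statement."""
    derived = {(o, INVERSE[p], s) for s, p, o in triples if p in INVERSE}
    return set(triples) | derived


# Membership statements from ReM-1 in the examples above.
rem_1 = {
    ("<A-1>", "ore:aggregates", "<AR-1>"),
    ("<AR-1>", "ore:isAggregatedBy", "<A-2>"),
    ("<A-1>", "ore:isAggregatedBy", "<A-3>"),
}

closure = with_inverses(rem_1)
```

After this step, closure also contains the statements that A-2 aggregates AR-1 and that A-3 aggregates A-1, which makes the nesting of A-1 inside A-3 visible from either direction.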

4. Serialization

Need to write serialization section of primer.

5. HTTP implementation

Need to write HTTP implementation section of primer.

6. Discovery

Need to write discovery section of primer.

7. References

[N3]: http://www.w3.org/DesignIssues/Notation3.html
[ORE Data Model]: ORE Specification - Abstract Data Model, Carl Lagoze, Herbert Van de Sompel, Pete Johnston, Michael Nelson, Robert Sanderson, Simeon Warner (editors), 2008-02-26. Available at http://www.openarchives.org/ore/0.3/datamodel
[ORE Atom User Guide]: ORE User Guide - Resource Map Implementation in Atom, Carl Lagoze, Herbert Van de Sompel, Pete Johnston, Michael Nelson, Robert Sanderson, Simeon Warner (editors), 2008-02-29. Available at http://www.openarchives.org/ore/0.3/atom-implementation
[ORE Atom Profile]: ORE Specification - Resource Map Profile of Atom, Carl Lagoze, Herbert Van de Sompel, Pete Johnston, Michael Nelson, Robert Sanderson, Simeon Warner (editors), 2008-02-28. Available at http://www.openarchives.org/ore/0.3/atom
[Representing Resource Maps Using RDF Syntaxes]: ORE User Guide - Representing Resource Maps Using RDF Syntaxes, Carl Lagoze, Herbert Van de Sompel, Pete Johnston, Michael Nelson, Robert Sanderson, Simeon Warner (editors), 2008-02-29. Available at http://www.openarchives.org/ore/0.3/rdfsyntax
[Web Architecture]: Architecture of the World Wide Web, Volume One, I. Jacobs and N. Walsh, Editors, World Wide Web Consortium, 15 January 2004.

Date	Editor	Description
2008-04-02	simeon	public alpha 0.3 release
2008-03-02	simeon	public alpha 0.2 release
2008-01-08	simeon	correct N3 example
2007-12-10	simeon	public alpha 0.1 release
2007-10-15	simeon	alpha release to ORE-TC

ORE User Guide - Primer

2 April 2008