Implementation Guidelines for the Open Archives Initiative Protocol for Metadata Harvesting
Guidelines for Aggregators, Caches and Proxies
Protocol Version 2.0 of 2002-06-14
Document Version 2005/01/19T19:27:00Z
Cornell University - Computer Science)
Herbert Van de Sompel (OAI Executive; Los Alamos National Laboratory - Research Library)
Michael Nelson (Old Dominion University - Computer Science)
Simeon Warner (Cornell University - Computer Science)
This document is one part of the Implementation Guidelines that accompany the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
It is expected that aggregators, caches, proxies and other third party repositories will emerge. While these services allow for sophisticated harvesting hierarchies and strategies, they also introduce a level of complexity not found in the simple service provider and data provider relationship. In particular, questions arise regarding identifier namespace and tracking the provenance of records through their travels.
Unique identifiers in OAI-PMH identify items within a repository. However, they may conform to a recognized URI scheme with greater scope. Harvesters should not assume any scope beyond the originating repository unless an identifier conforms to a recognized URI scheme.
There are three ways by which a repository can conclude that two harvested records originate from the same item:
request elements of the OAI-PMH responses which include the records are the same;
provenance containers of both records have the same entries for both the identifier and
Agents which re-export harvested records should do so with different
identifiers unless the metadata is unaltered and the original
identifier corresponds to a recognized URI scheme.
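The re-export rule above can be sketched as a small decision function. This is a minimal illustration, not part of the guidelines: the set of "recognized" URI schemes, the function name export_identifier, and the way a new local identifier is minted are all assumptions made for the example, and a real aggregator would apply its own policy.

```python
from urllib.parse import quote

# Assumption: which URI schemes count as "recognized" is a local policy
# decision; this set is purely illustrative.
RECOGNIZED_SCHEMES = {"oai", "info", "urn"}


def export_identifier(original_id: str, altered: bool, local_prefix: str) -> str:
    """Choose the identifier under which a harvested record is re-exported.

    The original identifier is kept only when the metadata is unaltered AND
    the identifier uses a recognized URI scheme; otherwise a new identifier
    is minted in the aggregator's own namespace.
    """
    scheme, sep, _rest = original_id.partition(":")
    has_recognized_scheme = bool(sep) and scheme.lower() in RECOGNIZED_SCHEMES

    if not altered and has_recognized_scheme:
        return original_id

    # Hypothetical minting rule: percent-encode the original identifier and
    # append it to the aggregator's prefix so the new identifier is distinct.
    return f"{local_prefix}:{quote(original_id, safe='')}"


if __name__ == "__main__":
    print(export_identifier("oai:r2:klik001", altered=False,
                            local_prefix="oai:agg.example.org"))
    print(export_identifier("oai:r2:klik001", altered=True,
                            local_prefix="oai:agg.example.org"))
```

Embedding the encoded original identifier in the minted one is only a convenience for debugging; the provenance information, not the identifier, is what should carry the record's origin.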
It is also recommended that all repositories re-exporting harvested
records use the repeatable
provenance containers to
provide provenance information.
It is recommended that third party repositories track the harvesting
and changes to records using
provenance containers, which may be included inside the optional
about parts of records.
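As an illustration of the recommendation above, the following sketch builds a provenance container that could be placed in the about part of a re-exported record. The namespace, element names (originDescription, baseURL, identifier, datestamp, metadataNamespace) and attributes (harvestDate, altered) reflect one reading of the OAI provenance schema and should be checked against the published XSD; the function name build_provenance and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the OAI provenance schema; verify against the XSD.
PROV_NS = "http://www.openarchives.org/OAI/2.0/provenance"


def build_provenance(base_url: str, identifier: str, datestamp: str,
                     metadata_namespace: str, harvest_date: str,
                     altered: bool) -> ET.Element:
    """Build a provenance element describing where a record was harvested from."""
    provenance = ET.Element(f"{{{PROV_NS}}}provenance")
    origin = ET.SubElement(
        provenance,
        f"{{{PROV_NS}}}originDescription",
        {"harvestDate": harvest_date, "altered": "true" if altered else "false"},
    )
    # Child elements, in the order assumed from the schema.
    for tag, text in (
        ("baseURL", base_url),
        ("identifier", identifier),
        ("datestamp", datestamp),
        ("metadataNamespace", metadata_namespace),
    ):
        ET.SubElement(origin, f"{{{PROV_NS}}}{tag}").text = text
    return provenance


if __name__ == "__main__":
    element = build_provenance(
        base_url="http://repository.example.org/oai",  # hypothetical source
        identifier="oai:r2:klik001",
        datestamp="2002-01-01",
        metadata_namespace="http://www.openarchives.org/OAI/2.0/oai_dc/",
        harvest_date="2002-02-02T14:10:02Z",
        altered=True,
    )
    print(ET.tostring(element, encoding="unicode"))
```

Because the container is repeatable, a record that passes through several aggregators could carry one such description per hop; this sketch builds only a single one.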
Datestamps are provided to support incremental harvesting and are specific to
a particular repository. Therefore, any service that re-exports harvested records
must not preserve datestamps but must instead use new, local datestamps. The
provenance container may be used to record datestamps acquired when the record was harvested.
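The datestamp rule above might look like this in an aggregator's ingest step. This is a sketch under stated assumptions: records are represented as plain dictionaries, and the key names datestamp and origin_datestamp as well as the function name restamp_for_export are invented for the example.

```python
from datetime import datetime, timezone
from typing import Optional


def restamp_for_export(harvested: dict, now: Optional[datetime] = None) -> dict:
    """Return a copy of a harvested record carrying a new, local datestamp.

    The source repository's datestamp is kept separately (here under the
    hypothetical key "origin_datestamp") so it can be reported as provenance
    information rather than exposed as the record's own datestamp.
    `now` must be timezone-aware if supplied; it defaults to the current UTC time.
    """
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    exported = dict(harvested)
    exported["origin_datestamp"] = harvested["datestamp"]
    exported["datestamp"] = moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    return exported


if __name__ == "__main__":
    record = {"identifier": "oai:r2:klik001", "datestamp": "2002-01-01"}
    print(restamp_for_export(record))
```

Using the local ingest time means a harvester that polls the aggregator incrementally sees the record as new or changed relative to the aggregator, regardless of when it changed at the source.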
Different repositories may use different granularities for datestamps. There is no support for multiple granularities within a single repository (although repositories must interpret arguments expressed in coarser granularities than the finest they support). An aggregator should use one consistent granularity, which need not reflect the datestamp granularity of the repositories from which records were harvested.
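To illustrate how a repository that stores datestamps at seconds granularity might interpret coarser, day-granularity arguments, here is a minimal sketch. The function name expand_argument is hypothetical, and treating a day-granularity upper bound as extending to the end of that day is an assumption of this example rather than something stated in this document.

```python
from datetime import datetime, timezone

DAY_FORMAT = "%Y-%m-%d"
SECONDS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def expand_argument(value: str, is_until: bool) -> datetime:
    """Convert a day- or seconds-granularity datestamp argument to a UTC datetime.

    A day-granularity lower bound maps to the start of the day; a
    day-granularity upper bound maps to the last second of the day
    (assumed interpretation, so the whole day is included).
    """
    if "T" in value:
        parsed = datetime.strptime(value, SECONDS_FORMAT)
    else:
        parsed = datetime.strptime(value, DAY_FORMAT)
        if is_until:
            parsed = parsed.replace(hour=23, minute=59, second=59)
    return parsed.replace(tzinfo=timezone.utc)


if __name__ == "__main__":
    print(expand_argument("2002-06-14", is_until=False))
    print(expand_argument("2002-06-14", is_until=True))
```

With both bounds expanded to the repository's finest granularity, selective harvesting can compare them directly against stored datestamps without special cases.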
Support for the development of the OAI-PMH and for other Open Archives Initiative activities comes from the Digital Library Federation, the Coalition for Networked Information, and from the National Science Foundation through Grant No. IIS-9817416. Individuals who have played a significant role in the development of OAI-PMH version 2.0 are acknowledged in the protocol document.
2005-01-19: HTML fixes and added Table of Contents.
2002-05-10: Revised recommendations for identifiers.
2002-03-31: Release of initial version of OAI-PMH v2.0 guidelines documents.