Implementation Guidelines for the Open Archives Initiative Protocol for Metadata Harvesting

Guidelines for Aggregators, Caches and Proxies

Protocol Version 2.0 of 2002-06-14
Document Version 2005/01/19T19:27:00Z
http://www.openarchives.org/OAI/2.0/guidelines-aggregator.htm

Editors

Carl Lagoze (OAI Executive; Cornell University - Computer Science)
Herbert Van de Sompel (OAI Executive; Los Alamos National Laboratory - Research Library)
Michael Nelson (Old Dominion University - Computer Science)
Simeon Warner (Cornell University - Computer Science)

This document is one part of the Implementation Guidelines that accompany the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).

Table of Contents

1. Introduction
2. Identifiers
3. Provenance
4. Datestamps
Acknowledgements
Document History

1. Introduction

It is expected that aggregators, caches, proxies and other third-party repositories will emerge. While these services allow for sophisticated harvesting hierarchies and strategies, they also introduce a level of complexity not found in the simple service provider and data provider relationship. In particular, questions arise regarding identifier namespaces and how to track the provenance of records as they pass from one repository to another.

2. Identifiers

Unique identifiers in OAI-PMH identify items within a repository. However, they may conform to a recognized URI scheme with greater scope. Harvesters should not assume any scope beyond the originating repository unless an identifier conforms to a recognized URI scheme.

There are three ways by which a repository can conclude that two harvested records originate from the same item:

Agents which re-export harvested records should do so with different identifiers unless the metadata is unaltered and the original identifier corresponds to a recognized URI scheme. It is also recommended that all repositories re-exporting harvested records use the repeatable provenance containers to provide provenance information.
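
The re-export rule above can be sketched in Python as follows. This is a minimal illustration, not a normative algorithm: the set RECOGNIZED_SCHEMES, the function names and the mint_local_id callback are all hypothetical, and each aggregator has to decide for itself which URI schemes it treats as recognized.

```python
# Assumption: which schemes count as "recognized" is a local policy choice.
RECOGNIZED_SCHEMES = frozenset({"info", "urn"})


def has_recognized_scheme(identifier, schemes=RECOGNIZED_SCHEMES):
    """Return True if the identifier starts with one of the given URI schemes."""
    scheme, sep, _ = identifier.partition(":")
    return bool(sep) and scheme.lower() in schemes


def reexport_identifier(original_id, altered, mint_local_id,
                        schemes=RECOGNIZED_SCHEMES):
    """Choose the identifier under which a harvested record is re-exported.

    The original identifier is kept only if the metadata is unaltered AND
    the identifier conforms to a recognized URI scheme. In every other case
    a new identifier is minted by the (hypothetical) mint_local_id callback.
    """
    if not altered and has_recognized_scheme(original_id, schemes):
        return original_id
    return mint_local_id(original_id)


if __name__ == "__main__":
    # Hypothetical minting policy for an aggregator at aggregator.example.org.
    def mint(original_id):
        return "oai:aggregator.example.org:" + original_id.replace(":", "_")

    print(reexport_identifier("urn:example:item-1", False, mint))
    print(reexport_identifier("urn:example:item-1", True, mint))
```

Whichever identifier is chosen, the provenance container remains the place to record where the record came from.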

3. Provenance

It is recommended that third-party repositories track the harvesting of, and changes to, records by using provenance containers, which may be included inside the optional about parts of metadata records.
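
As a hedged illustration, the following Python sketch builds a provenance container with the standard library. The element and attribute names (provenance, originDescription, harvestDate, altered, baseURL, identifier, datestamp, metadataNamespace) and the namespace URI are not given in this document; they reflect the OAI provenance schema as understood here and should be checked against that schema before use. The function name and all example values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of the OAI provenance schema.
PROV_NS = "http://www.openarchives.org/OAI/2.0/provenance"


def build_provenance(base_url, identifier, datestamp, metadata_namespace,
                     harvest_date, altered):
    """Build a provenance element describing one harvesting step."""
    provenance = ET.Element(f"{{{PROV_NS}}}provenance")
    origin = ET.SubElement(
        provenance,
        f"{{{PROV_NS}}}originDescription",
        {"harvestDate": harvest_date,
         "altered": "true" if altered else "false"},
    )
    # Details of the record as it was exposed by the source repository.
    for tag, text in (
        ("baseURL", base_url),
        ("identifier", identifier),
        ("datestamp", datestamp),
        ("metadataNamespace", metadata_namespace),
    ):
        ET.SubElement(origin, f"{{{PROV_NS}}}{tag}").text = text
    return provenance


if __name__ == "__main__":
    prov = build_provenance(
        base_url="http://repo.example.org/oai",        # hypothetical source
        identifier="oai:repo.example.org:item-1",      # hypothetical identifier
        datestamp="2002-01-01",
        metadata_namespace="http://www.openarchives.org/OAI/2.0/oai_dc/",
        harvest_date="2002-02-02T14:10:02Z",
        altered=False,
    )
    print(ET.tostring(prov, encoding="unicode"))
```

The resulting element would be placed inside an about part of the re-exported record.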

4. Datestamps

Datestamps are provided to support incremental harvesting and are specific to a particular repository. Therefore, any service that re-exports harvested records must not preserve the original datestamps but must instead assign new, local datestamps. The provenance container may be used to record the datestamp a record carried when it was harvested.
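
A minimal sketch of this re-stamping step, assuming a simple in-memory record type. The Record class, its field names and the restamp_for_reexport function are hypothetical and not part of the protocol.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Record:
    identifier: str
    datestamp: str                           # datestamp exposed to harvesters
    origin_datestamp: Optional[str] = None   # value to carry in provenance


def restamp_for_reexport(record, now=None):
    """Return a copy of the record with a new, local datestamp.

    The datestamp assigned by the source repository is not re-exported as
    the record's datestamp. It is kept separately so that it can be written
    into the provenance container.
    """
    now = now or datetime.now(timezone.utc)
    local = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return replace(record, datestamp=local, origin_datestamp=record.datestamp)


if __name__ == "__main__":
    harvested = Record("oai:repo.example.org:item-1", "2002-01-01")
    print(restamp_for_reexport(harvested))
```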

Different repositories may use different granularities for datestamps. There is no support for multiple granularities within a single repository (although repositories must interpret arguments expressed in coarser granularities than the finest they support). An aggregator should use one consistent granularity, which need not reflect the datestamp granularity of the repositories from which records were harvested.
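
One way an aggregator that uses seconds granularity might interpret from and until arguments given at day granularity is sketched below. The function name parse_bound is hypothetical, and expanding a day-granularity until value to the last second of that day is an assumption based on the protocol treating both bounds as inclusive.

```python
from datetime import datetime, timedelta, timezone

DAY_FORMAT = "%Y-%m-%d"
SECONDS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_bound(value, is_until=False):
    """Parse a from/until argument given at day or seconds granularity.

    Seconds-granularity values are returned as given. Day-granularity
    values are expanded to the first second of the day (from) or the
    last second of the day (until).
    """
    try:
        return datetime.strptime(value, SECONDS_FORMAT).replace(tzinfo=timezone.utc)
    except ValueError:
        # Not seconds granularity, so try day granularity. A value that
        # matches neither format raises ValueError to the caller.
        day = datetime.strptime(value, DAY_FORMAT).replace(tzinfo=timezone.utc)
    if is_until:
        return day + timedelta(days=1) - timedelta(seconds=1)
    return day


if __name__ == "__main__":
    print(parse_bound("2002-06-14"))
    print(parse_bound("2002-06-14", is_until=True))
    print(parse_bound("2002-06-14T12:00:00Z"))
```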

Acknowledgements

Support for the development of the OAI-PMH and for other Open Archives Initiative activities comes from the Digital Library Federation, the Coalition for Networked Information, and from the National Science Foundation through Grant No. IIS-9817416. Individuals who have played a significant role in the development of OAI-PMH version 2.0 are acknowledged in the protocol document.

Document History

2005-01-19: HTML fixes and added Table of Contents.
2002-05-10: Revised recommendations for identifiers.
2002-03-31: Release of initial version of OAI-PMH v2.0 guidelines documents.