[OAI-implementers] Moving records in and out of sets

Hickey,Thom hickey@oclc.org
Thu, 23 Oct 2003 10:10:21 -0400

I think the point is not that the OAI identifier is a resource identifier,
but that it is an identifier for the metadata, and the correspondence with
the metadata should be as permanent as is practical.  Of course metadata for
a given resource might come from multiple sources, but for a given OAI
repository the id should be stable.

Currently with NDLTD, which harvests around two dozen thesis repositories,
we have been reusing the harvested IDs.  Since we harvest duplicate metadata
and are going to consolidate it into a 'work' record, we will have to start
assigning our own IDs (and try to keep them stable!).
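
To make the consolidation idea concrete, here is a minimal sketch of assigning a stable local ID to each 'work' while remembering which harvested OAI identifiers fed into it. Everything in it is an assumption for illustration, not a description of what NDLTD actually runs: the `WorkRegistry` class, the `oai:example.org:work-` prefix, the record fields, and especially the matching rule (same normalized author and title means same work).

```python
import itertools


class WorkRegistry:
    """Hypothetical registry that assigns a stable local ID per consolidated work."""

    def __init__(self, prefix="oai:example.org:work-"):
        self.prefix = prefix          # assumed ID namespace, not a real repository
        self.ids_by_key = {}          # work key -> local ID (would be persisted in practice)
        self.sources = {}             # local ID -> set of harvested OAI identifiers
        self._counter = itertools.count(1)

    def work_key(self, record):
        # Assumed matching rule: normalized author plus whitespace-collapsed title.
        author = record["author"].strip().lower()
        title = " ".join(record["title"].lower().split())
        return (author, title)

    def assign(self, record):
        """Return the local ID for this record's work, creating one only if needed."""
        key = self.work_key(record)
        if key not in self.ids_by_key:
            self.ids_by_key[key] = f"{self.prefix}{next(self._counter)}"
        local_id = self.ids_by_key[key]
        # Keep the harvested identifiers so the source records stay traceable.
        self.sources.setdefault(local_id, set()).add(record["oai_id"])
        return local_id


registry = WorkRegistry()
first = registry.assign(
    {"oai_id": "oai:a.example:1", "author": "Smith, J.", "title": "A  Thesis"}
)
duplicate = registry.assign(
    {"oai_id": "oai:b.example:9", "author": "smith, j. ", "title": "a thesis"}
)
other = registry.assign(
    {"oai_id": "oai:c.example:3", "author": "Doe, A.", "title": "Another Thesis"}
)
```

The local ID depends only on the work key, so re-harvesting the same record, or harvesting a duplicate from another repository, returns the ID that was already assigned instead of minting a new one.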


> -----Original Message-----
> From: Tim Brody [mailto:tdb01r@ecs.soton.ac.uk]
> Sent: Thursday, October 23, 2003 7:17 AM
> To: oai-implementers@oaisrv.nsdl.cornell.edu
> Subject: Re: [OAI-implementers] Moving records in and out of sets
> Greg Lindahl wrote:
> >On Tue, Oct 21, 2003 at 09:33:59AM -0400, Caroline Arms wrote:
> >
> >>I'd like to concur with Thom that deletion/creation with a new ID would
> >>"be a cure worse than the problem it is solving."  Records for OAI are not
> >>usually managed independently.  The record IDs may play a role in managing
> >>the content or be generated outside the OAI repository.
> >>
> >
> >Allow me to second that, but from another side. One way that I'd like
> >to use OAI is to take records originated at other sites, and add
> >metadata. I'd rather have records disappear and reappear silently
> >than to change IDs, because a changed ID means that my added metadata
> >is lost. But an ideal solution would have permanent IDs, and correctly
> >include information about disappearing and appearing records in
> >incremental updates.
> >
> This discussion is related to a previous one on this list:
> http://www.openarchives.org/pipermail/oai-implementers/2003-Ap

Quoting Andy: "The item identifier is not the same as the resource 
identifier - because the item is not the same as the resource."

Therefore an OAI harvester should not rely upon the OAI identifier being 
in any way persistent for the resource. In a distributed system - 
especially with author-contributed resources - it is likely that there 
will be dupes, revisions, parts under multiple OAI identifiers and 
multiple repositories using the same OAI identifiers for different 

It is the *metadata* that describes the resource, not the OAI header, 
and it is the metadata that should contain a persistent, globally unique 
identifier for that resource (e.g. a DOI in DC.identifier).

I can't think of another solution to this set maintenance bug than 
"changing id" (flagging the old OAI item deleted, creating a new OAI 
item). Creating new sets would seem to create more problems than it 
would solve (which is moving the goal-posts on harvesters).
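
As a rough illustration of what "changing id" looks like from the harvesting side, here is a minimal sketch of applying one incremental harvest to a local store keyed by OAI identifier. The dict layout, the `apply_updates` function, and the example identifiers are all assumptions made for the sketch; the only protocol detail relied on is that an OAI-PMH record header can carry a status of "deleted".

```python
def apply_updates(store, updates):
    """Apply one incremental harvest to a local store keyed by OAI identifier.

    `store` maps identifier -> metadata dict; `updates` is a list of dicts with
    an "identifier", an optional "status", and (for live records) "metadata".
    This layout is hypothetical, chosen only to keep the example small.
    """
    for record in updates:
        if record.get("status") == "deleted":
            # The repository flagged the old item as deleted: drop our copy.
            store.pop(record["identifier"], None)
        else:
            # New or changed item: insert or overwrite under its identifier.
            store[record["identifier"]] = record["metadata"]
    return store


# A record moves between sets by being deleted and re-created under a new ID.
store = {"oai:example.org:1": {"title": "A Thesis", "setSpec": "old-set"}}
updates = [
    {"identifier": "oai:example.org:1", "status": "deleted"},
    {
        "identifier": "oai:example.org:2",
        "metadata": {"title": "A Thesis", "setSpec": "new-set"},
    },
]
apply_updates(store, updates)
```

The harvester needs nothing beyond ordinary delete handling, but anything it had attached locally to the old identifier is no longer reachable through the new one.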

All the best,
