[OAI-implementers] Clarification on deleted records

Simeon Warner simeon@cs.cornell.edu
Fri, 26 Apr 2002 15:36:47 -0400 (EDT)


I can't see a reply to this; apologies if these questions were already
answered.

On Wed, 10 Apr 2002, Leigh Dodds wrote:
> I'm not clear on the following wording in the specification, could 
> someone clarify this for me?
> 
> "When returning a harvested record or identifier of a record, the  
> ListRecords and ListIdentifiers service requests may indicate a status 
> of "deleted".  This status means that an item has been deleted and 
> therefore no record can be disseminated from it.   The length of time 
> that a given repository keeps track of deleted items is not defined by 
> the protocol.  Therefore, the only guaranteed method in the protocol 
> for determining whether a record can be returned by a repository 
> (its corresponding item still exists) is through the GetRecord service 
> request."
> 
> What I'm not sure about is why the only guaranteed method 
> for determining whether a record can be returned is GetRecord.
> The second sentence (reading "This status...") notes that if 
> the status is deleted, then no record can be returned. So we 
> already have a definitive answer don't we? 

GetRecord and ListMetadataFormats are the only ways to request
information about a specific item (they have identifier as an
argument).

If a repository does not keep track of 'deleted' items and update
their datestamps accordingly, then nothing will show up in ListRecords
or ListIdentifiers when an item is deleted.
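
The difference can be illustrated with a toy model. This is only a sketch
under stated assumptions: ToyRepository, its methods, and the identifier
"oai:example:1" are hypothetical names invented for illustration, and the
class is an in-memory stand-in rather than a real OAI implementation.

```python
from datetime import date


class ToyRepository:
    """Hypothetical in-memory repository; not a real OAI implementation."""

    def __init__(self, track_deletions):
        self.track_deletions = track_deletions
        self.items = {}  # identifier -> (datestamp, status)

    def add(self, identifier, when):
        self.items[identifier] = (when, "active")

    def delete(self, identifier, when):
        if self.track_deletions:
            # Keep a tombstone and move the datestamp to the deletion date.
            self.items[identifier] = (when, "deleted")
        else:
            # Forget the item entirely; no trace is left for harvesters.
            del self.items[identifier]

    def list_identifiers(self, from_date):
        """Rough analogue of a date-selective ListIdentifiers request."""
        return sorted(
            (identifier, status)
            for identifier, (stamp, status) in self.items.items()
            if stamp >= from_date
        )

    def get_record(self, identifier):
        """Rough analogue of GetRecord: status string, or None if unknown."""
        entry = self.items.get(identifier)
        return None if entry is None else entry[1]


tracking = ToyRepository(track_deletions=True)
forgetful = ToyRepository(track_deletions=False)
for repo in (tracking, forgetful):
    repo.add("oai:example:1", date(2002, 4, 1))
    repo.delete("oai:example:1", date(2002, 4, 20))

since = date(2002, 4, 10)
print(tracking.list_identifiers(since))       # [('oai:example:1', 'deleted')]
print(forgetful.list_identifiers(since))      # []
print(forgetful.get_record("oai:example:1"))  # None
```

In the second repository an incremental harvest sees nothing at all, so
asking about the specific identifier is the only way to learn that the
item is gone.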
 
> Digging further I see that the GetRecord schema notes that a 
> record *can* be returned with a deleted status -- which seems 
> contrary to the above? 

Yes, a <record> can be returned (if the repository keeps track of deleted 
records), but no <metadata> block. The documentation really means 
the <metadata> block when it says 'record' here.
 
> However if an identifier doesn't exist then no record will be returned in 
> the GetRecord response.
> 
> So is it the case that records for items with a deleted status will always 
> be available (i.e. the metadata can still be harvested) but 
> after a period (determined by the archive) GetRecord may 
> subsequently return no metadata?

For as long as a repository keeps a record of 'deleted' items, it can
return a <record> which has a <header> block but no <metadata> block. If,
after some period, it forgets about the item (stops keeping 'deleted'
status) then no <record> block will be returned.
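
A harvester could distinguish these cases roughly as follows. This is a
minimal sketch, not a definitive client: classify_get_record and its
return labels are hypothetical, namespaces are ignored by comparing local
element names only, and, as an assumption, status="deleted" is accepted
on either the <record> or the <header> element because its placement may
differ between protocol versions.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix so the sketch ignores namespaces."""
    return tag.rsplit("}", 1)[-1]


def child(elem, name):
    """Return the first direct child with the given local name, or None."""
    for c in elem:
        if local_name(c.tag) == name:
            return c
    return None


def classify_get_record(xml_text):
    """Classify a GetRecord response body (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    record = next(
        (e for e in root.iter() if local_name(e.tag) == "record"), None
    )
    if record is None:
        # The repository returned no <record> block for this identifier.
        return "no-record"
    header = child(record, "header")
    # Assumption: look for the status attribute in both possible places.
    statuses = {record.get("status")}
    if header is not None:
        statuses.add(header.get("status"))
    if "deleted" in statuses:
        return "deleted"
    if child(record, "metadata") is not None:
        return "available"
    return "no-metadata"


deleted = (
    '<GetRecord><record status="deleted">'
    "<header><identifier>oai:example:1</identifier></header>"
    "</record></GetRecord>"
)
print(classify_get_record(deleted))         # deleted
print(classify_get_record("<GetRecord/>"))  # no-record
```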
 
(These concepts will be much better documented in v2.0 but the ideas 
are not significantly changed.)

Cheers,
Simeon.

> Thanks in advance,
> 
> L.
> 
>