[OAI-implementers] Clarification on deleted records

Hussein Suleman hussein@vt.edu
Wed, 10 Apr 2002 14:47:46 -0400


if deleted records are tracked by the archive then there is no problem - 
both ListRecords and GetRecord can return a deleted status and service 
providers can act accordingly.

if deleted records are not tracked, it is quite conceivable that an 
archive may, for example, delete a row in the database when an item is 
deleted. thus, future ListRecords requests will not reflect that the 
item is deleted (since it simply no longer exists). in such a case, the 
service provider has to regularly issue GetRecord with the identifiers 
of previously harvested records in order to confirm that each of them 
still exists. alternatively, the service provider can issue 
GetRecord just before presenting the record through a user interface - 
that makes network utilization dependent on the needs of actual users. a 
third solution is to just throw everything away every now and then and 
reharvest from scratch.
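
as a rough illustration of the first two options, here is a minimal python sketch. `record_exists` is a hypothetical callable standing in for whatever code issues a GetRecord request and reports whether a record came back; it is not part of the protocol or of any particular library, and the identifiers are made up.

```python
from typing import Callable, Dict, Iterable, Optional, Set


def find_vanished(
    identifiers: Iterable[str],
    record_exists: Callable[[str], bool],
) -> Set[str]:
    """Option 1: periodically re-check every previously harvested identifier.

    Returns the identifiers for which the existence check failed.
    """
    return {ident for ident in identifiers if not record_exists(ident)}


def fetch_for_display(
    identifier: str,
    local_copies: Dict[str, str],
    record_exists: Callable[[str], bool],
) -> Optional[str]:
    """Option 2: check a single record just before showing it to a user.

    Drops the local copy and returns None if the archive no longer has it.
    """
    if not record_exists(identifier):
        local_copies.pop(identifier, None)
        return None
    return local_copies.get(identifier)


# Stand-in for a real archive: only these identifiers still exist.
# (Assumption for the example; a real check would issue GetRecord.)
still_in_archive = {"oai:example:1", "oai:example:3"}
harvested = {
    "oai:example:1": "metadata 1",
    "oai:example:2": "metadata 2",
    "oai:example:3": "metadata 3",
}

vanished = find_vanished(harvested, still_in_archive.__contains__)
shown = fetch_for_display("oai:example:2", harvested, still_in_archive.__contains__)
```

the first function costs one request per harvested record on every pass, while the second only spends a request when a user actually asks for a record.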

to answer your last question, if an archive tracks deleted records, 
GetRecord (or ListRecords) for a deleted item will return only the 
header and not the metadata (or about). any archive that tracks deleted 
items ought to store at least the identifiers and the dates when the 
items were deleted, or harvesting will not work properly.
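
to make the harvesting side concrete, here is a small python sketch of how a service provider might apply a batch of harvested records to its local store. the `HarvestedRecord` type and its field names are assumptions made for the example (a real harvester would fill them in from the parsed response); the only behaviour taken from the above is that a deleted record carries a header but no metadata.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class HarvestedRecord:
    """Hypothetical in-memory form of one harvested record."""
    identifier: str
    datestamp: str
    deleted: bool = False           # True when the archive reports status "deleted"
    metadata: Optional[str] = None  # absent for deleted records


def apply_harvest(store: Dict[str, str], records: Iterable[HarvestedRecord]) -> None:
    """Update the local store in place from one batch of harvested records."""
    for rec in records:
        if rec.deleted:
            # Header only: remove any local copy of the item.
            store.pop(rec.identifier, None)
        elif rec.metadata is not None:
            # New or changed record: keep the latest metadata.
            store[rec.identifier] = rec.metadata


store = {"oai:example:1": "old metadata"}
apply_harvest(
    store,
    [
        HarvestedRecord("oai:example:1", "2002-04-10", deleted=True),
        HarvestedRecord("oai:example:2", "2002-04-10", metadata="new metadata"),
    ],
)
```

this only works if the archive keeps reporting the deletion long enough for the harvester to see it, which is why the identifier and deletion date need to be stored on the archive side.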

hope this helps.


Leigh Dodds wrote:

> Hi,
> I'm not clear on the following wording in the specification, could 
> someone clarify this for me?
> "When returning a harvested record or identifier of a record, the  
> ListRecords and ListIdentifiers service requests may indicate a status 
> of "deleted".  This status means that an item has been deleted and 
> therefore no record can be disseminated from it.   The length of time 
> that a given repository keeps track of deleted items is not defined by 
> the protocol.  Therefore, the only guaranteed method in the protocol 
> for determining whether a record can be returned by a repository 
> (its corresponding item still exists) is through the GetRecord service 
> request."
> What I'm not sure about is why the only guaranteed method 
> for determining whether a record can be returned is GetRecord.
> The second sentence (reading "This status...") notes that if 
> the status is deleted, then no record can be returned. So we 
> already have a definitive answer don't we? 
> Digging further I see that the GetRecord schema notes that a 
> record *can* be returned with a deleted status -- which seems 
> contrary to the above? 
> However if an identifier doesn't exist then no record will be returned in 
> the GetRecord response.
> So is it the case that records for items with a deleted status will always 
> be available (i.e. the metadata can still be harvested) but 
> after a period (determined by the archive) GetRecord may 
> subsequently return no metadata?
> Thanks in advance,
> L.

hussein suleman - hussein@vt.edu - vtcs - http://www.husseinsspace.com