[OAI-implementers] Selective Harvesting and Datestamps

Simeon Warner simeon at cs.cornell.edu
Thu Jun 9 09:04:59 EDT 2005



On Thu, 9 Jun 2005, Martijn Faassen wrote:

> Hi there,
>
> I'm implementing an OAI server and am wondering about the details of
> selective harvesting in date ranges.
>
> The spec says:
>
>    modification - the response *must* include records, corresponding to
>    the metadataPrefix argument, which have changed within the bounds of
>    the from and until arguments.
>
> Does this mean that records need to track their full history of modified
> dates? Just tracking their last modified datestamp does not seem enough
> to fully comply with this, as a record modified on 2005-04-10 and then
> again on 2005-06-05 would not show up in the range 2005-03-01 -
> 2005-05-01, as it would only be known it was modified 2005-06-05.

It was intended to mean only the most recent change to each record. Thus,
only the last modification date for each record needs to be recorded.
Looking at the spec, I see it is only implied that records have a single
datestamp associated with them:

spec> ... A repository must update the datestamp of a record if a change
spec> occurs, the result of which would be a change to the metadata part
spec> of the XML-encoding of the record. Such changes include, but
spec> are not limited to, changes to the metadata of the record, changes
spec> to the metadata format of the record, introduction of a new
spec> metadata format, termination of support for a metadata format, etc.

> On the other hand, there seems to be no requirement that a modified
> record gets exposed in its original, historical state; i.e. historical
> revisions of metadata do not need to be retained to comply. This means
> that in fact someone harvesting between 2005-03-01 - 2005-05-01 would
> see all records in the most recent state anyway, thus including the
> 2005-06-05 change.

Records are only ever exposed in their current state; a complete
modification history is not required.
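
To make that concrete, here is a rough sketch of a repository that keeps
one datestamp per record and filters on it. It is an illustration under
assumptions, not anything taken from the spec: the Record and Repository
classes, the put and list_records names, and the "oai:example:1"
identifier are all made up. The from and until bounds are treated as
inclusive.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Record:
    identifier: str
    metadata: str    # current state only; old revisions are not kept
    datestamp: date  # last modification only; no history is kept


class Repository:
    """Hypothetical in-memory store, for illustration only."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def put(self, identifier: str, metadata: str, when: date) -> None:
        # A change overwrites both the metadata and the single datestamp.
        self._records[identifier] = Record(identifier, metadata, when)

    def list_records(self, from_: Optional[date] = None,
                     until: Optional[date] = None) -> List[Record]:
        # Selective harvesting: compare the one datestamp to the bounds.
        return [
            r for r in self._records.values()
            if (from_ is None or r.datestamp >= from_)
            and (until is None or r.datestamp <= until)
        ]


repo = Repository()
repo.put("oai:example:1", "v1", date(2005, 4, 10))
repo.put("oai:example:1", "v2", date(2005, 6, 5))

# The record no longer matches the earlier range ...
in_spring = repo.list_records(date(2005, 3, 1), date(2005, 5, 1))
# ... but it does match a range that covers its latest change.
in_summer = repo.list_records(date(2005, 5, 2), date(2005, 6, 30))
```

With the dates from your example, the record is absent from the
2005-03-01 to 2005-05-01 range and appears, in its current state, in any
range that includes 2005-06-05.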

> It therefore seems that for incremental date-based harvesting to work,
> full historical information about modification dates is not strictly
> necessary, as at the end of a full harvest throughout the full date
> range all the data *will* be correctly sent to the harvester.
>
> Tracking a full history of modification dates for all records seems like
> an onerous requirement. Is this really the intent? Does it really hurt
> if only last-modified dates are retained?

Correct, only the last-modified dates need to be maintained.
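
A small sketch of why this is enough for incremental harvesting. This is
only an illustration under assumptions: the store dict, the modify and
list_records helpers, and the identifier are hypothetical, and the
harvester is assumed to request consecutive, non-overlapping date
windows.

```python
from datetime import date

# Hypothetical repository state:
# identifier -> (current metadata, last-modified datestamp)
store = {}


def modify(identifier, metadata, when):
    # Only the latest state and the latest datestamp are retained.
    store[identifier] = (metadata, when)


def list_records(from_, until):
    # Inclusive datestamp filter, returning identifier -> current metadata.
    return {i: m for i, (m, d) in store.items() if from_ <= d <= until}


harvested = {}  # the harvester's local copy

modify("oai:example:1", "v1", date(2005, 4, 10))
# First harvest window picks up the record in its then-current state.
harvested.update(list_records(date(2005, 3, 1), date(2005, 5, 1)))

modify("oai:example:1", "v2", date(2005, 6, 5))
# The next window covers the new datestamp, so the update is picked up.
harvested.update(list_records(date(2005, 5, 2), date(2005, 7, 1)))
```

After the second window the harvester holds the current state of the
record, even though the repository never stored the 2005-04-10
datestamp once the record changed again.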

Cheers,
Simeon

> Am I misreading the spec?
>
> Regards,
>
> Martijn