[OAI-implementers] Selective Harvesting and Datestamps
simeon at cs.cornell.edu
Thu Jun 9 09:04:59 EDT 2005
On Thu, 9 Jun 2005, Martijn Faassen wrote:
> Hi there,
> I'm implementing an OAI server and am wondering about the details of
> selective harvesting in date ranges.
> The spec says:
> modification - the response *must* include records, corresponding to
> the metadataPrefix argument, which have changed within the bounds of
> the from and until arguments.
> Does this mean that records need to track their full history of modified
> dates? Just tracking their last modified datestamp does not seem enough
> to fully comply with this, as a record modified on 2005-04-10 and then
> again on 2005-06-05 would not show up in the range 2005-03-01 -
> 2005-05-01, as it would only be known it was modified 2005-06-05.
It was intended to mean only the most recent change to each record. Thus,
only the last modification date for each record needs to be recorded.
Looking at the spec, I see that it only implies that records have a single
datestamp associated with them:
spec> ... A repository must update the datestamp of a record if a change
spec> occurs, the result of which would be a change to the metadata part
spec> of the XML-encoding of the record. Such changes include, but
spec> are not limited to, changes to the metadata of the record, changes
spec> to the metadata format of the record, introduction of a new
spec> metadata format, termination of support for a metadata format, etc.
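On the server side, this comes down to keeping one datestamp per record and comparing it against the request bounds. The following is a minimal sketch, assuming a hypothetical in-memory list of records. The names `Record`, `touch` and `select_records` are illustrative only and come from neither the spec nor any library. Both bounds are treated as inclusive, as in OAI-PMH.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass
class Record:
    """Hypothetical record: one datestamp, no modification history."""
    identifier: str
    datestamp: date   # date of the most recent change only
    metadata: str     # current state of the metadata


def touch(record: Record, when: date, metadata: str) -> None:
    """Apply a change: overwrite the metadata and the single datestamp."""
    record.metadata = metadata
    record.datestamp = when


def select_records(records: Iterable[Record],
                   from_: Optional[date] = None,
                   until: Optional[date] = None) -> List[Record]:
    """Return records whose last-modified datestamp lies in [from_, until].

    A missing bound means the range is open on that side.
    """
    selected = []
    for record in records:
        if from_ is not None and record.datestamp < from_:
            continue
        if until is not None and record.datestamp > until:
            continue
        selected.append(record)
    return selected
```

With the example from the question, a record changed on 2005-04-10 and again on 2005-06-05 carries only the datestamp 2005-06-05. A request for 2005-03-01 to 2005-05-01 therefore does not return it, while any range that includes 2005-06-05 does.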
> On the other hand, there seems to be no requirement that a modified
> record gets exposed in its original, historical state; i.e. historical
> revisions of metadata do not need to be retained to comply. This means
> that in fact someone harvesting between 2005-03-01 - 2005-05-01 would
> see all records in the most recent state anyway, thus including the
> 2005-06-05 change.
Records are only ever exposed in their current state; a complete
modification history is not required.
> It therefore seems that for incremental date-based harvesting to work,
> full historical information about modification dates is not strictly
> necessary, as at the end of a full harvest throughout the full date
> range all the data *will* be correctly sent to the harvester.
> Tracking a full history of modification dates for all records seems like
> an onerous requirement. Is this really the intent? Does it really hurt
> if only last-modified dates are retained?
Correct, only the last-modified dates need to be maintained.
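To see why last-modified dates are enough for incremental harvesting, here is a rough sketch from the harvester's side. It assumes a hypothetical repository modelled as a dictionary from identifier to (datestamp, metadata). The helpers `update` and `list_records` and the identifier `oai:example:1` are made up for illustration and do not represent a real OAI-PMH client or server API.

```python
from datetime import date
from typing import Dict, Optional, Tuple

# Hypothetical repository: identifier -> (last-modified datestamp, metadata)
Repository = Dict[str, Tuple[date, str]]


def update(repo: Repository, identifier: str, when: date, metadata: str) -> None:
    """Store the new state; the previous datestamp is simply overwritten."""
    repo[identifier] = (when, metadata)


def list_records(repo: Repository,
                 from_: Optional[date] = None,
                 until: Optional[date] = None) -> Dict[str, str]:
    """Return current metadata for records whose datestamp is in [from_, until]."""
    return {
        identifier: metadata
        for identifier, (stamp, metadata) in repo.items()
        if (from_ is None or stamp >= from_)
        and (until is None or stamp <= until)
    }


repo: Repository = {}
harvested: Dict[str, str] = {}

# First change, followed by a harvest that covers it.
update(repo, "oai:example:1", date(2005, 4, 10), "v1")
harvested.update(list_records(repo, date(2005, 3, 1), date(2005, 5, 1)))

# Second change; the old datestamp is gone from the repository.
update(repo, "oai:example:1", date(2005, 6, 5), "v2")

# Re-requesting the old window now finds nothing for this record ...
stale_window = list_records(repo, date(2005, 3, 1), date(2005, 5, 1))

# ... but the next incremental harvest picks up the current state.
harvested.update(list_records(repo, date(2005, 5, 2), date(2005, 6, 9)))
```

In this sketch the harvester ends up holding the current version of the record, although the repository never stored the 2005-04-10 datestamp after the second change.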
> Am I misreading the spec?