[OAI-implementers] harvester guidelines

Jasper Op de Coul opdecoul at ubib.eur.nl
Thu May 26 11:26:49 EDT 2011

Hi Samuele,

On 05/26/2011 02:05 PM, Samuele Kaplun wrote:
> Hi Jasper,
> On Thu, 26/05/2011 at 12:43 +0200, Jasper Op de Coul wrote:
>> 3. Use incremental harvests, but never use the ?set param. The client
>> will receive all records and can inspect the SetSpec header manually to
>> see if this record is part of the wanted set. Records that are not part
>> of the wanted set but are in the client database can be removed.
> this sounds like a nice idea, but it would not fully address the case
> where, in the repository, the union of all sets is just a subset of the
> whole record universe. If a record leaves a set and doesn't enter
> any other set, then it will not be deleted, but it also won't be
> exported when the set param is not specified. So
> unfortunately even with your solution this situation would not be
> solved :-(

I'm not sure I follow you correctly. Do you mean that records
without any setSpec never show up in the feed? I don't think this is
the case. Or do you mean that if only the setSpec changes but not the
metadata, the datestamp might not be updated?

> Moreover, by harvesting without specifying a set, you are putting
> (theoretically) more load, not only on the client, but also on the
> server, since you are asking for far more information, most of which
> is going to be thrown away afterwards.

Yes, but you can keep doing incremental harvests instead of throwing
everything away and doing a full reharvest every month. So it is not
that clear which scenario consumes more bandwidth.
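
To make option 3 a bit more concrete, here is a rough Python sketch of
what I have in mind. It is only an illustration under some assumptions:
the function names (harvest_url, apply_harvest_page), the dict used as
client database and the example.org identifiers are all made up, and I
treat a setSpec like "physics:hep" as belonging to "physics" because of
the colon hierarchy. It ignores resumption tokens and error responses.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def harvest_url(base_url, since):
    """Build an incremental ListRecords request, deliberately without 'set'."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc", "from": since}
    return base_url + "?" + urlencode(params)


def apply_harvest_page(xml_text, wanted_set, local_db):
    """Apply one ListRecords page to local_db (identifier -> datestamp).

    Records in the wanted set are stored or updated; records that are
    deleted, or no longer in the wanted set, are dropped from local_db.
    """
    root = ET.fromstring(xml_text)
    for header in root.iterfind(".//oai:record/oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        datestamp = header.findtext("oai:datestamp", namespaces=OAI_NS)
        set_specs = [(el.text or "").strip()
                     for el in header.findall("oai:setSpec", OAI_NS)]
        # Assumption: a sub-set such as "physics:hep" counts as "physics".
        in_set = any(s == wanted_set or s.startswith(wanted_set + ":")
                     for s in set_specs)
        deleted = header.get("status") == "deleted"
        if in_set and not deleted:
            local_db[identifier] = datestamp
        else:
            # Not (or no longer) wanted: remove it if we hold a copy.
            local_db.pop(identifier, None)
    return local_db


SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header>
      <identifier>oai:example.org:1</identifier>
      <datestamp>2011-05-20</datestamp>
      <setSpec>physics</setSpec>
    </header></record>
    <record><header>
      <identifier>oai:example.org:2</identifier>
      <datestamp>2011-05-21</datestamp>
      <setSpec>math</setSpec>
    </header></record>
    <record><header status="deleted">
      <identifier>oai:example.org:3</identifier>
      <datestamp>2011-05-22</datestamp>
      <setSpec>physics</setSpec>
    </header></record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # Records 2 and 3 were harvested earlier; 2 has since moved to "math".
    db = {"oai:example.org:2": "2011-01-01", "oai:example.org:3": "2011-01-01"}
    print(harvest_url("http://example.org/oai", "2011-05-01"))
    print(apply_harvest_page(SAMPLE, "physics", db))
```

Note that this only works if the repository updates the datestamp when
a record's set membership changes; otherwise the record never shows up
in the incremental feed and the stale copy stays in the client database.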

> I cannot wait to see the outcome of the "Next Generation OAI-PMH"
> Technical Session at OAI7
> <http://indico.cern.ch/contributionDisplay.py?contribId=21&confId=103325>
> where I think this topic will be very well addressed.

Ah, that sounds very interesting indeed. I won't be attending OAI7 this
time since I opted for the EuroPython conference in Florence, which is
in the same week. I'll suggest the talk to my colleague who is going.

> Cheers!
> 	Samuele

Jasper Op de Coul -- Erasmus University Rotterdam
t +31 10 4082922  -- http://eur.nl/ub
Burgemeester Oudlaan 50 3062 PA Rotterdam -- The Netherlands

De informatie  verzonden in dit e-mail bericht  inclusief de bijlage(n) is
vertrouwelijk  en is  uitsluitend  bestemd  voor de geadresseerde  van dit
bericht. Lees verder: http://www.eur.nl/email-disclaimer

The information in this e-mail message  is confidential and may be legally
privileged. Read more: http://www.eur.nl/english/email-disclaimer

More information about the OAI-implementers mailing list