[OAI-implementers] How to verify a download worked?

Hussein Suleman hussein@vt.edu
Mon, 04 Mar 2002 10:52:17 -0500


Also, I should mention that we expect OAI-PMH v2.0 to include a "full
list size" field of sorts that will tell you how many records there
are in the full set.
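
As a rough sketch of how a harvester might use such a field to check its
own count: the Python below assumes the size arrives as a
completeListSize attribute on the resumptionToken element (the name used
in the v2.0 specification as published, where it is optional and may be
only an estimate) and that responses use the v2.0 XML namespace. The
function names and the sample response are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH v2.0 responses (assumed; v1.x responses differ).
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Hypothetical, heavily trimmed ListRecords response for illustration.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example:1</identifier></header></record>
    <record><header><identifier>oai:example:2</identifier></header></record>
    <resumptionToken completeListSize="190000" cursor="0">tok1</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def count_page(xml_text, harvested_so_far):
    """Add this page's records to the running total.

    Returns (new_total, advertised_size), where advertised_size is None
    if the repository did not report a completeListSize.
    """
    root = ET.fromstring(xml_text)
    records = root.findall(f"{OAI_NS}ListRecords/{OAI_NS}record")
    token = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    size = None
    if token is not None and token.get("completeListSize") is not None:
        size = int(token.get("completeListSize"))
    return harvested_so_far + len(records), size


def looks_complete(harvested, advertised):
    """True/False if a size was advertised, None if there is nothing to compare."""
    if advertised is None:
        return None
    return harvested == advertised


total, advertised = count_page(SAMPLE, 0)
print(total, advertised, looks_complete(total, advertised))
```

Since the advertised size may be an estimate, a small mismatch is not
proof of a bad harvest, but a gap like 60,000 versus 190,000 would show
up immediately.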


Simeon Warner wrote:

> Alan,
> The reason you get just 60k records from arXiv is probably linked with the
> problem of specifying a date too early for my implementation to understand
> correctly (now fixed, someone else pointed it out last week too). I don't
> know about ways to verify successful harvesting but I would suggest that
> doing a harvest with no 'from' and 'until' parameters is more robust than
> picking an arbitrary 'from' date.
> Cheers,
> Simeon.
>
> On Mon, 4 Mar 2002, Alan Kent wrote:
>
>>Hi All,
>>I was wondering if anyone has a good scheme for verifying whether a
>>download of metadata 'worked'. For example, I crawled the arXiv site
>>and got around 60,000 records. However, it turns out the site actually
>>has 190,000 or so records, so I only got about 1/3 of the site!
>>Has anyone used any clever tricks to verify how well a crawl worked?
>>I now have to work out if my crawler has been keeping only one in
>>three records! :-(
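
For what it's worth, Simeon's suggestion of harvesting with no 'from'
and 'until' amounts to building requests along these lines. This is only
a sketch: the base URL is a placeholder, the helper name is made up, and
it assumes the repository supports the oai_dc metadata format and treats
resumptionToken as an exclusive argument.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL with no 'from'/'until' arguments.

    The first request names the metadata format; follow-up requests carry
    only the resumptionToken returned by the previous response.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


# Placeholder base URL, not a real repository.
first_url = list_records_url("http://example.org/oai")
next_url = list_records_url("http://example.org/oai", resumption_token="tok1")
print(first_url)
print(next_url)
```

Leaving out the date arguments means the harvester never has to guess
the repository's earliest datestamp, which is where the arXiv harvest
went wrong.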

hussein suleman - hussein@vt.edu - vtcs - http://www.husseinsspace.com