FW: [OAI-implementers] Open Archives Initiative Protocol for Metadata Harvesting Version 2 news

Walter Underwood wunder@inktomi.com
Wed, 06 Feb 2002 09:09:34 -0800


--On Tuesday, February 5, 2002 12:27 PM -0500 "Young,Jeff" <jyoung@oclc.org> wrote:
>
> Of course, resumptionTokens don't guarantee that an arbitrary data provider
> will return a complete set of results. They merely provide a mechanism to
> make it possible. Without such a guarantee, harvesters are obliged to
> periodically reharvest the entire repository if they want to pick up those
> missed items.

They have to do that anyway, since deleted record responses are
not guaranteed. For complete garbage collection, they need to
check all items to see if they still exist.
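
A minimal sketch of that garbage-collection pass, assuming the harvester keeps a set of identifiers it has already indexed. The names sweep_missing, still_exists, repository, and local_index are hypothetical, and the in-memory set stands in for a real per-item existence check against the data provider:

```python
from typing import Callable, Iterable, Set


def sweep_missing(known_ids: Iterable[str],
                  still_exists: Callable[[str], bool]) -> Set[str]:
    """Return the indexed identifiers that the repository no longer serves."""
    return {identifier for identifier in known_ids
            if not still_exists(identifier)}


# Stand-in for the data provider (assumption): a real harvester would
# issue one existence check per identifier instead of a set lookup.
repository = {"oai:example:1", "oai:example:3"}
local_index = {"oai:example:1", "oai:example:2", "oai:example:3"}

stale = sweep_missing(local_index, lambda identifier: identifier in repository)
local_index -= stale  # drop items that have disappeared upstream
```

The cost is one check per indexed item, which is why this is a periodic sweep rather than something done on every harvest.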

The list interfaces are mostly needed for new items. We don't mind
if the list is inconsistent or unsynchronized, as long as it has
all the new stuff.

There is one thing that the list should never do: include something
before it is available via GetRecord. Our spider can check something
within a second, and we've run into problems with systems that
notify the spider, then publish the content. The spider checks,
gets a 404, and goes on to something else. Publish, then notify.
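
The ordering can be sketched as follows. This is only an illustration under stated assumptions: the Repository class, its add and get_record methods, and the notify callback are hypothetical names, and a KeyError plays the role of the 404.

```python
from typing import Callable, Dict, List


class Repository:
    """Toy data provider that notifies a listener about new records."""

    def __init__(self, notify: Callable[[str], None]) -> None:
        self._records: Dict[str, str] = {}
        self._notify = notify

    def get_record(self, identifier: str) -> str:
        # A KeyError here stands in for the 404 a spider would see.
        return self._records[identifier]

    def add(self, identifier: str, content: str) -> None:
        self._records[identifier] = content  # publish first
        self._notify(identifier)             # then notify


# The "spider" fetches immediately when notified, as a fast crawler would.
fetched: List[str] = []
repo = Repository(
    notify=lambda identifier: fetched.append(repo.get_record(identifier)))
repo.add("oai:example:1", "hello")
```

Swapping the two lines in add would make the immediate fetch fail, which is the failure described above.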

wunder
--
Walter R. Underwood
Senior Staff Engineer
Inktomi Enterprise Search
http://search.inktomi.com/