[OAI-implementers] Do you have instruction for ad hoc harvesters?

Caroline Arms caar at loc.gov
Sun Nov 25 12:29:29 EST 2007


Thanks for your suggestion about hiding the resumption token complexity from a user via an HTML front end. Unfortunately, it's not that simple, for a variety of reasons, one of which is scale. One popular set has almost 200,000 records. Because a harvester has to page through such a set using the resumption token process, the records can be served without affecting the performance of the other applications that are on the same server and using the same data. We don't want to create a shortcut that causes problems for our regular users and harvesters just to help a few requesters through the steps that get them the records they want in the format they want. Those requesters are already looking for records to batch-load into a local database, and we assume they are savvy about data wrangling in their local environment.

    But thanks again for your suggestion.  

    Caroline Arms             caar at loc.gov

>>> Conal Tuohy <conal.tuohy at vuw.ac.nz> 11/22/07 9:19 PM >>>
On Thu, 2007-11-22 at 12:38 -0500, Caroline Arms wrote:
> At the Library of Congress we quite often get requests for the records for a collection of digitized historical materials from entities outside the library or digital repository community.    Typically, these are organizations that want to integrate a collection of photographs into an internal system for a particular project.  An example would be the production team for a TV documentary assembling an internal collection of records and images relevant to the topic to use as the basis for selection for use in the production and tracking of associated workflow.  When pointed at the OAI site, they are mystified.
> I am wondering whether we can create a quick how-to document tailored to this particular task that makes no assumptions about the technology at the other end.
> Does anyone have or know of a brief introduction aimed at someone who only needs to know enough about OAI-PMH to get the records for an entire set given its setSpec (having to deal with resumption tokens), but may need to be told some other things, such as:
>   *  they will have to understand enough about the semantics of the metadata formats available to select the right metadata prefix 
>   *  they will probably need XML tools to transform the records into something compatible with their local system
>   *   etc., etc.
> If you have something like this written that has been used successfully, and would be prepared to share it, we would love to see it.
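
For readers who want to see what "get the records for an entire set given its setSpec" involves, here is a minimal sketch in Python using only the standard library. It is an illustration, not a tested harvester: the function names, the injected `fetch` parameter, and the default `oai_dc` metadata prefix are assumptions made for this example, and the base URL and setSpec must come from the repository being harvested. It relies on the OAI-PMH rule that a follow-up request carries only the verb and the resumption token, and that an absent or empty resumptionToken element marks the last page.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 response elements.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def fetch_url(url):
    """Default fetcher (an assumption of this sketch): return the body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def harvest_set(base_url, set_spec, metadata_prefix="oai_dc", fetch=fetch_url):
    """Yield every <record> element of one set, following resumption tokens."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    }
    while True:
        url = base_url + "?" + urllib.parse.urlencode(params)
        root = ET.fromstring(fetch(url))

        # The repository reports problems (bad set, bad prefix, ...) as <error>.
        error = root.find(OAI_NS + "error")
        if error is not None:
            raise RuntimeError(f"OAI-PMH error {error.get('code')}: {error.text}")

        list_records = root.find(OAI_NS + "ListRecords")
        if list_records is None:
            return
        yield from list_records.findall(OAI_NS + "record")

        # An absent or empty resumptionToken means this was the last page.
        token = list_records.find(OAI_NS + "resumptionToken")
        if token is None or not (token.text or "").strip():
            return
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}
```

A caller would loop over `harvest_set(base_url, set_spec)` and hand each record to whatever XML tooling converts it for the local system. A real script would probably also pause between requests and retry on transient failures, which this sketch leaves out.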

This is perhaps a bit tangential, but might be worth considering...

I've seen some OAI repositories which serve up their content with
<?xml-stylesheet?> processing instructions referring to XSLT stylesheets
which convert the XML into nice HTML pages which provide a nice friendly
user interface. You don't know you're dealing with an OAI server at all!
These XML processing instructions are ignored by real OAI harvesters,
but they are respected by ordinary web browsers, and they could provide
users with a simple way to select metadata formats, navigate through
sets, follow resumption tokens, etc, etc.
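
As a rough illustration of the mechanism described above, the sketch below shows one way a repository could add such a processing instruction to a response before sending it. The function name and the stylesheet path passed to it are hypothetical and are not taken from any particular OAI server; the only fixed part is the standard `<?xml-stylesheet type="text/xsl" href="..."?>` form, which has to appear before the root element.

```python
import re
from xml.sax.saxutils import quoteattr


def add_stylesheet_pi(xml_text, href):
    """Return xml_text with an xml-stylesheet instruction before the root element.

    href is the location of an XSLT stylesheet (hypothetical in this sketch).
    """
    # quoteattr adds the surrounding quotes and escapes special characters.
    pi = f'<?xml-stylesheet type="text/xsl" href={quoteattr(href)}?>\n'

    # If the document starts with an XML declaration, the instruction must
    # come after it; otherwise it goes at the very beginning.
    match = re.match(r"\s*<\?xml\s[^>]*\?>\s*", xml_text)
    insert_at = match.end() if match else 0
    return xml_text[:insert_at] + pi + xml_text[insert_at:]
```

A browser that opens the resulting response applies the stylesheet and shows HTML, while a harvester parses the same bytes and skips the instruction.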

For instance, here's one here:



Conal Tuohy
New Zealand Electronic Text Centre
