[OAI-implementers] Harvesters

Young,Jeff jyoung@oclc.org
Tue, 3 Apr 2001 12:39:22 -0400


Lesli,

I wrote an OAI harvester in Java. An overview is available at
http://alcme.oclc.org:4342/OAIHarvester.html. I'll package it into a jar
file and make it available from that page today or tomorrow. There's not
much documentation on it yet, but I'd be happy to help you get it set up for
your purposes.


Jeffrey A. Young
Senior Consulting Systems Analyst
Office of Research, Mail Code 710
OCLC Online Computer Library Center, Inc.
6565 Frantz Road
Dublin, OH   43017-3395
www.oclc.org

Voice:	614-764-4342
Fax:		614-764-2344
Email:	jyoung@oclc.org




> -----Original Message-----
> From: Lesli [mailto:lesli@aztec.lib.utk.edu]
> Sent: Monday, April 02, 2001 1:37 PM
> To: Tim Brody
> Cc: oai-implementers@oaisrv.nsdl.cornell.edu
> Subject: Re: [OAI-implementers] Harvesters
> 
> 
> Are there any other OAI harvesters out there besides the one that
> Hussein has (or is harvester not the correct term for what Hussein
> has)?  We were wondering how many there might be and how to check
> them out.
> 
> Thanks.
> Lesli Zimmerman
> Sr. Metadata Specialist
> University of Tennessee
> 
> 
> Tim Brody wrote:
> 
> > On Mon, 2 Apr 2001, herbert van de sompel wrote:
> >
> > > > From a harvester's point of view (I don't believe there are
> > > > many of us :-), I would prefer to have "oai_dc" because that
> > > > tells me explicitly what data I can expect to find, rather
> > > > than having to remember what I requested (as far as I can
> > > > tell it is the one part that makes an isolated OAI response
> > > > stateful, a real pain if one is using caching or other
> > > > systems).
> > > >
> > >
> > > The metadataPrefix is only significant within the realm of a
> > > certain repository.  The only exception is the metadataPrefix
> > > oai_dc, which -- by convention -- refers to metadata expressed
> > > in unqualified Dublin Core in all repositories.
> >
> > > * what REALLY tells you which metadata you receive is the
> > > namespace: xmlns="http://purl.org/dc/elements/1.1/".  that is
> > > a global identifier of the format.
> >
> > Agreed, I guess when using OAI responses I need to delve into the
> > XML schema identification to be correct ...
> >
> > > * in addition to that, you do not have to "keep" the format you
> > > asked for: all responses are self-contained, meaning you can
> > > tell from the original protocol request -- which is the content
> > > of the requestURL element -- what you asked for.  this has been
> > > a deliberate choice, since we are indeed talking about robots
> > > harvesting metadata, and some software processing the harvested
> > > metadata at a later stage.
> >
> > I have thought of this, but the protocol specifically discards this
> > information when using resumptionTokens (I, personally, don't like
> > the idea of exclusive variables vs the repository doing something
> > intelligent when asked something stupid - introduces state
> > information that isn't naive to implement on either side).
> >
> > Of course, the namespace makes this immaterial.
> >
> > Thanks for the quick response,
> > Tim.
> >
> > _______________________________________________
> > OAI-implementers mailing list
> > OAI-implementers@oaisrv.nsdl.cornell.edu
> > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers
> 
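
For anyone following the namespace discussion quoted above, here is a
rough sketch (not taken from any harvester mentioned in this thread) of
how a harvester might identify the metadata format from the namespace
of the element inside <metadata>, and separately recover the
metadataPrefix from the requestURL element when it is present. The
sample response, the example.org URL, and the function names are all
made up for illustration, and the response is heavily simplified rather
than a complete, schema-valid OAI response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, parse_qs

# Hypothetical, heavily simplified response; real OAI responses carry
# more elements and attributes than shown here.
SAMPLE_RESPONSE = """<GetRecord>
  <requestURL>http://example.org/oai?verb=GetRecord&amp;identifier=oai:example:1&amp;metadataPrefix=oai_dc</requestURL>
  <record>
    <metadata>
      <dc xmlns="http://purl.org/dc/elements/1.1/">
        <title>Sample title</title>
      </dc>
    </metadata>
  </record>
</GetRecord>
"""


def local_name(tag):
    """Strip ElementTree's '{namespace}' prefix from a tag, if any."""
    return tag.rsplit("}", 1)[-1]


def namespace_of(tag):
    """Return the namespace URI of a tag, or '' if it has none."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def metadata_namespace(root):
    """Namespace of the first child of the first non-empty <metadata>."""
    for elem in root.iter():
        if local_name(elem.tag) == "metadata" and len(elem) > 0:
            return namespace_of(elem[0].tag)
    return None


def requested_prefix(root):
    """metadataPrefix argument found in <requestURL>, or None."""
    for elem in root.iter():
        if local_name(elem.tag) == "requestURL" and elem.text:
            query = parse_qs(urlsplit(elem.text.strip()).query)
            values = query.get("metadataPrefix")
            return values[0] if values else None
    return None


root = ET.fromstring(SAMPLE_RESPONSE)
print(metadata_namespace(root))  # http://purl.org/dc/elements/1.1/
print(requested_prefix(root))    # oai_dc
```

When the request was made with a resumptionToken, the requestURL would
not carry a metadataPrefix argument, so requested_prefix() returns None
in that case, while the namespace lookup still works. That seems to be
the point of "the namespace makes this immaterial".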