[OAI-implementers] returning *data* (as opposed to metadata)

Michael L. Nelson mln@ils.unc.edu
Wed, 25 Jul 2001 11:30:37 -0400 (EDT)


On Wed, 25 Jul 2001, Ben Henley wrote:

> 
>  Hi there,
> 
>  What if I decided to have a format that would return full data rather than
> metadata? We have a database of scientific articles which are stored as XML
> (mean size 60KB) and rendered as HTML on our site. But what if I decided
> that one of the "metadata formats" returned by our OAI interface was the
> full XML of an article?

data, metadata -- who can tell the difference?  ;-)

seriously, I don't think this is technically a problem.  in addition to
having a separate "format", you might also consider:

1.  partitioning using sets -- "metadata only" and "md + d" for example.

2.  having two OAI interfaces -- one for metadata only, and the other for
full data (the URI in the former could actually be a GetRecord request
against the latter).  this approach is probably even better considering
the IP (copyright) "restrictions" you have.

or some hybrid of these 3 approaches (separate format, sets, separate
interfaces).
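
a rough sketch of what the requests might look like under each of the
three approaches.  the base URLs, the "full_xml" metadataPrefix, the
"md_plus_d" set name, and the article identifier are all made up for
illustration; only the verbs and argument names come from the protocol.

```python
from urllib.parse import urlencode

# Hypothetical values, chosen only for illustration.
METADATA_BASE = "http://example.org/oai-md"
FULLDATA_BASE = "http://example.org/oai-full"
ARTICLE_ID = "oai:example.org:article-1"


def oai_request(base_url, verb, **args):
    """Build an OAI request URL from a base URL, a verb, and its arguments."""
    query = [("verb", verb)] + sorted(args.items())
    return base_url + "?" + urlencode(query)


# Separate "format": full article XML exposed under its own metadataPrefix.
by_format = oai_request(
    METADATA_BASE, "GetRecord",
    identifier=ARTICLE_ID, metadataPrefix="full_xml",
)

# Approach 1: a set that partitions "md + d" records from metadata-only ones.
by_set = oai_request(
    METADATA_BASE, "ListRecords",
    metadataPrefix="oai_dc", set="md_plus_d",
)

# Approach 2: a second interface; this URL could be the URI carried in the
# metadata-only record served by the first interface.
by_second_interface = oai_request(
    FULLDATA_BASE, "GetRecord",
    identifier=ARTICLE_ID, metadataPrefix="full_xml",
)
```

in the two-interface case, by_second_interface is the kind of URL the
metadata-only record would point at.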

> 
>  Would a 60KB+ Record be likely to cause harvesters to choke? What other

set the number of records you return before issuing a resumptionToken to
be low (a couple of records).  since repositories are not required to use
resumptionTokens, harvesters ought to be able to handle big responses
anyway.  and if they can't handle 60KB, that's not entirely your problem.
on the other hand, a 6MB+ response might be rude.
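
a minimal sketch of the batching idea, assuming the repository keeps its
records in a list and uses a plain offset as the resumptionToken (a real
token can be any opaque string; list_records and harvest_all are
hypothetical names, not part of any OAI toolkit).  with 60KB records and
a batch size of 2, each response stays around 120KB.

```python
def list_records(records, batch_size=2, resumption_token=None):
    """Return one batch of records plus the token for the next batch.

    The token here is just the next start offset as a string (an
    assumption for this sketch). None means the list is complete.
    """
    start = int(resumption_token) if resumption_token else 0
    batch = records[start:start + batch_size]
    next_start = start + batch_size
    next_token = str(next_start) if next_start < len(records) else None
    return batch, next_token


def harvest_all(records, batch_size=2):
    """Harvester side: keep asking until no resumptionToken comes back."""
    harvested, responses, token = [], 0, None
    while True:
        batch, token = list_records(records, batch_size, token)
        harvested.extend(batch)
        responses += 1
        if token is None:
            return harvested, responses
```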

> implications (implementation or otherwise) would doing this have? Would
> anyone use it? (Bear in mind these articles are open access, but still
> copyrighted by the authors so you couldn't harvest them and use them
> commercially).  Is this a sane thing to consider?

I can imagine many SPs (service providers) would want to do this.  I also
expect many SPs to do this even with non-XML data (e.g., PDFs) by getting
the metadata records and then immediately retrieving the content they
point to.  For example, the ResearchIndex and Southampton folks who focus
on extracting citations from the actual content might do this.
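
as an illustration of that last step, here is a small sketch of how a
service provider might pull candidate content URLs out of a harvested
Dublin Core record.  the sample record is a simplified, made-up fragment
(not a complete OAI response), and content_urls is a hypothetical helper;
it assumes the link to the content is carried in a dc:identifier element.

```python
import xml.etree.ElementTree as ET

# Dublin Core element set namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Made-up, simplified record used only for illustration.
SAMPLE_RECORD = """<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example article</dc:title>
  <dc:identifier>oai:example.org:article-1</dc:identifier>
  <dc:identifier>http://example.org/articles/1.pdf</dc:identifier>
</metadata>"""


def content_urls(record_xml):
    """Return the dc:identifier values that look like HTTP(S) URLs."""
    root = ET.fromstring(record_xml)
    urls = []
    for element in root.iter("{%s}identifier" % DC_NS):
        text = (element.text or "").strip()
        if text.startswith(("http://", "https://")):
            urls.append(text)
    return urls
```

the harvester would then fetch each returned URL and run its own
extraction (citations, full text, etc.) on the result.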

regards,

Michael

> 
> 	Thanks,
> 
> 	Ben
> 
> Ben Henley <mailto:ben@biomedcentral.com>                    
> Usability Engineer
> BioMed Central    
> http://www.biomedcentral.com
> 
> 
> 

---
Michael L. Nelson			
207 Manning Hall, School of Information and Library Science
University of North Carolina 		mln@ils.unc.edu
Chapel Hill, NC 27599			http://ils.unc.edu/~mln/
+1 919 966 5042				+1 919 962 8071 (f)