[OAI-implementers] virtual data provider

Steven Bird <sb@ldc.upenn.edu>
Wed, 27 Jun 2001 22:39:38 EDT


Many potential data providers in the Open Language Archives Community
(OLAC) have just a handful of records, so it is not worth their while to set
up a full-fledged data provider of their own.  We decided not to use Kepler
since we have our own metadata set and since we wanted platform independence.

Recently, Eva Banik, a programmer at the LDC implementing OLAC
infrastructure, created a (prototype) virtual data provider which
"harvests" XML files from a URL and presents them through a regular OAI
data provider interface.

There is one XML file per OLAC record, plus two extra files:
- "identify", the response to the identify request
- "identifiers", the list of filenames
(Another way would have been to pack everything into a single file.)
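
As a rough illustration of the file layout above, here is a minimal Python
sketch of how a harvester might turn the "identifiers" file into a list of
record URLs.  It is not the actual vdp.php4 code.  It assumes the
"identifiers" file holds one filename per line, which is not stated above,
and the filenames record1.xml and record2.xml are made up for the example.

```python
from urllib.parse import urljoin


def parse_identifiers(text):
    """Return the filenames listed in an "identifiers" file.

    Assumption: one filename per line; blank lines are ignored.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def record_urls(base_url, identifiers_text):
    """Map each listed filename to the URL it would be fetched from."""
    # urljoin replaces the last path segment unless the base ends in "/".
    if not base_url.endswith("/"):
        base_url += "/"
    return [urljoin(base_url, name) for name in parse_identifiers(identifiers_text)]


# Hypothetical listing; the real file contents may look different.
listing = "record1.xml\n\nrecord2.xml\n"
urls = record_urls("http://www.ldc.upenn.edu/OLAC/dp/data", listing)
for url in urls:
    print(url)
```

Fetching each URL and wrapping the records in OAI responses would sit on top
of this; that part is left out here.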

An example of the set of files is at:
  http://www.ldc.upenn.edu/OLAC/dp/data/

The virtual data provider is at:
  http://wave.ldc.upenn.edu/OLAC/dp/vdp.php4

When supplied with some extra pathinfo it will behave like a regular data
provider:
  http://wave.ldc.upenn.edu/OLAC/dp/vdp.php4/wave.ldc.upenn.edu/OLAC/dp/data/
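
To make the URL scheme above concrete, here is a small Python sketch of how
the extra path info could be mapped back to the location of the XML files,
and how an OAI request is then formed by adding a verb argument.  This is an
assumption about how the mapping might work, not a description of what
vdp.php4 actually does; in particular, prepending "http://" to the path info
is a guess, and both function names are invented for the example.

```python
def source_url_from_path_info(path_info, scheme="http"):
    """Turn path info such as "/host/dir/" into the URL of the XML files.

    Assumption: the path info is a host plus path, with the scheme omitted.
    """
    host_and_path = path_info.lstrip("/")
    if not host_and_path:
        raise ValueError("no source location given in path info")
    return f"{scheme}://{host_and_path}"


def oai_request_url(provider_url, path_info, verb):
    """Build a request URL for one OAI verb against the virtual provider."""
    return f"{provider_url}{path_info}?verb={verb}"


provider = "http://wave.ldc.upenn.edu/OLAC/dp/vdp.php4"
path_info = "/wave.ldc.upenn.edu/OLAC/dp/data/"

print(source_url_from_path_info(path_info))
print(oai_request_url(provider, path_info, "Identify"))
```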

And you can test it out with the repository explorer at:

  http://rocky.dlib.vt.edu/~oai/cgi-bin/Explorer/oai1.0/testoai?archive=http://wave.ldc.upenn.edu/OLAC/dp/vdp.php4/wave.ldc.upenn.edu/OLAC/dp/data/

The PHP code can be accessed at:
  http://www.ldc.upenn.edu/OLAC/dp/vdp.php4

We plan to help users create small record sets either with a simple CGI
program (where they fill in a form in the browser, submit it, and get back
the file they need to store), or with software that generates an editor
from a schema.

We'd welcome any feedback or advice.

Yours,
-Steven

P.S. For more information on OLAC, please see www.language-archives.org

--
Steven.Bird@ldc.upenn.edu  http://www.ldc.upenn.edu/sb
Assoc Director, LDC; Adj Assoc Prof, CIS & Linguistics
Linguistic Data Consortium, University of Pennsylvania
3615 Market St, Suite 200, Philadelphia, PA 19104-2608