[OAI-implementers] DP9- An OAI Gateway Service for Web Crawlers

Xiaoming Liu liu_x@cs.odu.edu
Wed, 21 Nov 2001 00:18:12 -0500 (EST)


Hi all,

(apologies for cross-posting)

DP9, a new OAI service provider for web crawlers, is now available. The idea
came from a discussion on this list: how can OAI archives be indexed by
Google?
 
DP9 is a gateway service that enables an Internet search engine to index an
OAI data provider. It lets a web crawler retrieve the records in an OAI
collection by executing OAI requests on the crawler's behalf and translating
the XML responses into HTML.
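
As a rough illustration of the translation step described above, the sketch
below turns the Dublin Core fields of a record into a small HTML page. DP9
itself is based on JSP and XSLT; this sketch uses Python's standard library
instead. The sample XML, the record_to_html name, and the HTML layout are all
assumptions made for illustration, not DP9's actual code or output.

```python
import html
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core elements carried in oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Made-up, trimmed sample response; not taken from a real archive.
SAMPLE_RESPONSE = """
<GetRecord>
  <record>
    <header><identifier>oai:example:1</identifier></header>
    <metadata>
      <dc xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>A Sample Title</dc:title>
        <dc:creator>Doe, J.</dc:creator>
        <dc:description>Uses &lt;b&gt; safely</dc:description>
      </dc>
    </metadata>
  </record>
</GetRecord>
"""


def record_to_html(xml_text):
    """Render every Dublin Core element in xml_text as an HTML definition list."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter():
        # ElementTree spells qualified names as "{namespace}localname".
        if elem.tag.startswith("{" + DC_NS + "}"):
            name = elem.tag.split("}", 1)[1]
            value = (elem.text or "").strip()
            # Escape both parts so metadata cannot inject markup into the page.
            rows.append("<dt>%s</dt><dd>%s</dd>"
                        % (html.escape(name), html.escape(value)))
    return "<html><body><dl>\n" + "\n".join(rows) + "\n</dl></body></html>"


if __name__ == "__main__":
    print(record_to_html(SAMPLE_RESPONSE))
```

A crawler that fetches such a page sees ordinary HTML, which is what allows
the record to be indexed.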
 
Below are the services that DP9 provides:
 
An entry page. If a web crawler finds the entry page and follows its links,
it will index all records in an OAI data provider.
 http://arc.cs.odu.edu:8080/dp9/index.jsp
 
A persistent, bookmarkable URL for each OAI record. For example:

 http://arc.cs.odu.edu:8080/dp9/getrecord.jsp?identifier=oai:arXiv:astro-ph/9501031&prefix=oai_dc
 
Parallel metadata sets. Only a limited number of formats are supported for
now, but support for a new metadata format can easily be added: just send us
your XSL file.

http://arc.cs.odu.edu:8080/dp9/getrecord.jsp?identifier=oai:VTETD:etd-3345131939761081&prefix=oai_rfc1807
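
The identifier and prefix parameters in the URLs above correspond to the
identifier and metadataPrefix arguments of an OAI GetRecord request. A minimal
Python sketch of that mapping follows. The base URL is a hypothetical
placeholder, since this post does not describe how DP9 finds the data provider
for a given identifier, and the stylesheet file names are invented to show how
one XSL file per metadata format might be selected.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical mapping from metadata prefix to XSL file; file names are invented.
STYLESHEETS = {
    "oai_dc": "oai_dc.xsl",
    "oai_rfc1807": "oai_rfc1807.xsl",
}


def oai_request_for(dp9_url, base_url):
    """Build the GetRecord request URL that a DP9-style record URL stands for.

    base_url is the data provider's OAI base URL, supplied by the caller.
    """
    query = parse_qs(urlsplit(dp9_url).query)
    params = {
        "verb": "GetRecord",
        "identifier": query["identifier"][0],
        "metadataPrefix": query["prefix"][0],
    }
    return base_url + "?" + urlencode(params)


def stylesheet_for(prefix):
    """Return the XSL file for a metadata prefix, or fail if it is unsupported."""
    try:
        return STYLESHEETS[prefix]
    except KeyError:
        raise ValueError("unsupported metadata format: %s" % prefix) from None


if __name__ == "__main__":
    url = ("http://arc.cs.odu.edu:8080/dp9/getrecord.jsp"
           "?identifier=oai:arXiv:astro-ph/9501031&prefix=oai_dc")
    # "http://archive.example.org/oai" is a placeholder, not a real provider.
    print(oai_request_for(url, "http://archive.example.org/oai"))
    print(stylesheet_for("oai_dc"))
```

Under this scheme, supporting another metadata format would amount to adding
one entry to the mapping together with its XSL file.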
 
The DP9 code is available from 
   http://arc.cs.odu.edu:8080/dp9/install.jsp
It is based on JSP and XSLT. If you install it on your own server, it will
make your OAI-compliant archive crawlable by web crawlers under your own URL.
 
DP9 is a gateway service: it does not cache OAI records and simply forwards
each request to the corresponding OAI data provider, so its quality of
service depends heavily on the availability of the data providers' servers.
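
The consequence of forwarding without a cache can be sketched in Python as
follows. This is an illustration under assumptions, not DP9's code: the
forward name, the injectable fetch parameter, the timeout, and the choice to
answer with a 502 status when the data provider cannot be reached are all
choices of this sketch.

```python
import urllib.request


def forward(oai_url, fetch=None, timeout=10):
    """Forward one request to a data provider and return (status, body).

    Nothing is cached, so every call reaches the provider again, and an
    unreachable provider turns directly into an error for the caller.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return response.read()
    try:
        return 200, fetch(oai_url)
    except OSError as exc:
        # urllib's URLError and socket timeouts are both OSError subclasses.
        message = "Data provider unavailable: %s" % exc
        return 502, message.encode("utf-8")


if __name__ == "__main__":
    def unreachable(url):
        raise OSError("connection refused")

    # Simulates a provider that is down, without touching the network.
    print(forward("http://archive.example.org/oai?verb=Identify",
                  fetch=unreachable))
```

Passing fetch in as a parameter keeps the sketch testable without network
access; a real gateway would call the provider directly.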
 
DP9 currently uses the list of data providers from the OAI website:
 http://www.openarchives.org/Register/ListFriends.pl

We'd welcome any feedback or advice.


Xiaoming Liu
DL Research Group
Old Dominion Univ