[OAI-general] Dealing with different type of data provider
Fri, 13 Jul 2001 18:08:13 -0400 (EDT)
It's interesting that we are doing similar work in the Kepler project
(http://www.dlib.org/dlib/april01/maly/04maly.html). Kepler's approach
is as follows:
1. We maintain a centralized registration server (LDAP).
2. When a user installs the unstable data provider (we call it an
archivelet), the user registers a unique name in LDAP.
3. Whenever the user starts or stops it, the archivelet notifies the LDAP server.
4. The service provider periodically queries the LDAP server, gets the
list of all active archivelets, and harvests their metadata. The
service provider also caches the full-text documents.
5. The end user can search the service provider and get the cached full-text
document, or get the original document from the archivelet if it is active.
6. The service provider is also a data provider and can register
itself on the OAI website (the archivelets do not). It can expose the
harvested metadata. However, it assigns new OAI identifiers to the
harvested data, so no naming conflict arises in OAI naming
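The steps above can be sketched roughly as follows. This is only an illustration, not Kepler's actual code: the Registry class is an in-memory stand-in for the LDAP server, and the class names, method names, and the "oai:<repository>:<archivelet>/<local id>" identifier layout are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Archivelet:
    """Hypothetical stand-in for an unstable data provider."""
    name: str
    records: dict  # local id -> metadata dict


class Registry:
    """In-memory stand-in for the central LDAP registration server."""

    def __init__(self):
        self._running = {}  # registered name -> True while the archivelet is up

    def register(self, name):
        # Step 2: each archivelet registers a unique name at install time.
        if name in self._running:
            raise ValueError(f"name already registered: {name}")
        self._running[name] = False

    def notify(self, name, running):
        # Step 3: the archivelet reports every start and stop.
        if name not in self._running:
            raise KeyError(name)
        self._running[name] = running

    def active_names(self):
        return sorted(n for n, up in self._running.items() if up)


class ServiceProvider:
    def __init__(self, repo_id, registry):
        self.repo_id = repo_id
        self.registry = registry
        self.cache = {}  # new identifier -> (source archivelet, metadata copy)

    def harvest(self, archivelets):
        # Step 4: harvest only the archivelets that are currently active.
        for name in self.registry.active_names():
            for local_id, metadata in archivelets[name].records.items():
                # Step 6: assign a new identifier under our own repository
                # name (assumed layout, for illustration only).
                new_id = f"oai:{self.repo_id}:{name}/{local_id}"
                self.cache[new_id] = (name, dict(metadata))
        return len(self.cache)
```

Because the cache keeps its own copy of each record, a search (step 5) can still be answered after the archivelet has gone offline.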
We are also interested in the "notify/subscribe" model you mentioned.
The OAI protocol is basically based on a "pull" model; your solution is a
mixture of the "push" and "pull" models, and there is probably also a
pure "push" model, in which the data provider sends its updates to the
service provider directly. It will be very interesting to study the
trade-offs of these approaches and their influence on the OAI protocol.
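To make the three models concrete, here is a minimal sketch. It is not part of the OAI protocol or of either project: Provider, Harvester, and every method name are hypothetical, and a simple version counter stands in for the datestamps a real harvester would use.

```python
class Provider:
    """Hypothetical data provider that supports all three update models."""

    def __init__(self):
        self.records = {}       # identifier -> (version, metadata)
        self.version = 0        # bumped on every change
        self.push_targets = []  # harvesters that receive full records (push)
        self.subscribers = []   # callables told only that something changed

    def put(self, identifier, metadata):
        self.version += 1
        self.records[identifier] = (self.version, metadata)
        for target in self.push_targets:
            target.receive(identifier, metadata)  # pure push: send the record
        for notify in self.subscribers:
            notify(self)                          # mixed: notify, then they pull

    def list_since(self, version):
        # Pull interface: everything changed after the given version.
        return {i: m for i, (v, m) in self.records.items() if v > version}


class Harvester:
    """Hypothetical service provider."""

    def __init__(self):
        self.index = {}
        self.seen = {}  # id(provider) -> last version already harvested

    def pull(self, provider):
        # Pure pull: ask the provider for whatever changed since last time.
        last = self.seen.get(id(provider), 0)
        changed = provider.list_since(last)
        self.index.update(changed)
        self.seen[id(provider)] = provider.version
        return len(changed)

    def receive(self, identifier, metadata):
        # Pure push: accept a record sent directly by the provider.
        self.index[identifier] = metadata
```

In the mixed model a harvester subscribes with provider.subscribers.append(harvester.pull), so the notification only triggers an ordinary pull. One trade-off this makes visible: pure push requires the data provider to know and reach every service provider, while pull and the mixed model keep the record transfer under the service provider's control.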
On Fri, 13 Jul 2001, Antonucci, Robert wrote:
> Sorry for the cross posting. I have two specific questions, one for
> the OAI General and one for the OAI Implementors.
> We are creating a Digital Library of Science Education data, however we are
> following a different approach than many digital libraries. We are
> targeting amateur astronomers and grade schools and places that will have a
> small number (10 to 40) of data items. They will run an out of the box data
> provider package. We will use OAI to harvest their metadata and provide a
> web page to search all these smaller sites. This was intended to have a
> Napster-like architecture.
> The problem is that being smaller, more informal sites they probably will
> not be 24x7 systems. In fact, we expect them to fail unexpectedly and go
> offline frequently for potentially long periods.
> The question for the OAI general is, should less reliable sites like the
> ones we are targeting still register with the OAI data provider list? Or is
> there some implied quality of service with respect to uptime? We need a
> mechanism for giving each data provider a unique ID and the OAI Repository
> Identifier would be perfect, but requires joining the list.
> The question for the OAI Implementors is, we are creating a set of tools
> specifically to deal with the unreliability of the sites. We are using
> Jini to set up a dynamic registration and notification system. This
> way our service provider will get a list of all data providers THAT ARE
> CURRENTLY UP. Also, the service provider will be notified if a data
> provider comes up or goes down. It will also be notified that a data
> provider has changed its metadata and a harvest request should be sent. Is
> this toolkit something that would be of interest to other people/projects?
> Robert Antonucci
> JOINed Digital Library
> NASA GSFC