[OAI-implementers] Friends and Neighbours scheme ... locating OAI repositories

Xiaoming Liu liu_x@cs.odu.edu
Sat, 2 Mar 2002 21:23:23 -0500 (EST)


Several replies combined in one message.

On Sat, 2 Mar 2002, Alan Kent wrote:

> Another mailing list I am on related to Z39.50 is talking about the
> concept of "Friends and Neighbours". It may be a bit early for
> OAI (not that many sites yet!), but the basic idea is that sites
> keep track of what other sites they know of. That way, there does

This looks a lot like the ideas behind peer-to-peer systems, especially
Gnutella, but I am not sure how well it fits here. In Gnutella/Fasttrack,
people want to get a music file and don't care much how precise the
result is, as long as they get something. Here we may want a definite
answer, such as: how many OAI data providers are there, and what content
do they share so far?

Another way is to expose the list of data providers through an OAI
interface, which is what we did in Arc:

http://arc.cs.odu.edu:8080/oaisp/servlet/OAI-SP?verb=ListRecords&from=&until=&set=&metadataPrefix=oai_dc

This might be an easier way to implement a friend/neighbor function. A
data provider might support two OAI interfaces: one for the naming
service, another for the real data.

On Sat, 2 Mar 2002, Steven Bird wrote:
> I favor a central registry to control the oai:... namespace.  

I agree that strong central control would be helpful: somebody who wants
to build a service provider really needs a point to start from. Of course,
that won't prevent any internal use.

On Sat, 2 Mar 2002, herbert van de sompel wrote:
> manner.    However, the OAI-Tech committee has decided not to pursue it,
> mainly because we could not figure out what the incentives would be for
> a repository to do the additional work of keeping a list of "befriended"
> repos.

One incentive is peer-to-peer use, as in Kepler ;-). In the Kepler
scenario each client is very unstable, so its contents must be cached
somewhere. That might be on the service provider side, as we do now, or
the contents might be distributed to a neighbor or to anybody who is more
stable or has a better Internet connection. So it would be helpful if such
a list of "befriended" repositories existed. That's the approach we are
exploring.

Another incentive is the mirroring problem, like oaia: a service provider
doesn't want to harvest the same records multiple times from different
data providers. So some way to identify duplicates is necessary, and I
believe that is also where a "befriended" verb would come into play.
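
To make the duplication point concrete, here is a minimal sketch of what
a service provider could do once it has harvested from several data
providers. It assumes that a mirrored record keeps the same identifier
on every provider, which is exactly the information a "befriended" list
would have to guarantee; the function name and the sample data are
hypothetical.

```python
def deduplicate(harvests):
    """Keep the first copy of each record and report the repeats.

    harvests maps a data provider name to the list of record
    identifiers harvested from it. Assumption: mirrors expose a record
    under the same identifier as the original.
    """
    first_seen = {}   # identifier -> provider that supplied it first
    duplicates = []   # (identifier, first provider, repeating provider)
    for provider, identifiers in harvests.items():
        for identifier in identifiers:
            if identifier in first_seen:
                duplicates.append((identifier, first_seen[identifier], provider))
            else:
                first_seen[identifier] = provider
    return first_seen, duplicates


# Hypothetical harvest results from an original and its mirror.
EXAMPLE_HARVESTS = {
    "original": ["oai:example:1", "oai:example:2"],
    "mirror": ["oai:example:2", "oai:example:3"],
}

if __name__ == "__main__":
    kept, repeated = deduplicate(EXAMPLE_HARVESTS)
    print(kept)
    print(repeated)
```

Without a shared identifier the service provider would have to compare
metadata contents instead, which is much less reliable.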


Regards,
liu