[OAI-implementers] OAI identifier resolver

Xiaoming Liu liu_x@cs.odu.edu
Mon, 20 Oct 2003 21:40:56 -0400 (EDT)

On Mon, 20 Oct 2003, Lonnie D. Harvel wrote:

> I am in favor of just the URL:[collection name] approach.  Why make it
> more complicated than necessary? URLs are unique. Is there a particular
> reason why it needs to be shorter?

This goes back to the question of why we need a resolver at all. If both
the baseURL and the record identifier are supplied, it doesn't make much
sense to develop a resolver. I think the motivation is to provide a "cool"
URL for each record and to make it easy to exchange information following
the REST model.

OAI has no centralized mechanism for maintaining unique repository names.
Uniqueness is either handled by a centralized registry, like the UIUC
registry, or in a distributed way, such as hashing the baseURL or some
better method. With the distributed approach, I can add a link to the
Purl-OAI resolver without prior knowledge of how repository names are
maintained in the Purl-OAI resolver. That is my reason for favoring the
distributed method.
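
As a rough sketch of the distributed approach: anyone who knows a
baseURL can derive the same short key locally and build a resolver link
from it, with no registry lookup. The resolver root
(http://resolver.example.org/), the 8-character truncation, and the
function names below are assumptions made for illustration only; they
are not how Purl-OAI actually works.

```python
import hashlib
from urllib.parse import quote


def repository_key(base_url: str, length: int = 8) -> str:
    """Derive a short, repeatable key from an OAI baseURL.

    Hypothetical scheme: MD5 of the baseURL, truncated to `length` hex
    characters. Truncation shortens the key but makes collisions more likely.
    """
    digest = hashlib.md5(base_url.strip().encode("utf-8")).hexdigest()
    return digest[:length]


def resolver_link(resolver_root: str, base_url: str, record_id: str) -> str:
    """Build a link of the form <root>/<repository key>/<escaped record id>."""
    key = repository_key(base_url)
    # Escape ':' and '/' so the record identifier stays a single path segment.
    return f"{resolver_root.rstrip('/')}/{key}/{quote(record_id, safe='')}"


if __name__ == "__main__":
    # Both values below are made-up examples.
    print(resolver_link(
        "http://resolver.example.org/",
        "http://repository.example.edu/oai",
        "oai:repository.example.edu:1234",
    ))
```

The key is repeatable but, as with any hash, not guaranteed to be
unique.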


> Adam Farquhar wrote:
> > Xiaoming,
> >
> > Selecting an approach that will be certain to fail, but unpredictably,
> > is not a good 'engineering' approach, especially when there are other
> > approaches that do not fail.  For example, taking a base64 encoding of
> > the base URL or just using the base URL itself will both provide a
> > unique identifier.
> >
> > Adam.
> >
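
For comparison, the base64 suggestion above could look roughly like the
sketch below. It uses the URL-safe base64 alphabet so the token can sit
in a URL path; the function names are made up for illustration. The
token is unique and reversible because it is just a re-encoding of the
baseURL, but it is about a third longer than the baseURL and is not
readable.

```python
import base64


def encode_base_url(base_url: str) -> str:
    """Encode a baseURL as a URL-safe base64 token without '=' padding."""
    raw = base_url.encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def decode_base_url(token: str) -> str:
    """Recover the original baseURL from a token made by encode_base_url."""
    padding = "=" * (-len(token) % 4)  # restore the stripped padding
    return base64.urlsafe_b64decode(token + padding).decode("utf-8")


if __name__ == "__main__":
    url = "http://repository.example.edu/oai"  # made-up example baseURL
    token = encode_base_url(url)
    print(token, len(url), len(token))
    assert decode_base_url(token) == url
```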
> >>>Hash algorithms such as MD5 or CRC32 cannot be used to generate unique
> >>>identifiers.  These algorithms will occasionally produce the same output for
> >>>different input strings (this is why hash tables require a mechanism for dealing
> >>>with collisions).  Common approaches to generating unique identifiers use some
> >>>sort of a registration mechanism to appropriately partition the space of possible
> >>>values.  Successful ones will leverage an existing registration mechanism, such
> >>>as DNS.
> >>>
> >>>
> >>
> >>I agree that a hash algorithm is not a "perfect" way to generate a unique
> >>identifier for a repository, but it may be acceptable from an engineering
> >>perspective: the probability of a collision will be pretty low at the
> >>current scale of OAI data providers (<500?).
> >>
> >>I think the basic problem is how to render an OAI baseURL as a shorter,
> >>readable string in a collision-free way. The algorithm should be repeatable:
> >>anyone can use the same algorithm to generate the same output given a baseURL.
> >>I would be glad to see other approaches.
> >>
> >>
> >>
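
To put a rough number on the "pretty low" collision chance for about 500
data providers, the usual birthday approximation can be used. This is
only a sketch: it assumes hash outputs are uniformly distributed, which
CRC32 in particular is not designed to guarantee, and the function name
is made up for illustration.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Approximate chance that at least two of n items share a hash value.

    Birthday approximation: p ~= 1 - exp(-n * (n - 1) / (2 * 2**bits)),
    assuming outputs are uniform over 2**bits possible values.
    """
    exponent = n * (n - 1) / (2.0 * 2.0 ** bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for name, bits in (("32-bit (CRC32-sized)", 32), ("128-bit (MD5-sized)", 128)):
        print(f"{name}: {collision_probability(500, bits):.3g}")
```

Under these assumptions a 32-bit hash gives a chance on the order of
3e-5 for 500 repositories, and a full 128-bit hash gives a vanishingly
small one. Neither is zero, which is the point of the objection quoted
above.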
> > _______________________________________________ OAI-implementers
> > mailing list List information, archives, preferences and to
> > unsubscribe:
> > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers