[OAI-implementers] OAI identifier resolver
Mon, 20 Oct 2003 13:42:39 -0400 (EDT)
On Mon, 20 Oct 2003, Adam Farquhar wrote:
> Hash algorithms such as MD5 or CRC32 cannot be used to generate unique
> identifiers. These algorithms will occasionally produce the same output for
> different input strings (this is why hash tables require a mechanism for dealing
> with collisions). Common approaches to generating unique identifiers use some
> sort of a registration mechanism to appropriately partition the space of possible
> values. Successful ones will leverage an existing registration mechanism, such
> as DNS.
I agree that a hash algorithm is not a "perfect" way to generate a unique
identifier for a repository, but it may be acceptable from an engineering
perspective: the probability of a collision will be quite low at the current
scale of OAI data.
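
To put a rough number on "pretty low", the collision risk can be estimated with the standard birthday approximation. The sketch below is only illustrative: the function name is made up here, and the figure of 1000 repositories is an assumed scale, not a number taken from this thread.

```python
import math


def collision_probability(n_items: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_items share a hash value.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / (2N)),
    where N = 2**hash_bits, and assumes hash outputs are uniformly
    distributed (only roughly true for CRC32 on real URLs).
    """
    space = 2 ** hash_bits
    exponent = n_items * (n_items - 1) / (2 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    # Assumed scale: 1000 registered repositories (hypothetical figure).
    for bits in (32, 128):
        p = collision_probability(1000, bits)
        print(f"{bits}-bit hash, 1000 repositories: p ~= {p:.3e}")
```

Under these assumptions a 32-bit value such as CRC32 gives a collision chance on the order of one in ten thousand, while a 128-bit value such as MD5 makes it negligible. The risk is small but not zero, which is the point of the objection quoted above.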
I think the basic problem is how to render an OAI baseURL as a shorter,
readable string without collisions. The algorithm should be repeatable:
anyone can use the same algorithm to generate the same output given a baseURL.
I will be glad to see other approaches.
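
As a minimal sketch of what "repeatable" could mean in practice, the snippet below derives a short identifier from a baseURL with CRC32 and, for comparison, with MD5. The function names, the whitespace-trimming step, and the example URL are assumptions made for illustration; any real scheme would also need an agreed normalization of the baseURL (trailing slashes, case, and so on).

```python
import hashlib
import zlib


def crc32_id(base_url: str) -> str:
    """Return an 8-hex-digit CRC32 fingerprint of the baseURL."""
    data = base_url.strip().encode("utf-8")
    # Mask to 32 bits so the result is an unsigned value on every platform.
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")


def md5_id(base_url: str) -> str:
    """Return a 32-hex-digit MD5 fingerprint of the baseURL."""
    data = base_url.strip().encode("utf-8")
    return hashlib.md5(data).hexdigest()


if __name__ == "__main__":
    # Hypothetical baseURL, not a real repository.
    url = "http://example.org/oai"
    print("CRC32:", crc32_id(url))
    print("MD5:  ", md5_id(url))
```

Both functions depend only on the input string, so any service provider running the same code on the same baseURL obtains the same identifier. The CRC32 form is a quarter of the length of the MD5 form, at the cost of a much smaller value space.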
> Adam Farquhar.
> Xiaoming Liu wrote:
> On Mon, 20 Oct 2003, Young,Jeff wrote:
> - My hope is that these URLs will be as natural-looking as possible, which
> is why I'm advocating the assignment of meaningful repositoryIdentifiers
> during the registration process, even for repositories that don't use the
> oai-identifier schema.
> I think we all agree it's useful to uniquely identify a repository and all
> its records in a URL-friendly way. Thus different service providers and
> data providers can easily interoperate.
> I just personally feel it's probably easier to agree on an algorithm than
> a centralized registration mechanism. An MD5-generated fingerprint is probably
> too long, but other hashing algorithms (like CRC32) can generate a much
> shorter signature.