[OAI-implementers] Identifiers [was: Re: OAI-PMH + IEEE LTSC LOM]

Andy Powell a.powell@ukoln.ac.uk
Thu, 11 Mar 2004 23:58:03 +0000 (GMT Standard Time)


On Thu, 11 Mar 2004, Chris Hubick wrote:

> In a repository that harvests from a number of different systems through
> a variety of protocols, and has identifiers from many catalog types (not
> necessarily URI's)...
>
> How does one map an arbitrary catalog/entry *pair*, to a *single*
> identifier string?
>
> My answer was to use a URN:
>
> 'urn:' + <catalog> + ':' + <entry>

One problem with this approach is that there is presumably very little
consistency across services in the way that 'catalog' is assigned - i.e.
the 'catalog' is not taken from a controlled vocabulary.  So although you
end up with a single string identifier (the URN), you don't really have
a mechanism for reliably comparing URNs from different sources.
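
To make the comparison problem concrete, here is a minimal Python sketch.
The function name make_urn and the two catalog labels are hypothetical,
chosen only for illustration; the string form is simply the construction
quoted above, not a registered URN namespace.

```python
def make_urn(catalog: str, entry: str) -> str:
    """Build an identifier using the 'urn:' + catalog + ':' + entry form."""
    return "urn:" + catalog + ":" + entry


# Two hypothetical services hold the same entry but label the catalog
# differently, because neither label comes from a controlled vocabulary.
urn_a = make_urn("ISBN", "0140449132")
urn_b = make_urn("isbn-10", "0140449132")

# Plain string comparison is all a harvester has to go on, and it
# cannot tell that both URNs were meant to name the same thing.
same = (urn_a == urn_b)
print(urn_a, urn_b, same)
```

Here 'same' comes out False even though both services meant the same
entry, which is the comparison problem in a nutshell.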

It seems to me that the 'catalog'/'entry' pairing in LOM is a bit broken
- because it really requires a global registry of 'catalog' names to work
properly.  (At least, without a global registry I have no way of
knowing if your 'catalog' is the same as my 'catalog'.)  URIs already
provide a global space within which new identifier schemes can be created
- why not use it, rather than building a LOM-specific registry?

In particular, the proposed 'info' URI scheme

http://info-uri.info/registry/docs/misc/faq.html

provides an open mechanism for assigning URIs to information assets that
have identifiers in public namespaces but have no representation within
URI space.
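
As a rough sketch of how a harvester might use this, the Python below maps
a catalog/entry pair onto an 'info' URI of the form info:<namespace>/<id>.
The table CATALOG_TO_INFO_NS and the function to_info_uri are hypothetical,
and the namespace names 'doi' and 'lccn' are assumed for illustration;
check the registry itself for what is actually registered.

```python
from urllib.parse import quote

# Hypothetical local table: the catalog labels a service happens to use,
# mapped onto the (assumed) 'info' namespace that covers them.
CATALOG_TO_INFO_NS = {
    "DOI": "doi",
    "doi": "doi",
    "LCCN": "lccn",
}


def to_info_uri(catalog, entry):
    """Return an 'info' URI for the pair, or None if the catalog is unknown."""
    namespace = CATALOG_TO_INFO_NS.get(catalog)
    if namespace is None:
        # No shared namespace, so there is nothing comparable to emit.
        return None
    # Percent-encode characters that are not allowed bare in a URI,
    # keeping '/' because identifiers such as DOIs contain it.
    return "info:" + namespace + "/" + quote(entry, safe="/")


print(to_info_uri("DOI", "10.1000/182"))
print(to_info_uri("local-cat", "42"))
```

The point of the sketch is that differing catalog labels collapse onto one
namespace taken from a shared registry, so the resulting URIs can be
compared across sources, while unregistered catalogs are reported as such
rather than being given an identifier nobody else can interpret.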

> Has anyone else tackled this problem?

Not really, but you might be interested in

Guidelines for encoding identifiers in Dublin Core and IEEE LOM metadata
http://www.ukoln.ac.uk/metadata/dcmi-ieee/identifiers/

which basically suggests that URIs should *always* be used.

Andy
--
Distributed Systems, UKOLN, University of Bath, Bath, BA2 7AY, UK
http://www.ukoln.ac.uk/ukoln/staff/a.powell       +44 1225 383933
Resource Discovery Network http://www.rdn.ac.uk/