[OAI-implementers] Minor annoyance - what is the official name of an OAI site?

Alan Kent ajk@mds.rmit.edu.au
Thu, 31 Oct 2002 18:55:30 +1100


One minor annoyance (that I think I have reported before) is that the
list of OAI data providers on the www.openarchives.org site does not
give an 'id' for every site in the XML document it returns.

I then started thinking, what is the id attribute? Is it the repository
name, the repository identifier, or just some other id that people decided
to type in when registering the repository? (I think it's the last.)

So I thought, well OAI 2.0 has the <oaiIdentifier> description stuff now,
so I can go to a site and work out its identifier. The problem is not
everyone supports it. I just went to AIM25 for example (because
it was alphabetically at the start of the list). It returns a
repository name with spaces

    <repositoryName>AIM25 - Archives in London</repositoryName>

but no repository identifier. Doing a ListRecords suggested that the
repository identifier is aim25.ac.uk, but I only guessed this
as a human by looking at the first record that came back.
An aggregator should keep the identifier of the original record, so
a record's identifier may name the source repository rather than the
one being harvested. This is therefore not a reliable approach to use.
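
To make the guess above concrete, here is a minimal sketch of the
ListRecords fallback. It assumes record identifiers follow the
"oai:<repository>:<local id>" pattern; the function name and the sample
identifiers in the comments are my own illustrations, not taken from any
real response. As noted, on an aggregator the result may name the
original repository, so treat it as a hint only.

```python
from typing import Optional


def guess_repository_identifier(record_identifier: str) -> Optional[str]:
    """Return the middle part of an 'oai:<repository>:<local id>' identifier.

    Returns None when the identifier does not follow that pattern.
    Hypothetical helper: the result is only a guess, because an aggregator
    may pass through identifiers minted by another repository.
    """
    # Split at most twice so that colons inside the local id are preserved.
    parts = record_identifier.split(":", 2)
    if len(parts) != 3 or parts[0] != "oai" or not parts[1]:
        return None
    return parts[1]


# Illustrative identifier only (the local id is made up):
#   guess_repository_identifier("oai:aim25.ac.uk:12345")
```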

Is it mandatory that all repositories have a 'repository identifier'?

Is it mandatory that Identify for OAI 2.0 make the identifier available?

Should the list on the open archives web site be updated to make sure
it has the correct repository identifiers for all sites?

I know I can go look up the spec, but I am trying to be provocative
here and elicit responses like "no, but it should be" or "don't be silly".
Do aggregators (who just get other people's data) have repository
identifiers even if they don't have any of their "own" content?

I guess my bottom line is that I think the page on the open archives
web site would be better if it included the official repository identifiers
for each registered data provider. I can write a script to generate my
own XML document (get all the URLs, do an Identify - if not good enough,
do a ListRecords).
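
The Identify step of such a script might look roughly like the sketch
below. It only parses a response that has already been fetched, and it
assumes the description uses the oai-identifier namespace from the OAI
2.0 guidelines (which I believe is
http://www.openarchives.org/OAI/2.0/oai-identifier). The function name and
both sample documents are made up for illustration; example.org is a
placeholder, and the second sample just reuses the AIM25 repository name
shown earlier.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Assumed namespace of the oai-identifier description in OAI 2.0.
OAI_ID_NS = "http://www.openarchives.org/OAI/2.0/oai-identifier"

# Hypothetical, trimmed Identify response that includes the description.
WITH_ID = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <description>
      <oai-identifier xmlns="http://www.openarchives.org/OAI/2.0/oai-identifier">
        <scheme>oai</scheme>
        <repositoryIdentifier>example.org</repositoryIdentifier>
        <delimiter>:</delimiter>
      </oai-identifier>
    </description>
  </Identify>
</OAI-PMH>"""

# Hypothetical, trimmed response with a repository name but no description.
WITHOUT_ID = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>AIM25 - Archives in London</repositoryName>
  </Identify>
</OAI-PMH>"""


def repository_identifier_from_identify(xml_text: str) -> Optional[str]:
    """Return the repositoryIdentifier from an Identify response, or None."""
    root = ET.fromstring(xml_text)
    # Search at any depth for the namespaced repositoryIdentifier element.
    element = root.find(f".//{{{OAI_ID_NS}}}repositoryIdentifier")
    if element is None or not (element.text or "").strip():
        return None
    return element.text.strip()
```

A None result is the signal to fall back to ListRecords, with the
aggregator caveat above.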

I guess I am also encouraging people to go to the effort of including
the <oaiIdentifier> description in their OAI data provider implementations.

Maybe it's just me being pedantic. We have tried to automate the updating
of our list of sites to harvest (for interop testing), but it keeps
getting duplicates.
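
For what it is worth, this is roughly the kind of de-duplication a
reliable repository identifier would allow. It is a sketch only, and it
assumes the duplicates come from the same repository appearing under
slightly different entries; the Site tuple, the function name, and the
URLs in the comment are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# (base URL, repository identifier or None if the site does not report one)
Site = Tuple[str, Optional[str]]


def dedupe_sites(sites: Iterable[Site]) -> List[Site]:
    """Keep the first entry seen for each repository.

    Sites are keyed by repository identifier when one is known, otherwise
    by base URL with any trailing slash removed (a weaker, assumed rule).
    """
    seen: Dict[str, Site] = {}
    for base_url, repo_id in sites:
        if repo_id:
            key = "id:" + repo_id.lower()
        else:
            key = "url:" + base_url.rstrip("/").lower()
        # setdefault keeps the first entry and ignores later duplicates.
        seen.setdefault(key, (base_url, repo_id))
    return list(seen.values())


# Illustrative input (made-up URLs):
#   dedupe_sites([("http://example.org/oai", "example.org"),
#                 ("http://www.example.org/oai/", "example.org")])
```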