I've got some related questions/thoughts on this topic...

> I was wondering what the exact position was with respect to OAI and
> Dublin Core. For example, the OAI spec defines oai_dc and an XML Schema
> specification for it. My question relates to other (non-Dublin Core)
> elements that may be added.
> For example, various groups have defined their own additional elements
> and what they mean. They put these additional elements in their own
> namespace. I am guessing it is not correct for me to include these
> additional elements if oai_dc is requested (I should prune them out).

I guess this depends on how authoritative the XML Schema for the DC 
metadata is deemed to be?

The main schema defines the metadata block for a record as being 
optional (minOccurs="0"), so you can't guarantee that metadata 
will be provided with a GetRecord response. When it is included, the 
oai:metadata element can contain *any* number of elements, 
from any namespace. So one might assume that any amount of metadata 
is permissible within this element.
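
To make that concrete, here is a rough Python sketch of what a harvester 
has to cope with: a record whose metadata block may be missing, or may 
hold children from several namespaces. The OAI namespace URI, the foo 
namespace and the sample record are placeholders I made up for 
illustration; they are not taken from the spec.

```python
import xml.etree.ElementTree as ET

# Assumption: placeholder namespace URI, chosen only for this illustration.
OAI_NS = "http://example.org/oai"

# Hypothetical record: one DC container plus one element from another namespace.
SAMPLE = """<record xmlns="http://example.org/oai">
  <header><identifier>oai:example:1</identifier></header>
  <metadata>
    <dc xmlns="http://purl.org/dc/elements/1.1/"><title>A title</title></dc>
    <foo xmlns="http://example.org/foo">extra</foo>
  </metadata>
</record>"""


def metadata_children(record_xml):
    """Return (namespace, local name) pairs for the children of the
    metadata element, or None if the record has no metadata element."""
    record = ET.fromstring(record_xml)
    metadata = record.find("{" + OAI_NS + "}metadata")
    if metadata is None:
        return None  # metadata is optional, so this case must be handled
    children = []
    for child in metadata:
        if child.tag.startswith("{"):
            # ElementTree spells qualified names as "{namespace}local"
            namespace, _, local = child.tag[1:].partition("}")
        else:
            namespace, local = "", child.tag
        children.append((namespace, local))
    return children
```

Calling metadata_children(SAMPLE) lists both the dc and the foo child, 
which is the "anything goes" reading of the main schema.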

The DC schema allows any number of elements drawn from a specific 
list (subject, creator, etc), although including none of them is 
also valid. So from this one might assume that it is only legal to 
include elements within the oai:metadata/dc:dc element that are explicitly 
given in this schema.

As shown above, though, the main schema would still allow other elements 
alongside it, e.g. oai:metadata/foo:foo. That is probably not the intent.
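
If a data provider did decide to prune before answering an oai_dc 
request, the filtering itself is simple. The following is only a sketch: 
it assumes the DC elements live in the usual 
http://purl.org/dc/elements/1.1/ namespace, and the ext namespace, 
sample data and function names are invented for the example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record body: two DC elements plus one from a made-up
# extension namespace (http://example.org/ext is an assumption).
SAMPLE_DC = """<dc xmlns="http://purl.org/dc/elements/1.1/"
    xmlns:ext="http://example.org/ext">
  <title>A title</title>
  <ext:availability>online</ext:availability>
  <creator>A. Author</creator>
</dc>"""


def prune_non_dc(container):
    """Remove direct children that are not in the DC namespace."""
    dc_prefix = "{" + DC_NS + "}"
    for child in list(container):  # iterate over a copy while removing
        if not child.tag.startswith(dc_prefix):
            container.remove(child)
    return container


def local_names(container):
    """Local names of the direct children, in document order."""
    return [child.tag.split("}", 1)[-1] for child in container]
```

This only checks the namespace of each child, not whether the element 
name is one of those listed in the DC schema; a stricter version would 
validate against the schema itself.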

> Each group should then define their own oai_agls etc identifier

I personally think this is the better option. If the oai_dc metadata 
response is guaranteed to contain a well-defined metadata format, 
then the goal of interoperability seems to be met.

The intent of the spec seems to be that additional, custom metadata formats 
should only be delivered in response to requests for an explicit 

> (probably without the oai_ prefix, if that is reserved for use by
> the protocol).

I think the spec only restricts oai_dc, not the whole oai_* prefix.

On a related note, I'm curious whether service providers are routinely 
validating the records they harvest from repositories?

Also, are there any standard subject classifications being used, or 
are data providers creating their own?



