[OAI-implementers] ABOUT & Schemas & metadataPrefix

Hussein Suleman hussein@vt.edu
Sun, 01 Apr 2001 12:45:50 -0700


Tim Brody wrote:
> I've been playing around with the validator and have found it fails with my
> output, which contains an empty ABOUT section (i.e. just <about></about>).
> From my understanding of the protocol I should be able to have an empty
> about (my logic was that if I was going to make use of this in future I
> would want any harvesters to be able to handle about correctly, so an empty
> tag was better than no tag at all).

i guess your XML looks perfectly reasonable ... the only reason it
doesnt validate is that our schema definition does not currently allow
it ... and thats because we didnt put minOccurs/maxOccurs constraints
on the wildcard inside the about tag (both default to 1, so exactly one
child element is required)

we currently have:
   <any namespace="##any" processContents="lax"/>

and we ought to have:
   <any namespace="##any" processContents="lax" minOccurs="0"/>

so i dont really know what happens now ... can this be fixed, Herbert ?
will too much break if we do this ? or does it get added to our list of
issues for the next iteration ?
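
to make the defaulting rule concrete, here is a minimal python sketch
that reads the two wildcard declarations above and reports the bounds a
validator would apply ... the helper names (occurrence_bounds,
wildcard_bounds, accepts_child_count) and the wrapping xs:sequence are
my own assumptions for illustration, and i have used the final
recommendation's namespace URI, which may not match the schema we
currently publish

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of the final XML Schema Recommendation.
XSD_NS = "http://www.w3.org/2001/XMLSchema"


def occurrence_bounds(particle):
    """Return (min, max) for a schema particle; a missing attribute means 1."""
    min_occurs = int(particle.get("minOccurs", "1"))
    raw_max = particle.get("maxOccurs", "1")
    max_occurs = None if raw_max == "unbounded" else int(raw_max)
    return min_occurs, max_occurs


def wildcard_bounds(fragment):
    """Parse a schema fragment and return the bounds of its <any> wildcard."""
    root = ET.fromstring(fragment)
    wildcard = root.find(f"{{{XSD_NS}}}any")
    return occurrence_bounds(wildcard)


def accepts_child_count(bounds, count):
    """True if `count` child elements satisfy the (min, max) bounds."""
    low, high = bounds
    return count >= low and (high is None or count <= high)


# Hypothetical wrappers around the two declarations quoted above.
CURRENT = (
    f'<xs:sequence xmlns:xs="{XSD_NS}">'
    '<xs:any namespace="##any" processContents="lax"/>'
    "</xs:sequence>"
)
PROPOSED = (
    f'<xs:sequence xmlns:xs="{XSD_NS}">'
    '<xs:any namespace="##any" processContents="lax" minOccurs="0"/>'
    "</xs:sequence>"
)

if __name__ == "__main__":
    for label, fragment in (("current", CURRENT), ("proposed", PROPOSED)):
        bounds = wildcard_bounds(fragment)
        print(label, bounds, "empty about ok:", accepts_child_count(bounds, 0))
```

this only models the occurrence counting, not real schema validation,
but it shows why <about></about> is rejected today and would pass with
minOccurs="0"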

> Does anybody know of a "Howto" for writing OAI schemas, I really don't want
> to spend hours trawling through the incomprehensibility of w3c just to make
> a few tweaks?

OAI schemas are not particularly special ... so you can just look at
whats generally out there on schemas (which, alas, isnt much) ... ive
found that reading and working from the "XML Schema Part 0: Primer" is
sufficient ... 

the w3c site lists one or two 3rd-party tutorials, but i havent seen
any quick-and-dirty howtos yet ...

> Inside the metadata tag is the metadata format tag. Should the name of this
> tag be derived from the schema or as the harvester requested (i.e.
> metadataPrefix)?

it doesnt really matter as far as the protocol goes ... 

by your argument of self-containment, it makes a lot of sense to use the
metadataPrefix for the root element ... this issue is something we've
been debating recently in the context of NDLTD and while we didnt
resolve it, we raised some interesting questions:
- do we write schemata specifically for OAI or should our metadata
schemata be generally applicable ? if the latter, and an OAI metadata
scheme is prefixed with "oai_", we simply cannot encode this into our
schemata ! what do we do then ?
- is there a philosophical difference between a metadata set and the
metadata generated when a particular set is requested ? ie, can i
request "thesis or dissertation" records and receive some XML rooted
with "thesis" and others rooted with "dissertation" ? or do we use
containers ?

should we tag on this "oai_dc vs dc" issue as one for future
reconsideration ?
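
as a rough illustration of the two naming choices, the python sketch
below builds a metadata container whose root element is named either
after the requested metadataPrefix or after the schema's own root, and
then checks what a harvester expecting the prefix would see ...
wrap_metadata and root_matches_prefix are hypothetical helpers, and
namespaces are left out to keep it short, so this is not how any
existing repository does it

```python
import xml.etree.ElementTree as ET


def wrap_metadata(root_name, fields):
    """Build <metadata><root_name>...</root_name></metadata> from (name, value) pairs."""
    metadata = ET.Element("metadata")
    root = ET.SubElement(metadata, root_name)
    for name, value in fields:
        ET.SubElement(root, name).text = value
    return metadata


def root_matches_prefix(metadata, metadata_prefix):
    """True if the metadata container holds one root named like the prefix."""
    children = list(metadata)
    return len(children) == 1 and children[0].tag == metadata_prefix


if __name__ == "__main__":
    fields = [("title", "An Example Thesis")]
    # Root named after the requested metadataPrefix.
    by_prefix = wrap_metadata("oai_dc", fields)
    # Root named after a generally applicable schema.
    by_schema = wrap_metadata("dc", fields)
    for record in (by_prefix, by_schema):
        print(ET.tostring(record, encoding="unicode"),
              root_matches_prefix(record, "oai_dc"))
```

the second record is exactly the case in the first question above: a
schema written for general use cannot know about the "oai_" prefix, so
a harvester keying on the prefix would not recognise its root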

> will validation check that a repository does
> something sensible with its data for Dublin Core as well as making sure it
> passes an XML validator?

right now, unfortunately, no ... because the reference schema validator
does not do regular expression processing yet ... maybe someone will do
this some time in the near future ... but then generality does become an
issue - who defines what is sensible ? i know there is a std date
format, but is there a definition for a std name format ?
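
for the date case at least, a check outside the schema validator is
easy to sketch ... the python below flags dc:date values that do not
look like YYYY, YYYY-MM or YYYY-MM-DD ... the pattern is my own
assumption about what "sensible" means (a small subset of the ISO 8601
style forms, with no range checking on month or day), the sample record
is made up, and suspicious_dates is a hypothetical helper, not part of
any validator

```python
import re
import xml.etree.ElementTree as ET

# Dublin Core element set namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Assumed notion of a "sensible" date: YYYY, YYYY-MM or YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")


def suspicious_dates(record_xml):
    """Return the dc:date values in a record that do not match DATE_PATTERN."""
    root = ET.fromstring(record_xml)
    bad = []
    for element in root.iter(f"{{{DC_NS}}}date"):
        text = (element.text or "").strip()
        if not DATE_PATTERN.fullmatch(text):
            bad.append(text)
    return bad


# Made-up record for illustration only.
SAMPLE = (
    f'<dc xmlns:dc="{DC_NS}">'
    "<dc:title>An Example Thesis</dc:title>"
    "<dc:date>2001-04-01</dc:date>"
    "<dc:date>April 2001</dc:date>"
    "</dc>"
)

if __name__ == "__main__":
    print(suspicious_dates(SAMPLE))
```

names are the harder case: without an agreed format there is nothing
equivalent to DATE_PATTERN to test against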


hussein suleman -- hussein@vt.edu -- vtcs -- http://purl.org/net/hussein