[OAI-implementers] Re: [oai-alpha] Re: Help!
herbert van de sompel
Mon, 29 Jan 2001 10:21:36 -0500
Quite an interesting discussion regarding the exception handling (400,
empty records, et al.).
When writing the specs as a result of the discussion over the weekend 2
weeks ago, my goal was to distinguish between the handling of:
- illegal verbs,
- illegal arguments,
- values (both illegal and not leading to any result).
I thought that was a pretty clear line to draw:
- illegal protocol syntax: usage of illegal verbs and illegal arguments
results in 400
- something wrong with argument values: usage of illegal values and/or
values that lead to no result yield "empty" responses
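
To make that line concrete, here is a minimal sketch in Python of how a repository might apply it. The argument table is abbreviated and illustrative only, and classify_request and lookup are hypothetical names, not anything defined by the spec; lookup is assumed to return an empty list for any value that is illegal or matches nothing.

```python
# Hypothetical sketch of the rule above: protocol syntax errors give 400,
# anything about argument *values* gives a normal (possibly empty) response.
# The table is abbreviated and only illustrative.
LEGAL_ARGUMENTS = {
    "GetRecord": {"identifier", "metadataPrefix"},
    "ListRecords": {"from", "until", "set", "metadataPrefix", "resumptionToken"},
}


def classify_request(verb, arguments, lookup):
    """Return (http_status, records) for a request.

    verb      -- the requested verb, as a string
    arguments -- dict mapping argument names to their values
    lookup    -- hypothetical callable(verb, arguments) returning a list of
                 records; assumed to return [] for illegal or unmatched values
    """
    # Illegal verb: illegal protocol syntax, hence 400.
    if verb not in LEGAL_ARGUMENTS:
        return 400, None
    # Illegal argument name for this verb: also 400.
    if not set(arguments) <= LEGAL_ARGUMENTS[verb]:
        return 400, None
    # Values are never inspected here; a bad value just yields no records.
    return 200, lookup(verb, arguments)
```

With this shape, a request carrying an unsupported metadataPrefix still gets a 200 with an empty record list, while a misspelled verb or argument name gets a 400.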
The current discussion seems to suggest that illegal values should also
result in a 400. But the discussion shows that defining what
an illegal value is isn't all that simple: illegal as to the protocol
(e.g. illegal format for datestamp) or illegal in a certain repository
(e.g. non-supported metadataPrefix), ...
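
The two kinds of "illegal" can be told apart roughly as in the sketch below (Python). The metadataPrefix pattern is the one quoted further down in this thread; the YYYY-MM-DD datestamp format and the names value_problem and supported_prefixes are assumptions made only for illustration.

```python
import re
from datetime import datetime

# Pattern for metadataPrefix as quoted later in this thread.
METADATA_PREFIX_RE = re.compile(r"[a-zA-Z0-9_]+")
# Assumption: datestamps are written as YYYY-MM-DD.
DATESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def datestamp_is_well_formed(value):
    """True if value looks like YYYY-MM-DD and names a real calendar date."""
    if not DATESTAMP_RE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def value_problem(argument, value, supported_prefixes):
    """Classify a value as illegal per 'protocol', per 'repository', or None.

    supported_prefixes is a hypothetical set of the prefixes that this
    particular repository supports.
    """
    if argument in ("from", "until"):
        return None if datestamp_is_well_formed(value) else "protocol"
    if argument == "metadataPrefix":
        if not METADATA_PREFIX_RE.fullmatch(value):
            return "protocol"      # malformed for every repository
        if value not in supported_prefixes:
            return "repository"    # well-formed, but unknown here
    return None
```

Only the "protocol" case can be decided the same way by every repository; the "repository" case depends on local configuration.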
While I do agree that the provision of a datestamp with an illegal
syntax can be considered to be illegal protocol syntax, I remain tempted
to stick with the original concept, whereby everything that is related
to values of arguments is NOT handled with a 400, but with an "empty"
response. Please note that the out-of-context usage of a
resumptionToken falls under the category of "illegal protocol syntax"
because the section on resumptionTokens explicitly says "all other
usage of resumptionTokens is illegal and hence returns 400".
I think that the issue regarding oai-format identifiers that was brought up
supports the above approach: some repos will use oai-formatted
identifiers, others will not. Similarly, other XSDs will be used as
"description" containers, some of which may limit the validity of other
argument values (valid set values, for instance). If we take the return
of 400 down to the level of argument values, and in addition take into
account these repo (or community) specific issues in the decision
whether an argument value is "legal" or "illegal" (hence in deciding
whether to return 400 or not), all repos will do exception handling in
different manners. I am not too enthusiastic about that idea.
I am very interested in comments.
Simeon Warner wrote:
> On Sat, 27 Jan 2001, ePrints Support wrote:
> > (from oai-alpha)
> > On Fri, Jan 26, 2001 at 06:26:33PM -0700, Simeon Warner wrote:
> > > I agree that the following can be illegal in fairly obvious
> > > ways:
> > > from & until (illegal dates)
> > > identifier (illegal uri)
> > > resumptionToken (spec says illegal use and expired will give 400)
> > >
> > > However, according to the schemas of verbs that return values that
> > > will be used for set and metadataPrefix, they can also be illegal:
> > > set (doesn't match "([a-zA-Z0-9_])+(:[A-Za-z0-9]+)*")
> > > metadataPrefix (doesn't match "[a-zA-Z0-9_]+")
> > from & until: Illegal dates (I agree)
> > identifier: what's an illegal uri?
> > a) one which doesn't match oai:[a-z]+:.* (regexp may be slightly off)
> > or
> > b) one which doesn't match oai:nameofarchive:.*
> > I suggest (a) - it's possible that an archive could mirror OAI records
> > from another archive AS WELL as its own. (Isn't it?)
> My feeling was that only a) qualifies for a 400 response.
> > resumptionToken: If an archive doesn't support this
> > then I suppose it should always give a 400 error.
> > Isn't there an 'expired' return code in http? it's confusing giving
> > the same response for 'illegal' and 'expired'
> The spec says there should be 400 in both cases. Any sensible
> harvester will know that it is giving back a once-valid
> resumptionToken and hence 400 => expired.
> > set: not matching ([a-zA-Z0-9_])+(:[A-Za-z0-9]+)* is a 400 error but
> > how about a set which passes the spec but isn't in the archive?
> > I suggest it just returns a header with no results in that case.
> The second case should certainly return header with no results, only
> illegal value (not matching regexp) gives 400.
> > metadataPrefix: similar. Not matching [a-zA-Z0-9_]+ is illegal (400)
> > but what happens if it passes the regexp but isn't in the list
> > supported by the archive?
> > Again, I suggest it just returns a header with no results in this case.
> Again, unrecognized/unsupported should return header with no results,
> only illegal value gives 400.
> > Other queries:
> > oai_dc: When should we put a 'oai_' before the metadataPrefix,
> > exactly what does it mean (why isn't it just dc?)
> My understanding is that the metadataPrefixes are simply strings
> returned by ListMetadataFormats which may be reused in requests
> that specify a metadataPrefix to request metadata according to the
> corresponding schema in the ListMetadataFormats response.
> Further, 'oai_dc' is the name oai has chosen to refer to dc
> by (and everyone must support it and not call it 'wibble'
> instead). Given that the metadataPrefixes are just shorthand
> names to refer to the schema, I don't know why it was necessary
> to add the 'oai_'.
> > inside <metadata></metadata> is ANYTHING defined by OAI or
> > is everything, including the initial tags <dc></dc> and namespace
> > etc. defined by the metadata standard?
> As far as I understand it (which is not really very well), everything
> from the initial <dc ...> tag to the </dc> is specified by the
> dc schema (http://www.openarchives.org/OAI/dc.xsd), or other
> schema for other metadata formats. In the dc schema it says:
> <element name="dc" type="dc:dublincoreType"/>
> > --
> > Christopher Gutteridge
> > ePrints Technical Support
> > email@example.com
> > _______________________________________________
> > OAI-implementers mailing list
> > OAIfirstname.lastname@example.org
> > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers
Herbert Van de Sompel
Visiting Assistant Professor
Cornell University -- Computer Science
tel + 1 - 607 - 255 - 3085
fax + 1 - 607 - 255 - 4428
digital life in libraries used to be primitive