[OAI-implementers] XML Schemas and Xerces again

herbert van de sompel herbertv@cs.cornell.edu
Wed, 25 Apr 2001 09:46:52 -0400


hi Jeff,

you are well on your way to becoming the master schema-debugger!

Since I share your impression that the schema are correct for the issues
you list (while Xerces generates an error), I think we may be witnessing
the following:

* there doesn't seem to be a validator out there that we can fully
trust.  XSV does not attempt to validate several things (the regular
expression facet, for instance, as far as I can tell), and other
validators seem to generate errors when nothing is really wrong.  this
happened when we were validating the schema with XML Spy before
releasing the protocol.  just as Xerces generates an error for
uriReference in relation to "identifier", XML Spy generated an error
for requestURL.  as a result, we changed the type of requestURL from
uriReference to string, even though the requestURL as we use it in the
OAI protocol is clearly a valid URI!

* there may be a Schema version issue at play.  remember we are using
the spec that corresponds to the http://www.w3.org/2000/10/XMLSchema
namespace, for which the XSV validator is at
http://www.w3.org/2000/09/webdata/xsv .  as you know, there is a more
recent XML Schema version that has reached the status of "proposed
recommendation" (not yet "recommendation"), with namespace
http://www.w3.org/2001/XMLSchema and XSV validator at
http://www.w3.org/2001/03/webdata/xsv .  the validator for the recent
spec will definitely not validate schema that are written to be
compliant with the older spec.  as I mentioned before, we will have to
make a decision about what to do with our OAI Schema.  we are currently
still investigating the issues involved.
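
To see which of the two namespaces a given schema document is written
against, a small check along the following lines may help. This is only
a sketch: the class and method names (SchemaNamespaceCheck,
schemaNamespace, describe) are made up for illustration, and it only
looks at the namespace of the root element. It says nothing about
whether a particular validator will accept the schema.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class SchemaNamespaceCheck {
    // The two namespaces named in this message.
    static final String NS_2000_10 = "http://www.w3.org/2000/10/XMLSchema";
    static final String NS_2001 = "http://www.w3.org/2001/XMLSchema";

    /** Returns the namespace URI of the root element, or null if it has none. */
    static String schemaNamespace(String schemaText) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Without namespace awareness getNamespaceURI() would return null.
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(schemaText)));
            return doc.getDocumentElement().getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed XML document", e);
        }
    }

    /** Maps a namespace URI to a short, informal label (labels are illustrative). */
    static String describe(String ns) {
        if (NS_2000_10.equals(ns)) return "October 2000 draft";
        if (NS_2001.equals(ns)) return "2001 version";
        return "unknown";
    }

    public static void main(String[] args) {
        // Hypothetical one-line schema, just to exercise the check.
        String sample = "<xsd:schema xmlns:xsd='" + NS_2000_10 + "'/>";
        String ns = schemaNamespace(sample);
        System.out.println(describe(ns) + " (" + ns + ")");
    }
}
```

A schema whose root element is in one namespace should then be sent to
the XSV instance that matches it.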

many greetings

herbert

"Young,Jeff" wrote:
> 
> I'm happy to say that the status=deleted problem appears to be resolved.
> Unfortunately, I now seem to have a different (unrelated) problem. Someone
> reported to me that Xerces 1.3.1 is reporting an XML schema error where
> 1.3.0 didn't. It seems that I had failed to call setErrorHandler() which is
> key to reporting any validation errors. Xerces 1.3.0 let this slide where
> 1.3.1 complains about it. Now that I've corrected this oversight, I'm
> seeing some parser errors related to the XML schema. I've attached another
> small demo application that shows the effects. To add to the confusion,
> 1.3.0 reports a different error than does 1.3.1.
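
For readers who have not hit this before, the role of setErrorHandler()
can be sketched as below. This is not Jeff's Test.java, which is not
reproduced here. The class names (ValidationErrorDemo,
CollectingHandler) are invented, and to stay self-contained the sketch
validates against an inline DTD rather than an XML Schema. It relies
only on the standard JAXP and SAX interfaces. The point is that
validity errors arrive through ErrorHandler.error(), so they can only
be collected and acted on if a handler is installed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class ValidationErrorDemo {
    /** Records recoverable (validity) errors instead of letting them pass unnoticed. */
    static class CollectingHandler implements ErrorHandler {
        final List<String> errors = new ArrayList<>();

        public void warning(SAXParseException e) {
            // Warnings are ignored in this sketch.
        }

        public void error(SAXParseException e) {
            // Validity errors are recoverable: the parse continues after this call.
            errors.add(e.getMessage());
        }

        public void fatalError(SAXParseException e) throws SAXParseException {
            // Well-formedness errors cannot be recovered from.
            throw e;
        }
    }

    /** Parses xml with validation on and returns the validity errors that were reported. */
    static List<String> validate(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // DTD validation in this sketch
            DocumentBuilder builder = factory.newDocumentBuilder();
            CollectingHandler handler = new CollectingHandler();
            builder.setErrorHandler(handler);
            builder.parse(new InputSource(new StringReader(xml)));
            return handler.errors;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String dtd = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b EMPTY>]>";
        System.out.println(validate(dtd + "<a><b/></a>")); // valid against the DTD
        System.out.println(validate(dtd + "<a/>"));        // missing the required <b/>
    }
}
```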
> 
> Using Xerces 1.3.0, the demo application produces:
> 
> error
> org.xml.sax.SAXParseException: Datatype error: In element 'identifier' :
> Value 'oai:etdcat:ocm02999966' is a Malformed URI .
>         at org.apache.xerces.framework.XMLParser.reportError(XMLParser.java:1068)
>         at org.apache.xerces.validators.common.XMLValidator.checkContent(XMLValidator.java:3609)
>         at org.apache.xerces.validators.common.XMLValidator.callEndElement(XMLValidator.java:1133)
>         at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1201)
>         at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:381)
>         at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:952)
>         at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:123)
>         at Test.main(Test.java:34)
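
As an independent cross-check of the "Malformed URI" complaint, the
identifier can be run through a general-purpose URI parser. The sketch
below uses java.net.URI, which follows the generic URI syntax of
RFC 2396. The class and method names (UriSyntaxCheck,
isSyntacticallyValid) are made up. Passing this check does not show
what the 2000/10 draft requires of uriReference, nor why Xerces 1.3.0
rejects the value. It only shows that the string is acceptable as a
generic URI.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriSyntaxCheck {
    /** Returns true if java.net.URI accepts the value as a URI reference. */
    static boolean isSyntacticallyValid(String value) {
        try {
            new URI(value);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) throws URISyntaxException {
        String id = "oai:etdcat:ocm02999966"; // identifier from the error message
        System.out.println(id + " valid: " + isSyntacticallyValid(id));

        // The value parses as an opaque URI: a scheme plus a scheme-specific part.
        URI uri = new URI(id);
        System.out.println("scheme: " + uri.getScheme());
        System.out.println("scheme-specific part: " + uri.getSchemeSpecificPart());
    }
}
```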
> 
> Using Xerces 1.3.1, the demo produces:
> 
> error
> org.xml.sax.SAXParseException: The content of element type "metadata" must
> match "##any:uri=http://www.openarchives.org/OAI/1.0/OAI_ListRecords".
>         at org.apache.xerces.framework.XMLParser.reportError(XMLParser.java:1067)
>         at org.apache.xerces.validators.common.XMLValidator.reportRecoverableXMLError(XMLValidator.java:1689)
>         at org.apache.xerces.validators.common.XMLValidator.callEndElement(XMLValidator.java:1353)
>         at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1205)
>         at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:381)
>         at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:952)
>         at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:172)
>         at Test.main(Test.java:34)
> 
> As far as I can tell, the schema look fine. My assumption, at this point, is
> that Xerces is at fault and my only recourse is to turn off validation. I must
> also admit the possibility that my program is flawed in some way. On the
> slim chance that I've found the 2nd and 3rd XML schema errors within the
> span of a week, though, I thought I'd pass along my findings.
> 
>  <<Test.java>>
> Cheers,
> 
> Jeff
> 
> ---
> Jeffrey A. Young
> Senior Consulting Systems Analyst
> Office of Research, Mail Code 710
> OCLC Online Computer Library Center, Inc.
> 6565 Frantz Road
> Dublin, OH   43017-3395
> www.oclc.org
> 
> Voice:  614-764-4342
> Voice:  800-848-5878, ext. 4342
> Fax:    614-718-7477
> Email:  jyoung@oclc.org
> 