[OAI-implementers] XML Schema problem?

Jeffrey A. Young jyoung1@columbus.rr.com
Fri, 20 Apr 2001 19:10:48 -0400


Someone noticed that my OAIHarvester hasn't been working correctly lately. It
turns out that the Xerces XML parser is convinced that every record I harvest
is flagged as status="deleted". Since this clearly isn't the case, I stripped
the program down to a small example that shows the effect. The Java source
code is attached. Basically, if I call setValidating(true) on the
DocumentBuilderFactory and then parse the XML into a DOM Document, the parser
silently "corrects" my records to status="deleted". If I dump the Document,
everything looks fine, but when I actually query the status attribute, it
comes back with a value of "deleted". On the other hand, if I specify
setValidating(false), everything works fine. I suspect the problem is that the
XML Schema needs to make the status attribute optional. Another possibility is
that Xerces is processing the XML Schema incorrectly. I can work around the
problem by always using setValidating(false), but that doesn't seem right. If
someone has a better solution, I would appreciate hearing it. Thanks.
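
One way to tell a status value written in the record apart from one the parser supplied as a default is the DOM Attr.getSpecified() flag. The following is a minimal sketch, not the harvester's code: the class StatusCheck, the helper explicitStatus, and the inline DTD are all hypothetical, and the DTD default only stands in for whatever the validating parser derives from the OAI schema. Whether a given Xerces release sets this flag for schema-supplied defaults is an assumption to verify.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class StatusCheck {
    // Hypothetical inline DTD that defaults status to "deleted". It is an
    // assumption used for illustration and is not the OAI schema.
    static final String DTD =
        "<!DOCTYPE ListRecords ["
        + "<!ELEMENT ListRecords (record*)>"
        + "<!ELEMENT record EMPTY>"
        + "<!ATTLIST record status CDATA \"deleted\">"
        + "]>";

    /** Returns the status only if the document itself carried it, else "". */
    static String explicitStatus(String xml) {
        try {
            DocumentBuilder parser =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = parser.parse(new InputSource(new StringReader(xml)));
            Element rec = (Element) doc.getElementsByTagName("record").item(0);
            Attr status = rec.getAttributeNode("status");
            // getSpecified() is false when the parser filled in a default value.
            if (status != null && status.getSpecified()) {
                return status.getValue();
            }
            return "";
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String defaulted = DTD + "<ListRecords><record/></ListRecords>";
        String explicit =
            DTD + "<ListRecords><record status=\"deleted\"/></ListRecords>";
        System.out.println("defaulted: [" + explicitStatus(defaulted) + "]");
        System.out.println("explicit:  [" + explicitStatus(explicit) + "]");
    }
}
```

With this helper, a record that merely inherits the default yields an empty string, while a record that really carries status="deleted" yields "deleted", so validation could stay enabled.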

Jeff

---
Jeffrey A. Young
Senior Consulting Systems Analyst
Office of Research, Mail Code 710
OCLC Online Computer Library Center, Inc.
6565 Frantz Road
Dublin, OH   43017-3395
www.oclc.org

Voice:	614-764-4342
Fax:		614-764-2344
Email:	jyoung@oclc.org



Attachment: Test.java

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import java.io.*;
import org.w3c.dom.*;
import org.xml.sax.*;

public class Test {
    public static void main(String[] args) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

            // The single command-line argument switches validation on or off.
            String arg = "";
            if (args.length == 1)
                arg = args[0];
            if (arg.equals("true")) {
                factory.setValidating(true);
            } else if (arg.equals("false")) {
                factory.setValidating(false);
            } else {
                System.err.println("Usage: java Test [true|false]");
                System.exit(-1);
            }

            factory.setNamespaceAware(true);
            DocumentBuilder parser = factory.newDocumentBuilder();

            // A ListRecords response whose single record has no status attribute.
            String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<ListRecords"
                + " xmlns=\"http://www.openarchives.org/OAI/1.0/OAI_ListRecords\""
                + " xmlns:xsi=\"http://www.w3.org/2000/10/XMLSchema-instance\""
                + " xsi:schemaLocation=\"http://www.openarchives.org/OAI/1.0/OAI_ListRecords"
                + " http://www.openarchives.org/OAI/1.0/OAI_ListRecords.xsd\">"
                + "<responseDate>2001-04-20T14:48:40-05:00</responseDate>"
                + "<requestURL>http://orc:4342/etdcat/servlet/OAIHandler"
                + "?metadataPrefix=oai_dc&amp;verb=ListRecords</requestURL>"
                + "<record><header>"
                + "<identifier>oai:etdcat:ocm02999966</identifier>"
                + "<datestamp>2001-02-02</datestamp>"
                + "</header><metadata>"
                + "<dc xmlns=\"http://purl.org/dc/elements/1.1/\""
                + " xmlns:xsi=\"http://www.w3.org/2000/10/XMLSchema-instance\""
                + " xsi:schemaLocation=\"http://purl.org/dc/elements/1.1/"
                + " http://www.openarchives.org/OAI/dc.xsd\"></dc>"
                + "</metadata></record>"
                + "<resumptionToken>987796143360:100:oai_dc</resumptionToken>"
                + "</ListRecords>";

            StringReader sr = new StringReader(xml);
            InputSource is = new InputSource(sr);
            Document doc = parser.parse(is);
            Element docEl = doc.getDocumentElement();
            NodeList list = docEl.getElementsByTagName("record");
            Element recEl = (Element)list.item(0);

            // Query the status attribute of the first record; the message above
            // reports "deleted" here when validation is on.
            System.out.println("status = " + recEl.getAttribute("status"));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
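
The test program installs no ErrorHandler, and without one a parser typically does not turn recoverable validation errors into exceptions. The sketch below assumes that collecting the parser's own complaints would help decide between a schema problem and a parser problem. The class ValidationReport and the method validate are hypothetical names, and the small inline DTD in main is there only to keep the example self-contained; it is not the OAI schema.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class ValidationReport {
    /** Parses xml with validation on and returns every problem reported. */
    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<String>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            factory.setNamespaceAware(true);
            DocumentBuilder parser = factory.newDocumentBuilder();
            // Record warnings and recoverable errors instead of losing them.
            parser.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    problems.add("warning: " + e.getMessage());
                }
                public void error(SAXParseException e) {
                    problems.add("error: " + e.getMessage());
                }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // not recoverable, so let the parse fail
                }
            });
            parser.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            problems.add("fatal: " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        // Hypothetical DTD: element r must be empty.
        String dtd = "<!DOCTYPE r [<!ELEMENT r EMPTY>]>";
        System.out.println(validate(dtd + "<r/>"));
        System.out.println(validate(dtd + "<r><x/></r>"));
    }
}
```

An empty list for the harvested response would suggest the parser considers it valid and is applying a default on purpose; a non-empty list would show what it objects to.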
