[OAI-implementers] OAI validation problem

Young,Jeff jyoung@oclc.org
Thu, 22 May 2003 13:40:45 -0400


I would argue that OAI has done an excellent job of decoupling OAI responses
so they can stand alone. These two cases related to deleted records are the
only exceptions to this so far.

I've been amazed by the potential of including an XSLT stylesheet reference
with my OAI responses so they can be rendered in a browser. Herbert, Thom
Hickey, and I plan to publish a paper in D-Lib in part to demonstrate this,
and I will be presenting some examples at the ALA conference next month. 
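
For readers unfamiliar with the mechanism: an XSLT stylesheet reference is an
xml-stylesheet processing instruction placed before the root element of the
response. A minimal Python sketch of adding one follows. The function name
add_stylesheet and the stylesheet file name oai2.xsl are made up for
illustration and do not come from this thread, and the sketch assumes the
response text starts directly with the XML declaration, if it has one.

```python
from xml.sax.saxutils import quoteattr


def add_stylesheet(response_xml: str, href: str) -> str:
    """Insert an xml-stylesheet processing instruction before the root element.

    `href` is the (hypothetical) location of the XSLT stylesheet.
    """
    # quoteattr() escapes the value and wraps it in quotes.
    pi = f'<?xml-stylesheet type="text/xsl" href={quoteattr(href)}?>'
    if response_xml.startswith("<?xml"):
        # Keep the XML declaration first; the instruction goes right after it.
        end = response_xml.index("?>") + 2
        return response_xml[:end] + "\n" + pi + response_xml[end:]
    return pi + "\n" + response_xml


if __name__ == "__main__":
    sample = '<?xml version="1.0" encoding="UTF-8"?>\n<OAI-PMH/>'
    print(add_stylesheet(sample, "oai2.xsl"))
```

A browser that honours the instruction applies the stylesheet to that one
response only, which is why each response has to be interpretable on its own.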

This type of coupling between responses prevents me from doing anything
useful with deleted records because browsers can only deal with one response
at a time.

I understand that OAI is a harvesting protocol and that I'm pushing the
boundaries, but it would be a shame to reject the possibilities when it's so
close to being much more than that.

Jeff

> -----Original Message-----
> From: Simeon Warner [mailto:simeon@cs.cornell.edu]
> Sent: Thursday, May 22, 2003 1:24 PM
> To: 'OAI-implementers (E-mail)'
> Subject: RE: [OAI-implementers] OAI validation problem
> 
> 
> 
> There seem to be two issues:
> 
> 1) the metadata format of a record is identified via a metadataPrefix
> which may only be indirectly linked to a schemaLocation via the
> ListMetadataFormats response. 
> 
> I have previously wondered whether we should have avoided introducing
> metadataPrefix at all and used just the namespace URI. However, given that
> we have this level of indirection I'm not sure it is a bad thing to have a
> single canonical place (the ListMetadataFormats response) for the
> information linking metadataPrefix to namespace and schemaLocation. My
> harvester does a ListMetadataFormats request before harvesting to check the
> metadataPrefix is supported anyway.
> 
> 2) continuation responses include just a resumptionToken and not a
> complete set of initial parameters for the request. These responses are
> thus not self-contained.
> 
> I've always thought that one should regard the complete list, the set of
> all responses from the first (with initial request recorded) to the last
> (indicated by null resumptionToken), as "the response". In this case there
> is no ambiguity.
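
The "complete list as the response" view can be sketched in Python as a loop
that records the initial metadataPrefix and attaches it to every record,
including those that arrive in continuation responses. This is only a sketch
under stated assumptions: harvest and fetch are hypothetical names, and fetch
is assumed to take a dict of query parameters and return the response XML as
text.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 namespace


def harvest(fetch, metadata_prefix, **params):
    """Yield (metadata_prefix, record) pairs across a whole ListRecords list.

    `fetch` is a hypothetical callable: dict of query parameters -> XML text.
    """
    query = {"verb": "ListRecords", "metadataPrefix": metadata_prefix, **params}
    while True:
        root = ET.fromstring(fetch(query))
        for record in root.iter(OAI + "record"):
            # The prefix comes from the recorded initial request, not from
            # the current (possibly continuation) response.
            yield metadata_prefix, record
        token = root.find(f"{OAI}ListRecords/{OAI}resumptionToken")
        if token is None or not (token.text or "").strip():
            return  # a missing or empty resumptionToken ends the list
        # A continuation request carries only the verb and the token.
        query = {"verb": "ListRecords", "resumptionToken": token.text.strip()}
```

A harvester written this way never needs a continuation response to name its
own format, but a browser showing a single response has no such loop to lean
on.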
> 
> Cheers,
> Simeon.
> 
> 
> On Thu, 22 May 2003, Young,Jeff wrote:
> > It's worse than I thought. If the deleted records occur in a resumptionToken
> > ListRecords response, there is nothing whatsoever in the response to
> > indicate which format relates to the deleted records. Not only are deleted
> > records coupled to ListMetadataFormats responses, they are also coupled to
> > the initial ListRecords response!
> > 
> > Jeff
> > 
> > > -----Original Message-----
> > > From: Young,Jeff 
> > > Sent: Thursday, May 22, 2003 11:53 AM
> > > To: Young,Jeff; 'Hussein Suleman'; OAI-implementers (E-mail)
> > > Subject: RE: [OAI-implementers] OAI validation problem
> > > 
> > > 
> > > Here's another example of where I'm having trouble processing 
> > > OAI responses with XSLT. Below is a deleted record:
> > > 
> > >   <record>
> > >     <header status="deleted">
> > >       <identifier>oai:arXiv.org:hep-th/9901007</identifier>
> > >       <datestamp>1999-12-21</datestamp>
> > >     </header>
> > >   </record>
> > > 
> > > This response indicates the deletion of a record in a 
> > > particular metadataFormat, not the deletion of the item. The 
> > > problem is that there is nothing in the response to indicate 
> > > which metadataFormat is being deleted except by looking at 
> > > /OAI-PMH/request/@metadataPrefix. Unfortunately, this isn't 
> > > deterministic because different repositories may use 
> > > different labels to refer to the same schemaLocation, so 
> > > again I'm forced to look up the metadataPrefix using a 
> > > separate ListMetadataFormats response. It would be nice if 
> > > the xsi:schemaLocation was immediately present.
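
The lookup described above can be sketched in Python. The helper names
format_table and schema_for_response are hypothetical; the element names
(metadataFormat, metadataPrefix, schema, metadataNamespace, request) are
those of OAI-PMH 2.0 responses. The sketch assumes both responses come from
the same repository.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 namespace


def format_table(list_metadata_formats_xml):
    """Map each metadataPrefix to its (metadataNamespace, schema) pair."""
    root = ET.fromstring(list_metadata_formats_xml)
    table = {}
    for fmt in root.iter(OAI + "metadataFormat"):
        prefix = fmt.findtext(OAI + "metadataPrefix")
        table[prefix] = (
            fmt.findtext(OAI + "metadataNamespace"),
            fmt.findtext(OAI + "schema"),
        )
    return table


def schema_for_response(response_xml, table):
    """Return (namespace, schema) for a response, or None if unknown.

    Relies on the metadataPrefix attribute of the request element, so it
    yields None whenever that attribute is absent.
    """
    root = ET.fromstring(response_xml)
    request = root.find(OAI + "request")
    prefix = request.get("metadataPrefix") if request is not None else None
    return table.get(prefix)
```

Two separate documents are needed to answer one question about a deleted
record, which is the coupling being objected to here.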
> > > 
> > > Perhaps something like this would be ideal.
> > > 
> > >   <record>
> > >     <header status="deleted">
> > >       <identifier>oai:arXiv.org:hep-th/9901007</identifier>
> > >       <datestamp>1999-12-21</datestamp>
> > >     </header>
> > >     <metadata xsi:schemaLocation="..." />
> > >   </record>
> > > 
> > > It's a shame, but it's probably too late to fix this problem.
> > > 
> > > Jeff
> > > 
> > > > -----Original Message-----
> > > > From: Young,Jeff [mailto:jyoung@oclc.org]
> > > > Sent: Thursday, May 22, 2003 9:16 AM
> > > > To: 'Hussein Suleman'; OAI-implementers (E-mail)
> > > > Subject: RE: [OAI-implementers] OAI validation problem
> > > > 
> > > > 
> > > > I've always made a point of being willfully ignorant about XML Schemas,
> > > > but it's time I gave it a try. The trick seems to be to define an
> > > > abstract type to use in place of <any namespace="##other".../> This
> > > > abstract type would then require the xsi:schemaLocation.
> > > > 
> > > > So, in place of this:
> > > > 
> > > > <complexType name="metadataType">
> > > > 	<sequence>
> > > > 		<any namespace="##other" processContents="strict"/>
> > > > 	</sequence>
> > > > </complexType>
> > > > 
> > > > do something like:
> > > > 
> > > > <xsd:complexType name="abstractContent" abstract="true">
> > > > 	<xsd:sequence minOccurs="1" maxOccurs="1">
> > > > 		<any namespace="##other" processContents="strict" />
> > > > 	</xsd:sequence>
> > > > </xsd:complexType>
> > > > 
> > > > <complexType name="metadataType">
> > > > 	<xsd:complexContent>
> > > > 		<xsd:extension base="abstractContent">
> > > > 			<xsd:attribute name="xsi:schemaLocation" type="xsd:string" use="required" />
> > > > 		</xsd:extension>
> > > > 	</xsd:complexContent>
> > > > </complexType>
> > > > 
> > > > 
> > > > If that isn't right, then maybe it's something like this:
> > > > 
> > > > <xsd:element name="Content" abstract="true" type="abstractContent" />
> > > > 
> > > > <xsd:complexType name="abstractContent">
> > > >     <sequence>
> > > >         <any namespace="##other" processContents="strict" />
> > > >     </sequence>
> > > >     <xsd:attribute name="xsi:schemaLocation" type="xsd:string" use="required" />
> > > > </xsd:complexType>
> > > > 
> > > > <complexType name="metadataType">
> > > >     <xsd:sequence>
> > > >         <xsd:element ref="Content" minOccurs="1" maxOccurs="1" />
> > > >     </xsd:sequence>
> > > > </complexType>
> > > > 
> > > > No promises, though..
> > > > 
> > > > Jeff
> > > > 
> > > > 
> > > > > -----Original Message-----
> > > > > From: Hussein Suleman [mailto:hussein@cs.uct.ac.za]
> > > > > Sent: Wednesday, May 21, 2003 5:55 PM
> > > > > To: OAI-implementers (E-mail)
> > > > > Subject: Re: [OAI-implementers] OAI validation problem
> > > > > 
> > > > > 
> > > > > hi Jeff
> > > > > 
> > > > > some random thoughts ...
> > > > > - could the schema be modified to reflect a required xsi:schemaLocation
> > > > > attribute? that might be the easiest fix.
> > > > > - alternatively, does DOM3 propagate schema information like DOM2
> > > > > propagates namespaces? if so, then there might be a method to directly
> > > > > retrieve the schema for a given node/element.
> > > > > 
> > > > > ttfn,
> > > > > ----hussein
> > > > > 
> > > > > 
> > > > > Young,Jeff wrote:
> > > > > > I think I found a hole in the OAI validation mechanisms. I believe the
> > > > > > contents of the <metadata> element should be required to have an
> > > > > > xsi:schemaLocation attribute to make it easier to identify the schema for
> > > > > > the data. Without it, harvesters are forced to use the
> > > > > > /oai2:OAI-PMH/oai2:request/@metadataPrefix value and look it up in the
> > > > > > ListMetadataFormats response, which is more trouble than having it
> > > > > > immediately available as an attribute.
> > > > > > 
> > > > > > The examples in the OAI protocol document do show it as an attribute, but
> > > > > > apparently the Repository Explorer and the Registration validation available
> > > > > > on the OAI site don't check for it.
> > > > > > 
> > > > > > Jeff
> > > > > > 
> > > > > > ---
> > > > > > Jeffrey A. Young
> > > > > > Consulting Software Engineer
> > > > > > Office of Research, Mail Code 710
> > > > > > OCLC Online Computer Library Center, Inc.
> > > > > > 6565 Frantz Road
> > > > > > Dublin, OH   43017-3395
> > > > > > www.oclc.org
> > > > > > 
> > > > > > Voice:	614-764-4342
> > > > > > Voice:	800-848-5878, ext. 4342
> > > > > > Fax:	614-718-7477
> > > > > > Email:	jyoung@oclc.org
> > > > > > 
> > > > > > 
> > > > > > _______________________________________________
> > > > > > OAI-implementers mailing list
> > > > > > List information, archives, preferences and to unsubscribe:
> > > > > > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers
> > > > > > 
> > > > > 
> > > > > -- 
> > > > > =====================================================================
> > > > > hussein suleman ~ hussein@cs.uct.ac.za ~ http://www.husseinsspace.com
> > > > > =====================================================================
> > > > 