[OAI-implementers] A question for implementers of harvesters

Gary McGath gary at hulmail.harvard.edu
Tue May 15 12:06:45 EDT 2007

This interests me from both the OAI and the format description sides. I 
don't see any particular difficulty in having the HTML tags in a 
separate XSD; the additional work should be minor at most. The 
transparency of using the real HTML tags, rather than renamed stand-ins, 
should make up for any additional programming effort.
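
To illustrate why I expect the extra work to be small, here is a rough
sketch in Python of what a harvester might do with a record under option 1.
The namespace URI, the root element name FDD, and the sample record are all
made up for illustration; only notesText and the idea of unqualified HTML
tags come from the proposal. A namespace-aware parser already keeps the two
vocabularies apart, and the HTML fragment can be copied out as-is.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and root element name; the real fdd.xsd may differ.
FDD_NS = "http://example.org/fdd"

# Made-up record: fdd elements are prefixed, HTML tags are unqualified.
SAMPLE = f"""\
<fdd:FDD xmlns:fdd="{FDD_NS}">
  <fdd:notesText>See the <a href="http://www.digitalpreservation.gov/formats/">formats site</a> for <em>details</em>.</fdd:notesText>
</fdd:FDD>"""


def split_by_namespace(xml_text):
    """Return (fdd_names, unqualified_names) for all elements, in document order."""
    fdd_names, unqualified = [], []
    for elem in ET.fromstring(xml_text).iter():
        if elem.tag.startswith("{" + FDD_NS + "}"):
            # ElementTree spells qualified names as "{uri}local"
            fdd_names.append(elem.tag.split("}", 1)[1])
        elif not elem.tag.startswith("{"):
            # No namespace at all: these are the plain HTML tags
            unqualified.append(elem.tag)
    return fdd_names, unqualified


def notes_as_html(xml_text):
    """Copy the mixed content of the first notesText element out as an HTML fragment."""
    notes = ET.fromstring(xml_text).find(f"{{{FDD_NS}}}notesText")
    parts = [notes.text or ""]
    for child in notes:
        # tostring() serializes the child together with its trailing text
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


if __name__ == "__main__":
    print(split_by_namespace(SAMPLE))
    print(notes_as_html(SAMPLE))
```

Under option 2 the same copy step would need a lookup table from the
renamed elements back to their HTML equivalents, which is the extra
conversion work Caroline mentions.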

Caroline Arms wrote:
> This question is a structural question about XML Schema in the OAI context.   We have a project to develop an XML Schema for the structured Format Description Documents (FDDs) in http://www.digitalpreservation.gov/formats/.  One objective is to make the FDDs OAI-harvestable.
> One aspect of the FDDs (example: http://www.digitalpreservation.gov/formats/fdd/fdd000022.shtml) is that the text in many of the table cells needs to be richer than plain text. We have identified a small set of HTML tags that we want to be able to use in a complexType to be called notesText. We want the schema to limit us to that small set of tags as we edit FDDs.
> We have two structural options for the fdd.xsd schema.  We plan to have the fdd.xsd schema use the namespace "fdd" and require that element names be qualified with their namespace.
> 1.  We could put the HTML tags into a separate .xsd file (to be included in fdd.xsd) that does not require the elements to be qualified.  This way we could use the HTML tags directly within the chunks of notesText.  They would be easily recognized, have familiar semantics, and be handled trivially in XSLT transformations to HTML.
> 2.  For the convenience of having the specification in a single .xsd file, we could use different names (presumably in the fdd namespace).  These would be more cumbersome to read in the raw XML and need more work to convert to HTML.
> What do you see as pros and cons between these options? For example, will OAI-harvesting applications have problems with the two-schema approach (option 1)?
> Thanks for any feedback.   I will also be asking potential users of the harvested FDDs the equivalent question.

Gary McGath
Digital Library Software Engineer
Harvard University Library Office for Information Systems
