[OAI-implementers] A question for implementers of harvesters

Caroline Arms caar at loc.gov
Tue May 15 11:39:35 EDT 2007

This question is a structural question about XML Schema in the OAI context.   We have a project to develop an XML Schema for the structured Format Description Documents (FDDs) in http://www.digitalpreservation.gov/formats/.  One objective is to make the FDDs OAI-harvestable.

One aspect of the FDDs (example: http://www.digitalpreservation.gov/formats/fdd/fdd000022.shtml) is that the text in many of the table cells needs to be richer than plain text. We have identified a small set of HTML tags that we want to be able to use in a complexType to be called notesText. We want the schema to limit us to that small set of tags as we edit FDDs.

We have two structural options for the fdd.xsd schema.  We plan to have the fdd.xsd schema use the namespace "fdd" and require that element names be qualified with their namespace.

1.  We could put the HTML tags into a separate .xsd file (to be included in fdd.xsd) that does not require the elements to be qualified.  This way we could use the HTML tags directly within the chunks of notesText.  They would be easily recognized, have familiar semantics, and be handled trivially in XSLT transformations to HTML.

2.  For the convenience of having the specification in a single .xsd file, we could use different names (presumably in the fdd namespace).  These would be more cumbersome to read in the raw XML and need more work to convert to HTML.
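
For illustration only, the raw XML under the two options might look roughly like the two fragments below. The namespace URI, the element name fdd:notes, the renamed elements fdd:italic and fdd:link, and the href value are all placeholders invented for this sketch; none of them comes from an actual FDD schema.

```xml
<!-- Option 1 (sketch): HTML tags appear unqualified inside the fdd element -->
<fdd:notes xmlns:fdd="http://example.org/fdd">
  See the <i>baseline</i> profile described in
  <a href="http://example.org/spec">the specification</a>.
</fdd:notes>

<!-- Option 2 (sketch): the same markup, renamed into the fdd namespace -->
<fdd:notes xmlns:fdd="http://example.org/fdd">
  See the <fdd:italic>baseline</fdd:italic> profile described in
  <fdd:link href="http://example.org/spec">the specification</fdd:link>.
</fdd:notes>
```

In the first fragment an XSLT stylesheet could copy the inner tags through to HTML unchanged; in the second it would need a template mapping each renamed element back to its HTML equivalent.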

What do you see as the pros and cons of these options?  For example, will OAI-harvesting applications have problems with the two-schema approach (option 1)?

Thanks for any feedback.   I will also be asking potential users of the harvested FDDs the equivalent question.

Caroline Arms
Library of Congress, Office of Strategic Initiatives
caar at loc.gov
