[OAI-implementers] A question for implementers of harvesters

Naomi Dushay Naomi at cs.cornell.edu
Tue May 15 13:16:33 EDT 2007


Caroline,
 
I've found the "best practices" documents at www.xfront.com excellent at
describing just the sort of trade-offs you're evaluating:
 
http://www.xfront.com/BestPracticesHomepage.html
 
- Naomi Dushay
Coordinator of Research & Development
Colorado State University Libraries
 

________________________________

From: oai-implementers-bounces at openarchives.org on behalf of Caroline Arms
Sent: Tue 5/15/2007 11:39 AM
To: oai-implementers at openarchives.org
Cc: Carl Fleischhauer; Ignacio GarciaDelCampo; Donald Emerson; Andrew Boyko
Subject: [OAI-implementers] A question for implementers of harvesters




This question is a structural question about XML Schema in the OAI context.
We have a project to develop an XML Schema for the structured Format
Description Documents (FDDs) in http://www.digitalpreservation.gov/formats/.
One objective is to make the FDDs OAI-harvestable.

One aspect of the FDDs (example:
http://www.digitalpreservation.gov/formats/fdd/fdd000022.shtml) is that the
text in many of the table cells needs to be richer than plain text. We have
identified a small set of HTML tags that we want to be able to use in a
complexType to be called notesText. We want the schema to limit us to that
small set of tags as we edit FDDs.

We have two structural options for the fdd.xsd schema.  We plan to have the
fdd.xsd schema use the namespace "fdd" and require that element names be
qualified with their namespace.

1.  We could put the HTML tags into a separate .xsd file (to be included in
fdd.xsd) that does not require the elements to be qualified.  This way we
could use the HTML tags directly within the chunks of notesText.  They would
be easily recognized, have familiar semantics, and be handled trivially in
XSLT transformations to HTML.

2.  For the convenience of having the specification in a single .xsd file, we
could use different names (presumably in the fdd namespace).  These would be
more cumbersome to read in the raw XML and need more work to convert to HTML.
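
To make the difference between the two options concrete, here is a minimal
sketch of how a namespace-aware parser might see each kind of instance
document. It is only an illustration: the namespace URI, the sample text, and
the option 2 element names (emphasis, link) are assumptions made up for this
example and are not taken from the actual fdd.xsd design.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the post only mentions the name "fdd".
FDD_NS = "http://example.org/fdd"

# Option 1: familiar HTML tags appear unqualified inside a qualified notesText.
OPTION_1 = f"""
<fdd:notesText xmlns:fdd="{FDD_NS}">See <em>TIFF 6.0</em> and
<a href="http://example.org/spec">the spec</a>.</fdd:notesText>
"""

# Option 2: the same markup under different (hypothetical) names,
# all in the fdd namespace.
OPTION_2 = f"""
<fdd:notesText xmlns:fdd="{FDD_NS}">See <fdd:emphasis>TIFF 6.0</fdd:emphasis> and
<fdd:link href="http://example.org/spec">the spec</fdd:link>.</fdd:notesText>
"""


def child_tags(xml_text):
    """Return the child element names exactly as ElementTree reports them."""
    root = ET.fromstring(xml_text.strip())
    return [child.tag for child in root]


option_1_tags = child_tags(OPTION_1)
option_2_tags = child_tags(OPTION_2)

print(option_1_tags)  # unqualified names, usable as HTML tag names as-is
print(option_2_tags)  # names carry the fdd namespace and need mapping to HTML
```

In this sketch the option 1 children come back as plain HTML names, while the
option 2 children come back in the fdd namespace, so a consumer would need an
explicit mapping step before emitting HTML.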

What do you see as the pros and cons of these options? For example, will
OAI-harvesting applications have problems with the two-schema approach
(option 1)?

Thanks for any feedback. I will also be asking potential users of the
harvested FDDs the equivalent question.

Caroline Arms
Library of Congress, Office of Strategic Initiatives
caar at loc.gov




