[OAI-implementers] HTML in CDATA section in Dublin Core

WSimpson at wiley.co.uk WSimpson at wiley.co.uk
Mon Aug 9 10:59:09 EDT 2004

Thank you both, Jeff and Simeon, for your help.

FYI, Jeff, I am trying to put the formatted abstract of an article into the
description field of Dublin Core.

I think an alternative meta-data prefix could be in order, though I am not
sure exactly what this will be.

Simeon Warner <simeon at cs.cornell.edu>
04/08/2004 18:43

To:      WSimpson at wiley.co.uk
cc:      oai-implementers at openarchives.org
Subject: RE: [OAI-implementers] HTML in CDATA section in Dublin Core

I agree that including markup in the simple DC is a bad idea. The DC
usage guide (http://dublincore.org/documents/usageguide/elements.shtml)
puts it quite well:

> "Descriptive information can be copied or automatically extracted from
> the item if there is no abstract or other structured description
> available.  Although the source of the description may be a web page or
> other structured text with presentation tags, it is generally not good
> practice to include HTML or other structural tags within the Description
> element.  Applications vary considerably in their ability to interpret
> such tags, and their inclusion may negatively affect the
> interoperability of the metadata."

As Jeff suggests, you can still provide richer metadata with a second
metadataPrefix (mantra: "OAI supports multiple parallel metadata
formats"). That way you can express richer metadata for applications that
understand it, without breaking the base-level interoperability provided
by DC.
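
One way to act on this is to derive the plain-text description for the DC
record from the formatted abstract, and keep the marked-up version for the
second metadataPrefix. The following is only a minimal sketch in Python: the
helper names (plain_text, dc_description) and the sample abstract are
hypothetical, and it assumes the abstract is a small HTML fragment.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


class _TextOnly(HTMLParser):
    """Collects character data and discards all tags."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &deg; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def handle_endtag(self, tag):
        # Keep adjacent paragraphs from running together.
        if tag == "p":
            self.parts.append(" ")


def plain_text(html_fragment):
    """Return the fragment's text with tags removed and whitespace collapsed."""
    parser = _TextOnly()
    parser.feed(html_fragment)
    parser.close()
    return " ".join("".join(parser.parts).split())


def dc_description(html_fragment):
    """Build a dc:description element that holds plain text only."""
    elem = ET.Element("{%s}description" % DC_NS)
    elem.text = plain_text(html_fragment)
    return elem


# Hypothetical abstract, for illustration only.
abstract = "<p>Solubility of CO<sub>2</sub> in H<sub>2</sub>O at 25 &deg;C.</p>"
print(plain_text(abstract))
```

Note that the subscripts are flattened in the DC version ("CO2", "H2O"); that
loss is the reason for offering the richer format in parallel.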


On Wed, 4 Aug 2004, Young,Jeff wrote:
> I'm sure it's a bad idea.
> What are you trying to describe with that markup? If the markup reflects
> meaningful structure, you should consider creating an XML schema to
> describe it and provide that content under a separate metadataPrefix.
> Worst case, you could provide that content as XHTML under a separate
> metadataPrefix (called, say, 'xhtml').
> Jeff
> > -----Original Message-----
> > From: WSimpson at wiley.co.uk [mailto:WSimpson at wiley.co.uk]
> > Sent: Wednesday, August 04, 2004 11:32 AM
> > To: oai-implementers at openarchives.org
> > Subject: [OAI-implementers] HTML in CDATA section in Dublin Core
> >
> > Hi,
> >
> > I am generating Dublin Core meta-data for a customer as part of an OAI
> > static repository. The customer has asked me to put HTML formatting
> > elements (such as paragraph, subscript, and superscript) and HTML
> > character entities inside elements of the Dublin Core XML
> > representation. They are suggesting that I do so by wrapping the HTML
> > in a CDATA section. While this will validate against the static
> > repository schema, it does not seem to be in the spirit of Dublin
> > Core. Is such an approach an acceptable use of Dublin Core within an
> > OAI record, or should only plain text be used?
> >
> > Many thanks,
> >
> > Will.
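
For reference, the reason the CDATA approach validates is that a CDATA
section is just another way of escaping character data. A rough Python
sketch, assuming a made-up description value, shows that a parser hands back
the same string either way, so a harvester receives markup as literal text
rather than as structure:

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical description wrapped in a CDATA section.
with_cdata = (
    '<dc:description xmlns:dc="%s">'
    '<![CDATA[<p>H<sub>2</sub>O</p>]]>'
    '</dc:description>' % DC_NS
)

# The same content with the markup characters entity-escaped instead.
escaped = (
    '<dc:description xmlns:dc="%s">'
    '&lt;p&gt;H&lt;sub&gt;2&lt;/sub&gt;O&lt;/p&gt;'
    '</dc:description>' % DC_NS
)

a = ET.fromstring(with_cdata)
b = ET.fromstring(escaped)

# Both forms parse to identical character data; neither has child elements.
print(a.text == b.text)
print(len(a), len(b))
```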
