[OAI-implementers] Qualified Dublin Core

Timothy W. Cole t-cole3 at uiuc.edu
Tue Aug 10 15:46:31 EDT 2004


I take your point, Jeff, and Simeon's as well, but I still don't entirely
agree. Let me preface my response as follows:

There seem to be three major arguments (at least) that have been advanced
for why OAI should host a canonical container XSD for qualified DC similar
to the oai_dc.xsd currently hosted by OAI for simple DC:

reason 1. It would facilitate OAI metadata providers who want to provide
metadata records in qualified DC and it is uncertain when or even whether
DCMI will ever choose to assign a top-level DCQ namespace and make an XSD
usable for that purpose. For the reasons Pete discussed, DCMI doesn't seem
inclined to want to do this.

reason 2. It would cut down on redundancies in namespaces and schemas, and
thereby facilitate development of web services / automatic crosswalk
applications like the ones you're building.

reason 3. Harvesters need the assurance that comes from harvesting records
only in canonical formats.

I'm becoming convinced of the strength of reason 1. Though I still have
qualms about whether OAI really is the best location for canonical metadata
schemas and namespaces of this sort, at least the XSD we're talking about,
since it imports namespaces and schemas that are maintained on the DCMI
site, would be relatively simple and low maintenance. So, maybe adding an
OAI-blessed XSD for qualified DC wouldn't be a bad idea in and of itself
(though of course it sets a pesky precedent, and I have no idea how those
closer to day-to-day maintenance of the OAI site feel about doing this).

I'm unconvinced of reasons 2 and 3, based on the following:

- Offhand, I know of 5 XSDs currently being used by OAI metadata providers
for metadata formats based on qualified DC. They are (in order of frequency
of use):

http://www.language-archives.org/OLAC/1.0/olac.xsd

http://ns.nsdl.org/schemas/nsdl_dc/nsdl_dc_v1.02.xsd
(I'm treating the v1.00 and v1.01 versions of the NSDL XSD as congruent with
v1.02 for this discussion.) 

http://IMLSDCC.grainger.uiuc.edu/schemas/cdp_dc_v1.00.xsd

http://cicharvest.grainger.uiuc.edu/schemas/QDC/2004/07/14/CICQualifiedDC.xsd

http://epubs.cclrc.ac.uk/xsd/qdc.xsd

The first three XSDs all add to qualified DC; that is, they all include
additional elements, refinements, and/or encoding schemes not included in
the dc, dcterms, or dcmitype namespaces. So sites using the first 3 schemas
would likely not be able to switch over to a "pure" canonical qualified DC
schema even if one were available. The 4th XSD on the list is for a project
we're just starting here at Illinois with partners in the CIC. At present
our CIC qualified DC schema does not augment qualified DC, but my
expectation is that it will do so soon, so again, we'll likely not be able
to switch over to a pure canonical XSD for qualified DC even if one becomes
available.

This suggests to me that we're going to see a large number of instances
where projects choose to extend qualified DC, in most cases for reasons and
needs very specific to their local projects. The most frequent additional
extensions needed seem to be in the form of added encoding schemes. 

So I'm not sure how successful posting a canonical XSD for qualified DC will
be in keeping a lid on the number of XSDs and namespaces used in the OAI
universe, and I'm not sure harvesters can count on (or really should ask)
providers to export both in their extended qualified DC and a canonical form
of qualified DC. Doing so might simplify (slightly) the service provider's
task, but I'm not sure it's a compelling case, especially from the
perspective of the data provider. (A complicating consideration is that any
of the above schemas could actually be used by someone who did just have
pure qualified DC -- since all augmentations are entirely optional).

Hence my suggestion that harvesters, cross-walks, transformations, and other
such services might do better to key off of embedded namespaces rather than
specific XSDs or even top-level namespaces. And though clearly data providers
could go out of their way to import qualified DC namespaces into their local
XSDs and then not use those namespaces, that seems unlikely -- so I disagree
with Simeon that harvesters should steer clear of locally augmented formats
based on qualified DC on such an assumption or for fear they won't be able
to extract enough useful information from records that maybe do contain
additional content in other namespaces. Possibly there's some risk, but it
seems to me that a dc:title element or dcterms:created refinement still
means much the same whether embedded in a canonical qualified DC record or
in a CDP DC augmented qualified DC record.
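
To make the "key off of embedded namespaces" idea concrete, here is a
minimal Python sketch. It is only an illustration, not code from any of the
projects mentioned: the container namespace http://example.org/local_qdc/,
the sample record, and the function name extract_dcmi_elements are all
made up. Only the dc and dcterms namespace URIs are the real DCMI ones.

```python
import xml.etree.ElementTree as ET

# Real DCMI namespace URIs, mapped to their conventional prefixes.
DCMI_NAMESPACES = {
    "http://purl.org/dc/elements/1.1/": "dc",
    "http://purl.org/dc/terms/": "dcterms",
}


def extract_dcmi_elements(record_xml):
    """Return (prefixed_name, text) pairs for DCMI elements, ignoring the container."""
    root = ET.fromstring(record_xml)
    found = []
    for elem in root.iter():
        # ElementTree spells qualified names as "{namespace-uri}local-name".
        if not isinstance(elem.tag, str) or not elem.tag.startswith("{"):
            continue
        uri, local = elem.tag[1:].split("}", 1)
        prefix = DCMI_NAMESPACES.get(uri)
        if prefix is not None:
            found.append((f"{prefix}:{local}", (elem.text or "").strip()))
    return found


# Hypothetical record: a local container and one local element around DC content.
SAMPLE = """<local:record xmlns:local="http://example.org/local_qdc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:title>Field notes</dc:title>
  <dcterms:created>2004-08-10</dcterms:created>
  <local:thumbnailIdentifier>http://example.org/thumb/1.jpg</local:thumbnailIdentifier>
</local:record>"""

pairs = extract_dcmi_elements(SAMPLE)
```

The local container element and the local thumbnailIdentifier are simply
skipped, so the same function would treat a "pure" qualified DC record and
an augmented one alike.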

And I don't see why you couldn't write an XSLT to crosswalk from qualified
DC to MARC that could be applied not only to records of "pure" qualified DC,
but also to OLAC DC or NSDL DC, or CDP DC. Obviously such a generic
crosswalk would drop local encoding schemes like olac:linguistic-type,
olac:linguistic-field, and nsdl:GEM, and refinements like
cdp:holdingInstitutions and cdp:thumbnailIdentifier, but for many purposes
that would be okay, especially if the XSLT were smart enough to take
advantage of any xs:substitutionGroup information contained in the XSDs
referenced by the instances (e.g., so as to know that
cdp:thumbnailIdentifier was a refinement of dc:identifier). 
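
As a rough sketch of how a tool might pick up that substitutionGroup
information, the Python below reads an XSD and records, for each element
declared with a substitutionGroup, which element it substitutes for. The
schema text, its target namespace http://example.org/local_qdc/, and the
function name substitution_heads are assumptions made up for illustration;
this is not the actual CDP schema. The sketch also simplifies by treating
prefix declarations as global rather than scoped.

```python
import io
import xml.etree.ElementTree as ET

XS_SCHEMA = "{http://www.w3.org/2001/XMLSchema}schema"
XS_ELEMENT = "{http://www.w3.org/2001/XMLSchema}element"


def substitution_heads(xsd_text):
    """Map '{targetNamespace}name' to the '{namespace}name' it substitutes for."""
    prefixes = {}   # prefix -> namespace URI (simplification: no scoping)
    target_ns = ""
    heads = {}
    source = io.BytesIO(xsd_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        elif payload.tag == XS_SCHEMA:
            target_ns = payload.get("targetNamespace", "")
        elif payload.tag == XS_ELEMENT and payload.get("substitutionGroup"):
            # substitutionGroup holds a QName such as "dc:identifier".
            prefix, _, local = payload.get("substitutionGroup").rpartition(":")
            head = "{%s}%s" % (prefixes.get(prefix, ""), local)
            heads["{%s}%s" % (target_ns, payload.get("name"))] = head
    return heads


# Hypothetical local schema declaring one refinement of dc:identifier.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    targetNamespace="http://example.org/local_qdc/">
  <xs:element name="thumbnailIdentifier" substitutionGroup="dc:identifier"/>
</xs:schema>"""

heads = substitution_heads(SAMPLE_XSD)
```

A crosswalk could consult such a map to treat an unknown local element as
its DC head element instead of dropping it outright.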

It may be a little more work, but given the actual trend to date of data
providers wanting to augment qualified DC with local semantics, I think
we'll need to build our applications smart enough to deal with such
diversity. And if we do that, it doesn't really matter if there is a
multiplicity of top-level container schemas for qualified DC (as long as
they all reference the appropriate DCMI component namespaces).

Tim Cole
University of Illinois at UC 

-----Original Message-----
From: oai-implementers-bounces at openarchives.org
[mailto:oai-implementers-bounces at openarchives.org] On Behalf Of Young,Jeff
Sent: Tuesday, August 10, 2004 8:51 AM
To: Simeon Warner; oai-implementers at oaisrv.nsdl.cornell.edu
Subject: RE: [OAI-implementers] Qualified Dublin Core

I agree with Simeon. Lately I've been creating dynamically configured web
applications built with distributed independent web services (OAI and SRW in
particular). The more schemas and protocols they have in common, the more
magic that can happen. Hiding DCQ elements behind differing namespace
containers and schemas would greatly diminish their value.

For example, I am working with Jean Godby on a catalog of XSLT crosswalks
(http://errol.oclc.org/schemaTrans.oclc.org.search). Redundant namespaces
will only clutter it up and make it harder for people to choose an
appropriate crosswalk from a list when they need one.

Jeff

> -----Original Message-----
> From: Simeon Warner [mailto:simeon at cs.cornell.edu]
> Sent: Monday, August 09, 2004 7:02 PM
> To: oai-implementers at oaisrv.nsdl.cornell.edu
> Subject: RE: [OAI-implementers] Qualified Dublin Core
> 
> 
> I think the problem with the "standard elements in any wrapper" approach
> is that a harvester has no easy way to know up-front what it might be
> getting if it harvests records in a particular format. Harvesting a
> metadata format understood to be "only QDC elements" (nothing else, no
> funny business) gives rather more assurance of intelligibility to a
> harvester that understands QDC. A canonical schema seems the simplest
> way to indicate this (notwithstanding versioning issues mentioned
> earlier by Pete Johnston).
> 
> Cheers,
> Simeon
> 
> On Mon, 9 Aug 2004, Timothy W. Cole wrote:
> > Jeff-
> >
> > My take is that OAI shouldn't want to get back in the business of
> > hosting schemas or namespaces for metadata formats. We went to some
> > trouble to get away from that when transitioning from 1.1 to 2.0. A
> > blessed application namespace for qualified DC should be left up to
> > the DCMI.
> >
> > While DCMI decides how they want to handle things (and I know they
> > won't be quick), solutions like the one at CCLRC (we've done much the
> > same here for a couple of projects) and NSDL are fine. Any
> > namespace-aware OAI application should ignore the locally created
> > namespace and home in on the dc, dcterms, and dcmitype namespaces and
> > thereby be able to use those elements without any problems. If your
> > OAI service provider respects namespaces, it shouldn't matter what
> > namespace the container element is in -- that's why the XML Schemas
> > posted on the DCMI Website were done that way.
> >
> > Isn't that good enough for harvesting purposes? Or am I missing a
> > subtle consideration that requires a canonical namespace for the
> > container element?
> >
> > Tim Cole
> > University of Illinois at UC
> >
> > -----Original Message-----
> > From: oai-implementers-bounces at openarchives.org
> > [mailto:oai-implementers-bounces at openarchives.org] On Behalf Of
> > Young,Jeff
> > Sent: Monday, August 09, 2004 10:33 AM
> > To: Mascord, M (Matthew) ; Young,Jeff; 
> > oai-implementers at oaisrv.nsdl.cornell.edu
> > Cc: LeVan,Ralph; Hickey,Thom
> > Subject: RE: [OAI-implementers] Qualified Dublin Core
> >
> > Yes, this is the problem. Now I have two to choose from: this one and
> > the one created by NSDL. I'm sure there are others out there. For the
> > sake of interoperability, it seems to me that the OAI community should
> > bless (and host?) such an "application profile" schema.
> >
> > Jeff
> >
> > > -----Original Message-----
> > > From: Mascord, M (Matthew) [mailto:M.Mascord at rl.ac.uk]
> > > Sent: Monday, August 09, 2004 11:21 AM
> > > To: 'Young,Jeff'; oai-implementers at oaisrv.nsdl.cornell.edu
> > > Cc: LeVan,Ralph; Hickey,Thom
> > > Subject: RE: [OAI-implementers] Qualified Dublin Core
> > >
> > > Hi -
> > >
> > > I am the developer of an OAI compatible institutional repository
> > > for the UK research council CCLRC.  The URL is
> > > http://epubs.cclrc.ac.uk.
> > > We are attempting to capture & make publicly accessible any
> > > scientific research that has benefitted from the use of CCLRC's
> > > facilities or expertise.  We recently went live on the OAI
> > > repository network:
> > > http://epubs.cclrc.ac.uk/oai?verb=Identify.
> > >
> > > We provide metadata in both Simple and Qualified Dublin Core but 
> > > had the same problem as you in finding an authoritative XML schema 
> > > for Qualified Dublin Core.  In the end we created our own that 
> > > includes the schema defined at 
> > > http://dublincore.org/schemas/xmls/qdc/2003/04/02/qualifieddc.xsd 
> > > and described in the Dublin Core Note at 
> > > http://dublincore.org/schemas/xmls/qdc/2003/04/02/notes/.  This 
> > > defines a container element into which elements from the dcterms 
> > > and dc namespaces may be placed.
> > >
> > > I'm not sure if this is the best approach so would appreciate any 
> > > feedback on this.  Our OAI implementation can be tested at 
> > > http://epubs.cclrc.ac.uk/oaitest.
> > >
> > > Kind Regards,
> > > Matthew Mascord
> > > e-Library Software Developer, CCLRC, UK
> > >
> > >
> > > > -----Original Message-----
> > > > From: oai-implementers-bounces at openarchives.org
> > > > [mailto:oai-implementers-bounces at openarchives.org]On Behalf Of 
> > > > Young,Jeff
> > > > Sent: 09 August 2004 16:03
> > > > To: oai-implementers at oaisrv.nsdl.cornell.edu
> > > > Cc: LeVan,Ralph; Hickey,Thom
> > > > Subject: [OAI-implementers] Qualified Dublin Core
> > > >
> > > >
> > > > I'm looking for an XML Schema for Qualified Dublin Core for use 
> > > > in OAI repositories. I poked around the UIUC OAI Registry, but 
> > > > all I found was a couple of ad hoc schemas used by repositories 
> > > > that appear to be defunct.
> > > > Ideally, though, the existence and use of such a schema should 
> > > > be shared across a broad community and not ad hoc.
> > > >
> > > > Next, I searched in Google and OAForum but all I found was a 
> > > > reference to a preliminary effort to establish such a schema 
> > > > (http://www.ukoln.ac.uk/metadata/dcmi/xmlschema/). This 
> > > > particular document discusses a sample application schema for a 
> > > > DCQ container, but the implication is that the final schema must 
> > > > be decided by the specific application (e.g OAI?). Apparently, 
> > > > this has never been done.
> > > >
> > > > Can someone provide some guidance for doing this?
> > > >
> > > > Thanks.
> > > >
> > > > Jeff
> > > >
> > > > ---
> > > > Jeffrey A. Young
> > > > Software Architect
> > > > > Office of Research, Mail Code 710
> > > > > OCLC Online Computer Library Center, Inc.
> > > > 6565 Frantz Rd.
> > > > Dublin, OH 43017-3395
> > > > www.oclc.org
> > > >
> > > > Voice: 614-764-4342
> > > > Voice: 800-848-5878, ext. 4342
> > > > Fax: 614-718-7477
> > > > Email: jyoung at oclc.org
> > > >
> > > > _______________________________________________
> > > > OAI-implementers mailing list
> > > > List information, archives, preferences and to unsubscribe:
> > > > http://openarchives.org/mailman/listinfo/oai-implementers
> > > >