[OAI-implementers] Qualified Dublin Core

Young,Jeff jyoung at oclc.org
Tue Aug 10 16:29:01 EDT 2004


Thanks for your comments, Tim. Here are my thoughts.

I'll have to review Pete's arguments, but it occurs to me that if all
creators of metadata formats (e.g. Library of Congress) punted on the
container issue, efforts like mine to build interoperable distributed
systems would be seriously taxed. I share your concern, though, that OAI
isn't the best host for a generic DCQ schema for the same reason it's not
the best host for oai_dc. For example, I want to disseminate DC records via
SRW. I feel a little dirty using the oai_dc schema, but what would be
better? (I have some radical ideas about how this could be resolved, but
they are unlikely to catch on. For the terminally curious, check out
http://errol.oclc.org/xmlregistry.oclc.org.html?set=XMLSchemas&metadataPrefix=oai_dc&verb=ListRecords).

I have no qualms about the proliferation of container schemas that merely
include DCQ other than to say I hope communities reuse them whenever
appropriate. My only concern is the proliferation of containers containing
pure DCQ. Specifically, my plan is to add a pure DCQ metadataFormat to
DSpace and my decision on the matter will affect dozens of repositories.
Because of the impact, I would hate for my uncritical proprietary solution
to become the de facto standard.

Jeff

> -----Original Message-----
> From: Timothy W. Cole [mailto:t-cole3 at uiuc.edu]
> Sent: Tuesday, August 10, 2004 3:47 PM
> To: 'Young,Jeff'; 'Simeon Warner';
> oai-implementers at oaisrv.nsdl.cornell.edu
> Subject: RE: [OAI-implementers] Qualified Dublin Core
> 
> I take your point, Jeff, and Simeon's as well, but I still don't entirely
> agree. Let me preface my response as follows:
> 
> There seem to be three major arguments (at least) that have been advanced
> for why OAI should host a canonical container XSD for qualified DC similar
> to the oai_dc.xsd currently hosted by OAI for simple DC:
> 
> reason 1. It would facilitate OAI metadata providers who want to provide
> metadata records in qualified DC and it is uncertain when or even whether
> DCMI will ever choose to assign a top-level DCQ namespace and make an XSD
> usable for that purpose. For the reasons Pete discussed, DCMI doesn't seem
> inclined to want to do this.
> 
> reason 2. It would cut down on redundancies in namespaces and schemas, and
> thereby facilitate development of web services / automatic crosswalk
> applications like the ones you're building.
> 
> reason 3. Harvesters need the assurance that harvesting records only in
> canonical formats brings.
> 
> I'm becoming convinced of the strength of reason 1. Though I still have
> qualms about whether OAI really is the best location for canonical
> metadata schemas and namespaces of this sort, at least the XSD we're
> talking about,
> since it imports namespaces and schemas that are maintained on the DCMI
> site, would be relatively simple and low maintenance. So, maybe adding an
> OAI-blessed XSD for qualified DC wouldn't be a bad idea in and of itself
> (though of course it sets a pesky precedent, and I have no idea how those
> closer to day-to-day maintenance of the OAI site feel about doing this).
> 
> I'm unconvinced of reasons 2 and 3, based on the following:
> 
> - There are, that I know of offhand, 5 XSDs currently being used by OAI
> metadata providers for metadata formats based on qualified DC. They are
> (in order of frequency of use):
> 
> http://www.language-archives.org/OLAC/1.0/olac.xsd
> 
> http://ns.nsdl.org/schemas/nsdl_dc/nsdl_dc_v1.02.xsd
> (I'm treating the v1.00 and v1.01 versions of the NSDL XSD as congruent
> with v1.02 for this discussion.)
> 
> http://IMLSDCC.grainger.uiuc.edu/schemas/cdp_dc_v1.00.xsd
> 
> http://cicharvest.grainger.uiuc.edu/schemas/QDC/2004/07/14/CICQualifiedDC.xsd
> 
> http://epubs.cclrc.ac.uk/xsd/qdc.xsd
> 
> The first three XSDs all add to qualified DC, that is they all include
> additional elements, refinements, and/or encoding schemes not included in
> the dc, dcterms, or dcmitype namespaces. So sites using the first 3
> schemas would likely not be able to switch over to a "pure" canonical
> qualified DC schema even if one were available. The 4th XSD on the list
> is for a project we're just starting here at Illinois with partners in
> the CIC. At present our CIC qualified DC schema does not augment
> qualified DC, but my expectation is that it will do so soon, so again,
> we'll likely not be able to switch over to a pure canonical XSD for
> qualified DC even if one becomes available.
> 
> This suggests to me that we're going to see a large number of instances
> where projects choose to extend qualified DC, in most cases for reasons
> and needs very specific to their local projects. The most frequent
> additional extensions needed seem to be in the form of added encoding
> schemes.
> 
> So I'm not sure how successful posting a canonical XSD for qualified DC
> will be in keeping a lid on the number of XSDs and namespaces used in the
> OAI universe, and I'm not sure harvesters can count on (or really should
> ask) providers to export both in their extended qualified DC and a
> canonical form of qualified DC. Doing so might simplify (slightly) the
> service provider's task, but I'm not sure it's a compelling case,
> especially from the perspective of the data provider. (A complicating
> consideration is that any of the above schemas could actually be used by
> someone who did just have pure qualified DC -- since all augmentations
> are entirely optional).
> 
> Hence my suggestion that harvesters, cross-walks, transformations, and
> other such services might do better to key off of embedded namespaces
> rather than specific XSD or even top-level namespaces. And though clearly
> data providers could go out of their way to import qualified DC
> namespaces into their local XSDs and then not use those namespaces, that
> seems unlikely -- so I disagree with Simeon that harvesters should steer
> clear of locally augmented formats based on qualified DC on such an
> assumption or for fear they won't be able to extract enough useful
> information from records that maybe do contain additional content in
> other namespaces. Possibly there's some risk, but it seems to me that a
> dc:title element or dcterms:created refinement still means much the same
> whether embedded in a canonical qualified DC record or in a CDP DC
> augmented qualified DC record.
> 
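
The namespace-keyed approach described above can be sketched in a few lines of Python (the thread itself only mentions XSLT, so the language is an arbitrary choice). The sketch collects every element in the dc and dcterms namespaces and never looks at the container element. The sample record, its http://example.org/local-qdc/ namespace, and the function names are invented for illustration and do not come from any of the schemas listed in this thread.

```python
import xml.etree.ElementTree as ET

# DCMI component namespaces referred to in the thread.
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
DCMI_NAMESPACES = {DC: "dc", DCTERMS: "dcterms"}


def split_tag(tag):
    """Split an ElementTree tag of the form '{uri}local' into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def extract_dcmi(record_xml):
    """Return (prefixed name, text) pairs for every dc/dcterms element,
    whatever namespace the container element happens to be in."""
    root = ET.fromstring(record_xml)
    found = []
    for elem in root.iter():
        uri, local = split_tag(elem.tag)
        if uri in DCMI_NAMESPACES:
            name = f"{DCMI_NAMESPACES[uri]}:{local}"
            found.append((name, (elem.text or "").strip()))
    return found


# Hypothetical record: the container and the 'x' namespace are made up.
SAMPLE = """<x:record xmlns:x="http://example.org/local-qdc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:title>Sample title</dc:title>
  <dcterms:created>2004-08-10</dcterms:created>
  <x:thumbnail>http://example.org/t.jpg</x:thumbnail>
</x:record>"""

print(extract_dcmi(SAMPLE))
# [('dc:title', 'Sample title'), ('dcterms:created', '2004-08-10')]
```

The local x:thumbnail element is simply skipped, which is the behaviour argued for above: the harvester keeps what it understands and ignores the rest.
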
> And I don't see why you couldn't write an XSLT to crosswalk from qualified
> DC to MARC that could be applied not only to records of "pure" qualified
> DC, but also to OLAC DC, NSDL DC, or CDP DC. Obviously such a generic
> crosswalk would drop local encoding schemes like olac:linguistic-type,
> olac:linguistic-field, and nsdl:GEM, and refinements like
> cdp:holdingInstitutions and cdp:thumbnailIdentifier, but for many purposes
> that would be okay, especially if the XSLT were smart enough to take
> advantage of any xs:substitutionGroup information contained in the XSDs
> referenced by the instances (e.g., so as to know that
> cdp:thumbnailIdentifier was a refinement of dc:identifier).
> 
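
As a rough illustration of the substitutionGroup idea above, the following Python sketch reads an XSD, records which element each global element declares as its substitutionGroup head, and follows those links until it reaches an element in the dc namespace. The schema fragment and the http://example.org/cdp/ namespace are hypothetical stand-ins, not the real CDP schema, and the sketch assumes that namespace prefixes are declared once and never redefined deeper in the schema.

```python
import io
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
DC = "http://purl.org/dc/elements/1.1/"


def substitution_heads(xsd_bytes):
    """Map each element '{ns}name' declared with a substitutionGroup to the
    '{ns}name' of its head, resolving prefixes declared in the XSD."""
    prefixes, heads, target_ns = {}, {}, ""
    events = ("start-ns", "start")
    for event, item in ET.iterparse(io.BytesIO(xsd_bytes), events=events):
        if event == "start-ns":
            prefix, uri = item  # namespace declaration: (prefix, uri)
            prefixes[prefix] = uri
        elif item.tag == f"{{{XS}}}schema":
            target_ns = item.get("targetNamespace", "")
        elif item.tag == f"{{{XS}}}element" and item.get("substitutionGroup"):
            prefix, _, local = item.get("substitutionGroup").rpartition(":")
            own = f"{{{target_ns}}}{item.get('name')}"
            heads[own] = f"{{{prefixes.get(prefix, '')}}}{local}"
    return heads


def resolve_to_dc(tag, heads):
    """Follow substitutionGroup links until a dc element is reached.
    Returns None if the chain ends (or loops) outside the dc namespace."""
    seen = set()
    while tag not in seen:
        if tag.startswith(f"{{{DC}}}"):
            return tag
        seen.add(tag)
        tag = heads.get(tag)
        if tag is None:
            return None
    return None


# Hypothetical, heavily trimmed schema; the namespace URI is invented.
SAMPLE_XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    targetNamespace="http://example.org/cdp/">
  <xs:import namespace="http://purl.org/dc/elements/1.1/"/>
  <xs:element name="thumbnailIdentifier" substitutionGroup="dc:identifier"/>
</xs:schema>"""

heads = substitution_heads(SAMPLE_XSD)
print(resolve_to_dc("{http://example.org/cdp/}thumbnailIdentifier", heads))
# {http://purl.org/dc/elements/1.1/}identifier
```

A generic crosswalk could call resolve_to_dc on each element it does not recognise and either map it as the dc element returned or drop it when the result is None.
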
> It may be a little more work, but given the actual trend to date of data
> providers wanting to augment qualified DC with local semantics, I think
> we'll need to build our applications smart enough to deal with such
> diversity. And if we do that, it doesn't really matter if there is a
> multiplicity of top-level container schemas for qualified DC (as long as
> they all reference the appropriate DCMI component namespaces).
> 
> Tim Cole
> University of Illinois at UC
> 
> -----Original Message-----
> From: oai-implementers-bounces at openarchives.org
> [mailto:oai-implementers-bounces at openarchives.org] On Behalf Of Young,Jeff
> Sent: Tuesday, August 10, 2004 8:51 AM
> To: Simeon Warner; oai-implementers at oaisrv.nsdl.cornell.edu
> Subject: RE: [OAI-implementers] Qualified Dublin Core
> 
> I agree with Simeon. Lately I've been creating dynamically configured web
> applications built with distributed independent web services (OAI and SRW
> in particular). The more schemas and protocols they have in common, the
> more magic can happen. DCQ elements hidden behind differing namespace
> containers and schemas would greatly diminish their value.
> 
> For example, I am working with Jean Godby on a catalog of XSLT crosswalks
> (http://errol.oclc.org/schemaTrans.oclc.org.search). Redundant namespaces
> will only clutter it up and make it harder for people to choose an
> appropriate crosswalk from a list when they need one.
> 
> Jeff
> 
> > -----Original Message-----
> > From: Simeon Warner [mailto:simeon at cs.cornell.edu]
> > Sent: Monday, August 09, 2004 7:02 PM
> > To: oai-implementers at oaisrv.nsdl.cornell.edu
> > Subject: RE: [OAI-implementers] Qualified Dublin Core
> >
> >
> > I think the problem with "standard elements in any wrapper" approach
> > is that a harvester has no easy way to know up-front what it might be
> > getting if it harvests records in a particular format. Harvesting a
> > metadata format understood to be "only QDC elements" (nothing else, no
> > funny business) gives rather more assurance of intelligibility to a
> > harvester that understands QDC. A canonical schema seems the simplest
> > way to indicate this (notwithstanding versioning issues mentioned
> > earlier by Pete Johnston).
> >
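
Without a canonical schema, a harvester can still test the "only QDC elements" property record by record. The Python sketch below lists every descendant element that falls outside the dc and dcterms namespaces; an empty list means the record is "pure" in the sense used above. Both sample records and their example.org namespaces are invented, and the check looks at elements only, so a local encoding scheme named in an xsi:type attribute would go unnoticed.

```python
import xml.etree.ElementTree as ET

QDC_NAMESPACES = {
    "http://purl.org/dc/elements/1.1/",
    "http://purl.org/dc/terms/",
}


def foreign_elements(record_xml):
    """Return the tags of descendant elements outside the dc/dcterms
    namespaces. The container (root) element itself is not checked."""
    root = ET.fromstring(record_xml)
    foreign = []
    for elem in root.iter():
        if elem is root:
            continue
        uri = elem.tag[1:].split("}", 1)[0] if elem.tag.startswith("{") else ""
        if uri not in QDC_NAMESPACES:
            foreign.append(elem.tag)
    return foreign


def is_pure_qdc(record_xml):
    """True when every element inside the container is dc or dcterms."""
    return not foreign_elements(record_xml)


# Hypothetical records; container and local namespaces are made up.
PURE = """<r xmlns="http://example.org/container/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Sample title</dc:title>
</r>"""

AUGMENTED = """<r xmlns="http://example.org/container/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:x="http://example.org/local/">
  <dc:title>Sample title</dc:title>
  <x:thumbnail>http://example.org/t.jpg</x:thumbnail>
</r>"""

print(is_pure_qdc(PURE))            # True
print(foreign_elements(AUGMENTED))  # ['{http://example.org/local/}thumbnail']
```

This only gives the assurance after the records have been fetched, which is the cost relative to knowing up front from the metadata format.
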
> > Cheers,
> > Simeon
> >
> > On Mon, 9 Aug 2004, Timothy W. Cole wrote:
> > > Jeff-
> > >
> > > My take is that OAI shouldn't want to get back in the business of
> > > hosting schemas or namespaces for metadata formats. We went to some
> > > trouble to get away from that when transitioning from 1.1 to 2.0. A
> > > blessed application namespace for qualified DC should be left up to
> > > the DCMI.
> > >
> > > While DCMI decides how they want to handle things (and I know they
> > > won't be quick), solutions like the one at CCLRC (we've done much the
> > > same here for a couple of projects) and NSDL are fine. Any
> > > namespace-aware OAI application should ignore the locally created
> > > namespace and home in on the dc, dcterms, and dcmitype namespaces and
> > > thereby be able to use those elements without any problems. If your
> > > OAI service provider respects namespaces, it shouldn't matter what
> > > namespace the container element is in -- that's why the XML Schemas
> > > posted on the DCMI Website were done that way.
> > >
> > > Isn't that good enough for harvesting purposes? Or am I missing a
> > > subtle consideration that requires a canonical namespace for the
> > > container element?
> > >
> > > Tim Cole
> > > University of Illinois at UC
> > >
> > > -----Original Message-----
> > > From: oai-implementers-bounces at openarchives.org
> > > [mailto:oai-implementers-bounces at openarchives.org] On Behalf Of
> > Young,Jeff
> > > Sent: Monday, August 09, 2004 10:33 AM
> > > To: Mascord, M (Matthew) ; Young,Jeff;
> > > oai-implementers at oaisrv.nsdl.cornell.edu
> > > Cc: LeVan,Ralph; Hickey,Thom
> > > Subject: RE: [OAI-implementers] Qualified Dublin Core
> > >
> > > Yes, this is the problem. Now I have two to choose from: this one, and
> > > the one created by NSDL. I'm sure there are others out there. For the
> > > sake of interoperability, it seems to me that the OAI community should
> > > bless (and host?) such an "application profile" schema.
> > >
> > > Jeff
> > >
> > > > -----Original Message-----
> > > > From: Mascord, M (Matthew) [mailto:M.Mascord at rl.ac.uk]
> > > > Sent: Monday, August 09, 2004 11:21 AM
> > > > To: 'Young,Jeff'; oai-implementers at oaisrv.nsdl.cornell.edu
> > > > Cc: LeVan,Ralph; Hickey,Thom
> > > > Subject: RE: [OAI-implementers] Qualified Dublin Core
> > > >
> > > > Hi -
> > > >
> > > > I am the developer of an OAI compatible institutional repository
> > > > for the UK research council CCLRC.  The URL is
> > > > http://epubs.cclrc.ac.uk.
> > > > We are attempting to capture & make publicly accessible any
> > > > scientific research that has benefitted from the use of CCLRC's
> > > > facilities or expertise.  We recently went live on the OAI
> > > > repository network:
> > > > http://epubs.cclrc.ac.uk/oai?verb=Identify.
> > > >
> > > > We provide metadata in both Simple and Qualified Dublin Core but
> > > > had the same problem as you in finding an authoritative XML schema
> > > > for Qualified Dublin Core.  In the end we created our own that
> > > > includes the schema defined at
> > > > http://dublincore.org/schemas/xmls/qdc/2003/04/02/qualifieddc.xsd
> > > > and described in the Dublin Core Note at
> > > > http://dublincore.org/schemas/xmls/qdc/2003/04/02/notes/.  This
> > > > defines a container element into which elements from the dcterms
> > > > and dc namespaces may be placed.
> > > >
> > > > I'm not sure if this is the best approach so would appreciate any
> > > > feedback on this.  Our OAI implementation can be tested at
> > > > http://epubs.cclrc.ac.uk/oaitest.
> > > >
> > > > Kind Regards,
> > > > Matthew Mascord
> > > > e-Library Software Developer, CCLRC, UK
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: oai-implementers-bounces at openarchives.org
> > > > > [mailto:oai-implementers-bounces at openarchives.org]On Behalf Of
> > > > > Young,Jeff
> > > > > Sent: 09 August 2004 16:03
> > > > > To: oai-implementers at oaisrv.nsdl.cornell.edu
> > > > > Cc: LeVan,Ralph; Hickey,Thom
> > > > > Subject: [OAI-implementers] Qualified Dublin Core
> > > > >
> > > > >
> > > > > I'm looking for an XML Schema for Qualified Dublin Core for use
> > > > > in OAI repositories. I poked around the UIUC OAI Registry, but
> > > > > all I found was a couple of ad hoc schemas used by repositories
> > > > > that appear to be defunct.
> > > > > Ideally, though, the existence and use of such a schema should
> > > > > be shared across a broad community and not ad hoc.
> > > > >
> > > > > Next, I searched in Google and OAForum but all I found was a
> > > > > reference to a preliminary effort to establish such a schema
> > > > > (http://www.ukoln.ac.uk/metadata/dcmi/xmlschema/). This
> > > > > particular document discusses a sample application schema for a
> > > > > DCQ container, but the implication is that the final schema must
> > > > > be decided by the specific application (e.g. OAI?). Apparently,
> > > > > this has never been done.
> > > > >
> > > > > Can someone provide some guidance for doing this?
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Jeff
> > > > >
> > > > > ---
> > > > > Jeffrey A. Young
> > > > > Software Architect
> > > > > Office of Research, Mail Code 710 OCLC Online Computer Library
> > > > > Center, Inc.
> > > > > 6565 Frantz Rd.
> > > > > Dublin, OH 43017-3395
> > > > > www.oclc.org
> > > > >
> > > > > Voice: 614-764-4342
> > > > > Voice: 800-848-5878, ext. 4342
> > > > > Fax: 614-718-7477
> > > > > Email: jyoung at oclc.org
> > > > >
> > > > > _______________________________________________
> > > > > OAI-implementers mailing list
> > > > > List information, archives, preferences and to unsubscribe:
> > > > > http://openarchives.org/mailman/listinfo/oai-implementers
> > > > >


