[OAI-implementers] XSD file for qualified DC

Carl Lagoze lagoze@cs.cornell.edu
Thu, 20 Jun 2002 07:58:01 -0400


Ann,

Thanks for the clarifications here.  Yes, I understand the overloading
of the term "citation".  My colleague Donna Bergmark here at Cornell, in:

Bergmark, D. and Lagoze, C., "An Architecture for Automatic Reference
Linking," presented at 5th European Conference on Research and Advanced
Technology for Digital Libraries, Darmstadt, Germany, 2001,

was much more systematic in calling the in-links in the link graph
"citations" and the out-links "references".  In that sense we should
really talk about "citation data" for your category one below (the
bibliographic information for the resource itself) and "reference data"
for your category two below (bibliographic information for the resources
referenced by the resource).

Using this terminology I think we all agree that putting reference data
into Dublin Core is not right.  This is very much a "one-to-one"
violation in that it would involve putting metadata about another
resource into the metadata container of some source resource.  Thus,
there is a clear case for some parallel metadata format to expose
the reference data, probably following the OpenURL, bison-fute concepts
that Herbert has outlined.

Turning attention to the citation data issue, I will argue equally
strongly that slotting it into the dc identifier element is
inappropriate.  Citation data is implicitly structured, whereas dc
elements should be simply "appropriate literals" as defined by Tom
Baker.  Playing a syntactic trick and serializing that data into an
"appropriate literal" through the use of punctuation, such as "Library
and Information Science Research 22(3), 311-338 (2000)" as you suggest
in http://epub.mimas.ac.uk/DC/citproposal.html, seems ill-advised with
data that screams out for markup such as:

<citation>
	<journalTitle>Library and Information Science Research</journalTitle>
	<journalVolume>22</journalVolume>
....
</citation>
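
To make the contrast concrete, here is a minimal Python sketch, not
anything from the proposal itself.  The regular expression and its group
names (issue, spage, epage, year) are my own assumptions for
illustration; only journalTitle and journalVolume come from the markup
above.  It suggests that the punctuated literal can be picked apart only
by code that knows one exact punctuation convention, whereas the marked-up
form can be read without guessing.

```python
import re
import xml.etree.ElementTree as ET

# The punctuated literal quoted from the proposal.
literal = "Library and Information Science Research 22(3), 311-338 (2000)"

# Hypothetical pattern: it fits this one punctuation convention only.
PATTERN = re.compile(
    r"^(?P<title>.+?) (?P<volume>\d+)\((?P<issue>\d+)\), "
    r"(?P<spage>\d+)-(?P<epage>\d+) \((?P<year>\d{4})\)$"
)


def parse_literal(text):
    """Return the citation parts as a dict, or None if the text does not fit."""
    match = PATTERN.match(text)
    return match.groupdict() if match else None


# Structured form, limited to the two elements shown in the message.
structured = """<citation>
  <journalTitle>Library and Information Science Research</journalTitle>
  <journalVolume>22</journalVolume>
</citation>"""


def parse_structured(xml_text):
    """Map each child element name to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


from_literal = parse_literal(literal)
from_markup = parse_structured(structured)

# A differently punctuated rendering of the same citation defeats the pattern.
variant = parse_literal(
    "Library and Information Science Research, vol. 22, no. 3 (2000)"
)
```

Here from_literal and from_markup agree on the title and volume, while
variant is None: the literal form works only as long as every provider
punctuates identically.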

Since this explicit structure is not currently allowed in DC (and I
question whether it ever should be), and given that OAI-PMH is
quite happy expressing parallel structured forms, it might be time to
write the schema for such citation data and encourage people to expose
it for harvesting, without characterizing it as "Dublin Core".
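
As a rough sketch of what "parallel" means to a harvester, the Python
below builds two OAI-PMH GetRecord requests for the same item, differing
only in metadataPrefix.  The base URL, the item identifier, and the
"citation" prefix are all hypothetical; "oai_dc" is the prefix OAI-PMH
uses for unqualified Dublin Core.

```python
from urllib.parse import urlencode

# Hypothetical repository and item, for illustration only.
BASE_URL = "http://repository.example.org/oai"
ITEM_ID = "oai:repository.example.org:article-1"


def get_record_url(base_url, identifier, metadata_prefix):
    """Build a GetRecord request for one item in one metadata format."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Same item, two parallel records: Dublin Core and structured citation data.
dc_url = get_record_url(BASE_URL, ITEM_ID, "oai_dc")
citation_url = get_record_url(BASE_URL, ITEM_ID, "citation")  # assumed prefix
```

The item identifier is the same in both requests, so the structured
citation record sits alongside the DC record rather than being folded
into it.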

Carl

> 
> On Wed, 19 Jun 2002, Ann Apps wrote:
> 
> > Herbert,
> >
> >
> > I agree entirely with your suggestion about using OpenURL as a 
> > parallel metadata format.
> >
> >
> > However, as the question which started this was about qualified DC, I
> > would like to point out that there may be some confusion about the
> > meaning of 'citation', especially about the DC-Citation stuff, which
> > has also been referred to as connected with OpenURL by the Ariadne
> > paper (http://www.ariadne.ac.uk/issue27/metadata/). A confusion which
> > probably wasn't helped by my earlier email.
> >
> >
> > The term 'citation' is used to describe 2 similar but different
> > things. It is easiest to describe this for journal articles.
> >
> >
> > 1. The bibliographic citation information (journal, issue, pagination)
> > for an article as part of the metadata for the article itself. This is
> > what publishers refer to as the header information for the article.
> >
> >
> > 2. The citation information for papers cited by an article which are
> > listed in the references section of the article.
> >
> >
> > The DC-Citation work is, so far, about (1). Maybe the choice of the
> > term 'citation' was unfortunate, because everyone assumes it means
> > (2), but it's difficult to think of a better word. This is why the
> > encoding suggested for dc-citation is within a dc:identifier element,
> > because of the recognition that the bibliographic citation can
> > effectively identify the article. [This could obviously be
> > extrapolated to (2) but would be within a
> > dc:relation/dcterms:references element.]
> >
> >
> > The scenario you describe is for citation (2). Here the parallel 
> > metadata format within a context object you describe looks perfect. 
> > This is obviously a major OAI requirement, for initiatives such as 
> > Citebase.
> >
> >
> > But I think that citation (1) will also be needed as OAI is used for
> > more than just eprints repositories. For instance, if you wanted to
> > provide OAI records from an A+I database, or a journal article table
> > of contents database, you would need to be able to detail the
> > journal/issue information within each record. I could see this being
> > of use for harvesting records for the latest journal issues available
> > in such a service. I think you can still use the OpenURL metadata for
> > this but that it would be 'nested' within the DC record, similar to
> > the noddy example I previously wrote. At the moment we're still stuck
> > with using unrecognised DC structured values in literal strings within
> > simple DC to pass this information around.
> >
> >
> > But at present, I think that the OAI priority is citations(2), and 
> > this current development looks really promising. Citations(1) will 
> > need more discussion within DC.
> >
> >
> > Best wishes,
> >
> > 	Ann
> >
> >
> >
> > On Tue, 18 Jun 2002 herbert van de sompel wrote:
> >
> >
> > > 1. In the context of the OAI-PMH, it would make a lot of sense to
> > > treat citations as a parallel metadata format.  The unqualified DC
> > > record describes the "paper", whereas another record (under the same
> > > item) describes all the citations made in the "paper".  That is what
> > > Carl suggested in his mail.  And that is the approach that Stevan
> > > Harnad and I discussed at last year's OAI-related conference in
> > > Geneva.  This approach makes sense in that it is extensible: it
> > > allows other stuff related to the "paper" (for instance usage logs,
> > > certification metadata, preservation metadata, etc.) to be treated
> > > in yet other parallel records under the same item.
> > >
> > > 2. When it comes to choosing a "metadata format" to describe those
> > > citations, looking at OpenURL makes a lot of sense.  Not only
> > > because it is becoming a standard, but because its purpose really IS
> > > to describe stuff (read "citations" in this context) by building on a
> > > broad range of identifier-namespaces and a multitude of metadata
> > > formats.  Moreover, OpenURL allows not only for the description of a
> > > "citation" but (optionally) also of entities that make up the
> > > context in which the "citation" appears.  That is very significant
> > > when thinking about the possibility of open linking at the level of
> > > OAI service providers. And it is significant when thinking of using
> > > "OpenURL" as a parallel metadata format, as it allows the citation
> > > to remain attached to the thing in which it is cited.
> > >
> > [...]
> >
> >
> > --------------------------------------------------------------------------
> > Mrs. Ann Apps. Senior Analyst - Research & Development, MIMAS,
> >      University of Manchester, Oxford Road, Manchester, M13 9PL, UK
> > Tel: +44 (0) 161 275 6039    Fax: +44 (0) 0161 275 6040
> > Email: ann.apps@man.ac.uk  WWW: http://epub.mimas.ac.uk/ann.html
> > --------------------------------------------------------------------------
> > _______________________________________________
> > OAI-implementers mailing list
> > OAI-implementers@oaisrv.nsdl.cornell.edu
> > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers
> >