[OAI-implementers] RE: RDF, OAI, and application within Libraries

lagoze@cs.cornell.edu lagoze@cs.cornell.edu
Fri, 18 May 2001 07:44:32 -0400


Dave,

A few brief comments.  Please don't overly interpret my comments as
"OAI's positioning".  I am but one of several voices in the OAI
"think-tank" ;-)  Certainly Herbert Van de Sompel's thinking on this is
as important as mine - I'm confident that he agrees with my broad points
but am uncomfortable having my exact words taken as the OAI gospel.  

That aside, I think that you and I are pretty much on the same page
about all this.  Especially about the long-term need to support
semi-structured data formats and the incompatibility of the basic goal of
XML Schema with that.  Ditto for the importance of RDF in all of
this.  The big problem, as I see it, is getting people to even
understand metadata at all and why any structuring of data is important.
It's been amazing for us at Cornell who are involved in the National
Science Digital Library project to see how many respectable content
providers out there simply don't understand the basics about metadata.
My personal experience within DCMI is that mixing this up with scary
things like nodes, arcs, namespaces, etc. defeats the goal.  My personal
philosophy, which may be wrong, is that the best way to get the message
out to as many neophytes as possible is via a simple cookbook rather
than a list of options.  Others may differ on this approach.

Regarding your data/service provider discussion.  Yes, I agree that an
individual party can play both roles, as you point out.  In fact, in our
above-mentioned NSDL project we are doing exactly what you are saying -
pulling in, or using heuristics to extract, very sloppy metadata from
"level 0" data providers, processing it with computers and humans, and
then exposing it via OAI as "normalized" metadata.  This two-tier structure
of data providers is very interesting to us in the sense that it
provides a hierarchy for metadata enhancement - certainly an environment
in which RDF plays an important role.

Carl

> -----Original Message-----
> From: Dave Reynolds [mailto:der@hplb.hpl.hp.com]
> Sent: Thursday, May 17, 2001 12:54 PM
> To: lagoze@CS.Cornell.EDU
> Cc: pbreton@mit.edu; bass@mit.edu; www-rdf-dspace@w3.org;
> bwm@hplb.hpl.hp.com; oai-implementers@oaisrv.nsdl.cornell.edu
> Subject: Re: RDF, OAI, and application within Libraries
> 
> 
> Carl,
> 
> Many thanks for this - it helps me understand OAI's 
> positioning much better.
> 
> Let me paraphrase and respond to your points.
> 
> > [1] It is plausible to support non XML Schema means of 
> validating metadata
> passed back from OAI queries but [2] too many options actually limit
> flexibility by being too complex.
> 
> Agreed. You are right there are a lot of alternatives (Relax, 
> Schematron,
> RDFS, DAML ...) and supporting everything is just too complex 
> and costly. In
> my naive view of the world there are two useful extremes to support.
> 
> Firstly, one wants a tightly constrained validatable format suited to
> typical metadata records. For that, XML Schema seems entirely 
> appropriate
> and it's not obvious that going further and supporting Relax 
> or whatever
> adds much.
> 
> Secondly, I believe that for future proofing we also need to 
> support less
> formal semi-structured data formats. Semi-structured data 
> arises in many
> ways - merging of multiple data sources, sparse user 
> annotations, rapidly
> evolving schemas specific to given communities. In my simple 
> view of the
> world RDF is a good foundation for a wide variety of 
> semi-structured data so
> supporting that would be a useful complement to the highly structured
> formats.
> 
> If this meant changing OAI to support an extra format I can 
> see that as an
> extra complication. However, by using the sort of shallow (XML Schema
> compatible) encoding we were discussing in this thread it 
> seems like RDF
> could be supported as an incremental addition, within the 
> current framework,
> and need not complicate implementations. There is the issue of using
> RDFS/DAML schemas  for interpreting the RDF when you get it. 
> However, even
> these need not affect the OAI protocol since they are (a) 
> somewhat optional,
> (b) can sometimes be inferred out of band e.g. from the 
> namespaces used in
> the RDF properties, (c) could be included in the RDF payload.
> 
> To me these two extremes are sufficient between them to cover 
> most needs.
> 
> >   An outstanding
> > question for me is to understand where RDF sits in the OAI data
> > provider and service provider dichotomy.
> 
> An excellent point. Perhaps I am confused on the distinction 
> between data
> providers and service providers but I can see repositories like DSpace
> wanting to combine metadata from multiple sources and "smush" 
> them together
> and also to support less structured metadata such as user or 
> small community
> annotations. Their data provider role makes it appropriate 
> for them to use
> OAI to export metadata access but they also have attributes 
> of a service
> provider, or at least a clearing house for various metadata, 
> and so have
> some role for these flexible semi-structured metadata formats.
> 
> Dave
> 
> 
>