[UPS] Problems/Comments with Santa Fe Metadata Set

Michael L. Nelson mln@blearg.larc.nasa.gov
Wed, 17 Nov 1999 15:07:19 -0500 (EST)


I agree with most of Mark Doyle's points.  I think the whole issue can be
summed up with what is in Carl's draft:

	- the display id field is mandatory and repeatable

1. It is mandatory because of the intent: given that this metadata will
travel far and wide, there must be a link back to some reasonable
representation of the "data" (skipping the discussion of the semantics
of "data" and "metadata").  

2. It is repeatable because there may be multiple unique pointers to the
data.  I can use:

http://xxx.lanl.gov/abs/cs.CL/9911006
	which provides access to a variety of formats 

http://cs-tr.cs.cornell.edu:80/Dienst/UI/1.0/Display/xxx.cs.CL/9911006
	which provides access to 2 formats
	
http://cs-tr.cs.cornell.edu:80/Dienst/UI/2.0/Describe/xxx.cs.CL/9911006
	which, surprisingly, provides access to 0 formats.

As Mark says:

>but each arXiv has a very good sense of which URL is
>potentially the most useful to end users

Which of the above 3 URLs one would get in the metadata probably depends
on where one harvested from: LANL or Cornell.
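
To make "mandatory and repeatable" concrete, here is a rough sketch in
Python.  The class and field names (Record, object_ids, display_ids) are
my own placeholders and are not taken from Carl's draft; the only point
is that a record must carry at least one display id and may carry
several, and that the list can differ by harvesting source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    """Hypothetical harvested record; names are illustrative only."""
    object_ids: List[str]
    display_ids: List[str]

    def __post_init__(self) -> None:
        # Mandatory: there must be a link back to some representation
        # of the data.
        if not self.display_ids:
            raise ValueError("at least one display id is required")


# Repeatable: the same paper, as it might look from two different archives.
lanl = Record(
    object_ids=["xxx.cs.CL/9911006"],
    display_ids=["http://xxx.lanl.gov/abs/cs.CL/9911006"],
)
cornell = Record(
    object_ids=["xxx.cs.CL/9911006"],
    display_ids=[
        "http://cs-tr.cs.cornell.edu:80/Dienst/UI/1.0/Display/xxx.cs.CL/9911006",
        "http://cs-tr.cs.cornell.edu:80/Dienst/UI/2.0/Describe/xxx.cs.CL/9911006",
    ],
)
```

Both records describe the same object; only the archive's preferred
pointers differ.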

Carl writes:

> The display ID metadata element presumes that not only does the repository
> or digital object know about these URLs but endows one with the property of
> being the "correct" one (a rather wrong concept since the display ID for an
> Italian audience should be different than for a US audience).

to which I join Mark in strongly disagreeing.  It's not a matter of which
is the "correct" URL, but rather which is the "favorite" or
"best-guess" URL.  The manner in which an archive computes, constructs or
otherwise arrives at this URL (or URLs) doesn't really matter.  

If we want, for example, to provide additional interfaces to the data
(e.g. foreign language, subject specific, site specific) we use the Object
ID field (which is also repeatable).  To continue with the above example,
the Object Id field in the metadata would be "xxx.cs.CL/9911006"  (and/or
"cs.CL/9911006", or a bibcode, or any number of discipline- or
archive-scoped unique ids).  If my DL harvested this metadata, but wanted
to provide a value-added interface to the object, it would be built around
this Object Id and would be free to ignore the Display Id.  It is in this
area, IMO, that DLs will distinguish themselves from their "competitors"
in an environment where every DL can harvest from the same universal
corpus.
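
As a rough illustration of building around the Object Id, here is a
minimal Python sketch.  The dictionary keys, the function name, and the
base URL http://dl.example.org/view/ are all made up for the example; a
real DL would have its own record format and URL scheme.

```python
from typing import Dict, List


def value_added_url(record: Dict[str, List[str]],
                    base: str = "http://dl.example.org/view/") -> str:
    """Build this DL's own URL from the first Object Id.

    The harvested Display Ids are deliberately ignored.
    """
    object_ids = record.get("object_id", [])
    if not object_ids:
        raise ValueError("record has no object id")
    return base + object_ids[0]


# Hypothetical harvested record; the key names are assumptions.
record = {
    "object_id": ["xxx.cs.CL/9911006", "cs.CL/9911006"],
    "display_id": ["http://xxx.lanl.gov/abs/cs.CL/9911006"],
}

print(value_added_url(record))
```

The Display Id remains in the record as the archive's best guess, but
nothing obliges the harvesting DL to use it.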

In summary, I see this whole thing as a non-issue, and think that:

http://www.cs.cornell.edu/lagoze/External/UPS/SFMeta.htm

as it stands is sufficient.

regards,

Michael

---
Michael L. Nelson             
NASA Langley Research Center  m.l.nelson@larc.nasa.gov
MS 158, Hampton VA 23681      http://home.larc.nasa.gov/~mln/
+1 757 864 8511               +1 757 864 8342(fax)