[UPS] RE: uni- and multi-disciplinary settings on ePrints
Tue, 27 Jun 2000 13:04:44 +0100 (BST)
Carl's technical wisdom is most welcome. I repeat, the generic
archive-level subject categorization (partition) is meant to be
a "satisficing" approximation, not an exact and final classification.
The question is (in "minimalism-plus" terms) whether generic
open-archiving software is more likely to be useful and effective NOW
(in getting open archives to proliferate and fill) WITH or WITHOUT a
default, first-approximation set of subject-headers. I don't know the
answer for sure, but my guess is that it is better WITH.
The default subjects can be turned off, or replaced, if the
user/institution wishes (with the corresponding reduction in default
interoperability), but is there any advantage to not providing any
default minimal subject classification at all, for what is meant to be
generic open archive software for any and every discipline, right now?
[My own guess is that we are better off providing a default partition.]
Second, let us not forget that one extremely central function (though
not the only one) for institutional open archives will be
institutional-researcher SELF-archiving. For this, the optimal scenario
of an "information-specialist" doing the classification is not the
right model. There WILL be institutional and individual
SELF-classification for the subject matter of the papers in the
archives. Again, on a minimalism-plus philosophy, why withhold the
default partition, the possibility of turning it off, and the
possibility of adding to it -- just because an optimal means is not
within reach today?
Let authors/institutions self-archive, and self-classify their own
papers, and whatever shortfalls there are can be made up once the
archives achieve their primary and most important objective, which is
to proliferate and fill now!
I also like and admire Carl's philosophy of "virtual collections,"
providing different higher-level blends and compilations from the
overall corpus. But right now we are talking about the primary corpus
itself, and the default taxonomy it provides for describing
itself. Let us not (in the interests of "absolute minimalism") mute its
self-describing vocabulary because it is not optimal, and wait for
meta-level collections and services to furnish it instead; as a
compromise, the self-description fields can always be ignored by the
higher-level collection. But let there be a default set now.
The mother of all "collections" is the super-set of primary open
archives itself. That generic set is what we are trying to create,
using a "minimalism-plus" philosophy, avoiding both the Scylla of
Absolute Minimalism (offering less functionality than is feasible
today) and the Charybdis of Optimalism (which means holding back from
open archiving today, to await the ideal implementation some day).
On Tue, 27 Jun 2000, Carl Lagoze wrote:
> I have to jump in and say that IMHO there is a mistaken focus on
> partitions as the proper means of segmenting up the information space
> (of eprint archives, of OAI, of anything).
> First, a little history which I meant to say in San Antonio but never
> got around to it. The idea of partitions was a direct result of the
> beginning of our (NCSTRL) collaboration with LANL a few years ago. At
> that point LANL had (and still has) this, in my opinion, somewhat
> misguided and non-scalable legacy notion of fixed partitions in its
> archive. They wanted to make these partitions visible at the protocol
> level, and thus was born the notion of repository partitions in Dienst.
> (Paul and Simeon, please understand that I'm not meaning to disparage
> your work or arXiv. As noted below the partition concept is just fine
> for your application!)
> The Intention! -- These were and still are purely intended as a
> repository-local administrative convenience. Basically a very simple
> way of dividing up an individual repository and certainly not meant as a
> means for partitioning up some larger information space.
> I have never felt very comfortable with this whole idea esp. the way
> that it is implemented at LANL - e.g., authors decide which partition a
> paper should be placed in and users search within partitions. This works
> (maybe) just great in a closed and highly expert community such as those
> who use the LANL archive but breaks down badly in other communities.
> Esp. at the user end where searching within a partition makes little sense.
> Extending this notion across repositories/archives really starts to
> break down. We have seen this confusion in the OAI discussions. All of a
> sudden we're trying to figure out what is the right way to
> "universalize" partitions? What is the way of registering partitions?
> What do these partitions mean anyway?
> The Reality! -- There is no "right" way to partition information spaces
> (just as there is no "right" metadata). There are many ways to
> partition information spaces that are customized to different user
> groups. Furthermore, partitioning of information spaces is completely
> independent of archive location (e.g., the set of information in a
> partition may include some content from repository A, all from repository B,
> some from repository zz, etc.). So, mapping individual repository
> partitions to a global or even intranet cross-repository partitioning
> system breaks down due to 1) projecting local decisions to global
> decisions and 2) ignoring the fact that any one document in a repository
> should be able to exist in more than one global partition.
> Solution? -- Back in '98 I wrote
> http://www.dlib.org/dlib/november98/lagoze/11lagoze.html, in which I
> talked about a collection abstraction in distributed information spaces.
> At this point we implement such an idea in Dienst as the means of
> creating the NCSTRL collection that spans multiple repositories. Our
> implementation is imperfect but, I maintain, is on the right track. Over
> the next year we have funding and people to push this to the next and
> hopefully correct implementation that will allow organizations and
> institutions to create flexible collections that do (hopefully) scale
> across multiple repositories and make it possible to aggregate documents
> for multiple communities.
> For now, please let's not try to push the partition thing beyond its
> original goal of a merely repository-local administrative convenience.
> Finally, Rob and Stevan, please understand that I'm not trying to
> criticize the work you've done on your eprint software. I'm really
> looking forward to seeing it in action and working with you on the idea
> of overlaying more features of the Dienst protocol on it as we try to
> scale from individual archives to federated information spaces.
> > -----Original Message-----
> > From: Stevan Harnad [mailto:firstname.lastname@example.org]
> > Sent: Tuesday, June 27, 2000 6:56 AM
> > To: Robert Tansley
> > Cc: Eric F. Van de Velde; 'Stevan Harnad'; john.ober@UCOP.EDU;
> > ken.weiss@UCOP.EDU; Carl Lagoze; Ed Sponsler
> > Subject: uni- and multi-disciplinary settings on ePrints
> > Dear Eric, Ken et al:
> > The question of whether it will prove optimal for University Open
> > Archives to be pluridisciplinary or unidisciplinary can be settled by
> > actual practice.
> > The ePrints archiving software is designed to be useable either way:
> > Part of the local institution's parameter-setting and customization of
> > the generic ePrints software can amount to turning other disciplines
> > off if it is being used for just one department (or lab, or
> > researcher).
> > Also, there should be a generic spectrum of discipline partitions that
> > ePrints provides as a default (we are still looking for the optimal
> > default one to use, and recommendations are welcome!), and then these
> > can be added to. To preserve overall interoperability, it would be best
> > if such site-specific additions to an expanding open-partition space
> > could be percolated to all open-archives in some systematic way (but
> > this is a technical issue that exceeds my own technical grasp!: Carl?)
> > What is certain is that, again, the philosophy of "minimalism plus"
> > should prevail: We must not hold back, waiting for a final, ultimate,
> > optimal solution, requiring more complicated compliance by individuals.
> > Find an approximation that will "satisfice" to launch, fill, and bring
> > up-to-speed a large number of universities' open archives right now.
> > THEN the collective commitment that comes with having all those
> > institutions' intellectual goods already minimum-plus-functional in
> > the interoperable open-archives will ensure that the functionality
> > grows, and that the growth comes in an already-shared collective
> > convention.
> > So: "Satisficing" approximate partitions for now, optimizing for later,
> > once the open-archiving is irreversibly in motion.
> > Cheers, Stevan
> > On Tue, 27 Jun 2000, Robert Tansley wrote:
> > > "Eric F. Van de Velde" wrote:
> > > >
> > > > Stevan, Rob,
> > > > The tech guru for our preprint service (currently consisting of NCSTRL) is
> > > > Ed Sponsler. He is in today and tomorrow, but then takes a (well-deserved)
> > > > vacation. So, it may take a bit before we get into this.
> > > >
> > > > However, I believe we may have similar issues. Until now, the primary usage
> > > > of Dienst has been within the NCSTRL context. We are struggling with
> > > > decisions on how to implement a Caltech-wide cross-disciplinary archive.
> > > >
> > > > Do we really have only one Caltech-wide archive with partitions for
> > > > individual options (departments)? However, can these partitions easily
> > > > participate in disciplinary federations?
> > >
> > > This is an issue that hasn't fully been resolved by the open archives
> > > initiative, and is in fact the main issue I raised at the OA workshop in
> > > San Antonio. I will certainly be pushing to get this resolved.
> > >
> > > > Another option is to create a repository for each department and combine
> > > > them through federation into a Caltech repository.
> > >
> > > This does sound like a better option to me, as it would ease some of the
> > > difficulties involved in disciplinary federation (actually harvesting in
> > > the OA world.) Additionally, if individual archives are smaller, this
> > > does tend to improve their individual performance.
> > >
> > > As well as the departmental archives, you could quite easily have a
> > > Caltech "gateway" search engine, that could create an index covering all
> > > of the departmental archives, and search them all in a very efficient
> > > way. This separation of services (such as searching) and data provision
> > > brings many benefits.
> > >
> > > > Occasionally, even the option of creating a repository for every faculty
> > > > member is mulled over, because there are quite a number of
> > > > "independence-minded" faculty in this place.
> > > > Question though is whether the federations remain manageable under such a
> > > > scenario...
> > >
> > > You could allow each department a degree of freedom. For example, using
> > > the EPrints software, each department's archive could be given the
> > > departmental "look and feel", if they have one. Additionally the
> > > software allows each department to hold their own extra information
> > > about documents (for example, "funding body"). Provided each archive
> > > supports the open archives protocol, and provides the same central
> > > metadata, the distributed searches performed by the Caltech search
> > > gateway are not affected.
> > >
> > > R
> > >
> > > > --Eric.
> > > >
> > > > -----Original Message-----
> > > > From: Stevan Harnad [mailto:email@example.com]
> > > > Sent: Thursday, June 22, 2000 9:02 AM
> > > > To: Eric F. Van de Velde
> > > > Cc: Rob Tansley
> > > > Subject: RE: EPrints Software Beta (fwd)
> > > >
> > > > Hi Eric,
> > > >
> > > > The link will come to you shortly, from Rob Tansley. Meanwhile see:
> > > > http://www.eprints.org/software.html
> > > >
> > > > Chrs, Stevan
> > > >
> > > > On Thu, 22 Jun 2000, Eric F. Van de Velde wrote:
> > > >
> > > > > Stevan,
> > > > > I would definitely be interested to take a look at this. Did you mean to
> > > > > include a link in your e-mail? I did not find a link to the Beta on the
> > > > > cogprints site.
> > > > > --Eric.
> > > > >
> > >
> > > --
> > > Robert Tansley Tel: +44 (0) 23 80594492
> > > IAM Research Group Fax: +44 (0) 23 80592865
> > > Electronics & Computer Science http://www.ecs.soton.ac.uk/~rht96r/
> > > University of Southampton
> > > Southampton SO17 1BJ, UK
> > >
> UPS mail list
> Mail submissions to firstname.lastname@example.org
> To subscribe or unsubscribe visit http://vole.lanl.gov/mailman/listinfo/ups