[OAI-implementers] Re: OAI sets as new instances (Sets Proposal (from DLF))

Dr Robert Sanderson azaroth at liverpool.ac.uk
Mon Apr 25 16:25:36 EDT 2005

On Fri, 22 Apr 2005, Thomas G. Habing wrote:

> time articulating.  Perhaps the problem is that there are several different 
> issues with sets, and I'm not sure which of these we are really trying to 
> address.
> 1) The tendency of people to misunderstand sets as a sort of poor man's 
> search.

I think that moving the set name into the URL doesn't get rid of 
this, but it does lessen the tendency to think this way.  When it's a 
parameter in the query, it's easy to cram any arbitrary value in there. 
It's less intuitive to do this when the set name is part of the URL.
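
To make the contrast concrete, here is a rough Python sketch of the two 
request styles. The base URL, the set names and the two helper functions 
are made up for illustration, and the path layout is only an assumption 
about how a set-as-repository URL might look, not something the proposal 
fixes.

```python
from urllib.parse import urlencode


def param_style_url(base_url, set_spec):
    """Current style: the set is just another query parameter."""
    query = urlencode(
        {"verb": "ListRecords", "metadataPrefix": "oai_dc", "set": set_spec}
    )
    return f"{base_url}?{query}"


def path_style_url(base_url, set_path):
    """Assumed proposal style: the set name is part of the repository URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    return f"{base_url.rstrip('/')}/{set_path.strip('/')}?{query}"


# Hypothetical repository and set names.
print(param_style_url("http://example.org/oai", "physics:hep"))
print(path_style_url("http://example.org/oai", "physics/hep"))
```

In the first form any string can be dropped into the set argument; in the 
second, a client only has the URLs it has actually been given.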

> 2) Technical issues relating to how to signal that a record has been moved 
> out of a set, but has not been deleted from the repository.

This wasn't something I was thinking of when writing it up, but it does 
fall out neatly from the proposal -- you simply set them deleted in the 
set repository.
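
A minimal sketch of what that could look like, assuming each set 
repository keeps its own record headers. The Header class, the two 
dictionaries and the identifiers are invented for the example; only the 
status="deleted" attribute on the header comes from the existing protocol.

```python
from dataclasses import dataclass


@dataclass
class Header:
    identifier: str
    datestamp: str
    deleted: bool = False


def remove_from_set(set_repo, identifier, datestamp):
    """Mark a record deleted in one set repository only."""
    header = set_repo[identifier]
    header.deleted = True
    header.datestamp = datestamp  # so incremental harvests see the change
    return header


def render_header(header):
    """Render an OAI-PMH style header, flagging deleted records."""
    status = ' status="deleted"' if header.deleted else ""
    return (
        f"<header{status}><identifier>{header.identifier}</identifier>"
        f"<datestamp>{header.datestamp}</datestamp></header>"
    )


# Two hypothetical set repositories that both hold the same item.
physics = {"oai:example.org:1": Header("oai:example.org:1", "2005-04-01")}
maths = {"oai:example.org:1": Header("oai:example.org:1", "2005-04-01")}

remove_from_set(physics, "oai:example.org:1", "2005-04-25")
print(render_header(physics["oai:example.org:1"]))
print(render_header(maths["oai:example.org:1"]))
```

The item is reported as deleted to anyone harvesting the first set 
repository, while the second one still serves it unchanged.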

> 3) How best to describe a set: there is a technical description such as how 
> many items are in the set and what the updated frequency is.  There is also 
> the conceptual description, such as the records in this set are all described 
> by this subject heading, or they all belong to this "collection," or they all 
> have this publishing status.

The advantage here is that all of the best practices and schemas for the 
Identify verb become available for the set descriptions. What exactly 
to put in here is still in need of work, but I think it's a good start to 
allow the full Identify information.

> 4) Issues such as whether it's a good idea to have overlapping sets, flat 
> sets, hierarchical sets, and in which circumstances.

Whether it's a good idea? I'm not going to comment on that, beyond the 
point that there are hierarchical collections and sub-collections, so it's 
natural to describe these in a hierarchical tree of sets.
The main advantage here is that everything falls out neatly -- if you want 
a tree, then design your URLs to be a tree.  If you want overlapping, flat 
or any other design, then it's up to the design of the URL paths, not the 
protocol to try and fit all of the requirements.
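
As a small illustration, assuming each set repository is identified by a 
URL path: the paths, identifiers and the direct_children helper below are 
all hypothetical. The tree comes purely from the path prefixes, and 
overlap is just the same item turning up under more than one path.

```python
def direct_children(repo_paths, parent):
    """Return the paths exactly one level below `parent` ('' for the root)."""
    parent_parts = [p for p in parent.strip("/").split("/") if p]
    depth = len(parent_parts) + 1
    children = []
    for path in sorted(repo_paths):
        parts = path.strip("/").split("/")
        if len(parts) == depth and parts[: depth - 1] == parent_parts:
            children.append(path)
    return children


# Hypothetical layout: path of each set repository -> items it exposes.
repo_paths = {
    "physics": {"oai:example.org:1", "oai:example.org:2"},
    "physics/hep": {"oai:example.org:1"},
    "physics/optics": {"oai:example.org:2"},
    "theses": {"oai:example.org:2"},
}

print(direct_children(repo_paths, ""))
print(direct_children(repo_paths, "physics"))
# Overlap needs no protocol support: item 2 sits under two unrelated paths.
print(repo_paths["physics/optics"] & repo_paths["theses"])
```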

> 5) Variations in how different implementers have interpreted the OAI 
> "data model".

I don't think that the proposal addresses this.

> Briefly some of my misgivings:
> Does Rob's model place an excessive burden on data providers, or service 
> providers?

Data providers can handle this in at least two different ways -- either 
multiple instances of the script, or one server which handles everything.  
Multiple instances is easier than the status quo (no sets, no extra URLs).  
One server is roughly as hard as the status quo: depending on the 
underlying architecture it may be no more difficult, or it may be quite a 
bit harder (at which point, there's always the option of multiple 
instances of the server code).

For service providers, it should be easier, as they can simply follow the 
links in the <friends> section, rather than having to construct parameters 
from the ListSets response.
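
A rough sketch of the harvester side, assuming the set repositories are 
listed as baseURL entries in a friends container inside the Identify 
response. The sample XML, the URLs and the friend_base_urls helper are 
invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the OAI "friends" description container.
FRIENDS_NS = "http://www.openarchives.org/OAI/2.0/friends/"

# Hypothetical, trimmed-down Identify payload for a parent repository.
SAMPLE_IDENTIFY = """
<Identify xmlns="http://www.openarchives.org/OAI/2.0/">
  <repositoryName>Example Repository</repositoryName>
  <baseURL>http://example.org/oai</baseURL>
  <description>
    <friends xmlns="http://www.openarchives.org/OAI/2.0/friends/">
      <baseURL>http://example.org/oai/physics</baseURL>
      <baseURL>http://example.org/oai/maths</baseURL>
    </friends>
  </description>
</Identify>
"""


def friend_base_urls(identify_xml):
    """Collect the base URLs listed in any friends container."""
    root = ET.fromstring(identify_xml.strip())
    tag = f"{{{FRIENDS_NS}}}baseURL"
    return [el.text.strip() for el in root.iter(tag) if el.text]


for url in friend_base_urls(SAMPLE_IDENTIFY):
    print(url)  # each of these can be harvested like any other repository
```

The harvester never builds a set argument itself; it only follows the 
URLs the repository hands out.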

> Does it fundamentally alter the underlying data model of OAI, for better or 
> worse?  Previously, I think that items belonged to one or more sets, and 
> records were disseminations of these items in a specific format.  I think 
> Rob's model alters this to something like records being disseminations of 
> items within the context of those items being contained in a particular set.

Mmmmm. I have no real comment here.  There's nothing to prevent you from 
having different representations of the same object disseminated in 
different sets, but that's no different to today where some providers make 
sets available per record schema.

I think that's a best practice issue which should be addressed, but is 
mostly orthogonal to the proposal?


       ,'/:.          Dr Robert Sanderson (azaroth at liverpool.ac.uk)
     ,'-/::::.        http://www.csc.liv.ac.uk/~azaroth/
   ,'--/::(@)::.      Dept. of Computer Science, Room 805
,'---/::::::::::.    University of Liverpool
I L L U M I N A T I  Cheshire3 IR System:  http://www.cheshire3.org/
