[OAI-implementers] Sets in OAI-PMH and DSpace

Simeon Warner simeon@cs.cornell.edu
Tue, 21 Oct 2003 15:51:34 -0400 (EDT)

Interesting scenario Rob, and I think your description is sound. In the
OAI context the set structure really doesn't have a notion of set "C":
there are only sets "A:C" and "B:C", strict subsets of "A" and "B", which
happen to have the same content.
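
To make the hierarchical option concrete, here is a minimal sketch in Python of how the setSpec/setName rows in Rob's example could be generated. The `communities` dictionary and the `hierarchical_sets` function are made up for illustration; they are not DSpace or OAI-PMH APIs.

```python
# Hypothetical data model: community setSpec -> (name, list of collections).
# Collection C is listed under both communities, as in Rob's example.
communities = {
    "A": ("Community A", [("C", "Collection C")]),
    "B": ("Community B", [("C", "Collection C")]),
}


def hierarchical_sets(communities):
    """Return (setSpec, setName) rows, one top-level set per community
    and one colon-separated sub-set per collection inside it."""
    rows = []
    for com_spec, (com_name, collections) in communities.items():
        rows.append((com_spec, com_name))
        for col_spec, col_name in collections:
            # A collection in two communities gets two setSpecs.
            rows.append((f"{com_spec}:{col_spec}", col_name))
    return rows


for spec, name in hierarchical_sets(communities):
    print(f"{spec:<6}{name}")
```

Note that no row has the setSpec "C" on its own: the collection is only visible as "A:C" and "B:C".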

The other option is to put everything at the first level and have sets
"ComA", "ComB" and "ColC", where it just happens that all items in "ColC"
are also in "ComA" and "ComB". (I think this is what Thom meant by a
unique id for each collection.)
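
A rough sketch of how the two layouts differ for a harvester, assuming the usual OAI-PMH rule that a request for a set also matches records in its colon-separated sub-sets. The `in_set` helper and the two example items are hypothetical, written only to illustrate the comparison.

```python
def in_set(item_set_specs, requested):
    """True if an item carrying these setSpecs would match set=requested.

    A setSpec matches if it equals the requested one or is a descendant
    of it in the colon-separated hierarchy.
    """
    return any(
        spec == requested or spec.startswith(requested + ":")
        for spec in item_set_specs
    )


# The same item from Collection C under each layout (illustrative values).
hierarchical_item = ["A:C", "B:C"]
flat_item = ["ComA", "ComB", "ColC"]

print(in_set(hierarchical_item, "A"))    # matches via the sub-set A:C
print(in_set(hierarchical_item, "C"))    # no set "C" exists in this layout
print(in_set(flat_item, "ColC"))         # the collection is harvestable by one id
print(in_set(flat_item, "ComA"))         # membership must be listed explicitly
```

In the flat layout the collection can be harvested under a single setSpec, but each item has to list its community sets explicitly since nothing is implied by the setSpec itself.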

As you say, sets are for selective harvesting (not for classification per
se), so the choice should be motivated by the likely usefulness for


On Tue, 21 Oct 2003, Tansley, Robert wrote:
> > The set structure in OAI is very simple, but also has almost 
> > complete flexibility, so I'm sure you could encode any 
> > relationships that DSpace is aware of in them.  But the 
> > retrieval on sets is quite limited, so it isn't clear what 
> > good it would do.  
> > 
> > I agree with Hussein -- keep them simple.  For DSpace, 
> > possibly a unique ID for each collection.
> My understanding was that sets could be flat or hierarchical; presumably
> this means a strict hierarchy, i.e. no node could have >1 parent -- is
> this correct?  If so, DSpace could not expose the case where a
> Collection appears in two Communities, since the same Collection would
> have two setSpecs.  However, thinking about it, maybe this is actually
> OK, since that Collection would effectively be two OAI sets with two
> separate setSpecs; for selective harvesting purposes, harvesters don't
> necessarily need to know that the two sets are in fact the same
> Collection.
> Here's a quick example in case this isn't clear... Collection C is
> contained in Community A and Community B:
> Community A      Community B
>         \          /
>          \        /
>           \      /
>          Collection C
> The exposed OAI set structure would be:
> setSpec     setName
>  A          Community A
>  A:C        Collection C
>  B          Community B
>  B:C        Collection C
>
> Is there any reason why the above might be 'illegal' in OAI-PMH?  Might
> any harvesters get confused?
>
> P.S. sorry if cross-posting to dspace-tech & oai-implementers caused any
> duplication weirdness... I for one seemed to get about 6 copies of
> replies, I don't know whether that was the mailing lists or my client
> getting confused though!
>
>  Robert Tansley / Hewlett-Packard Laboratories / (+1) 617 551 7624
> _______________________________________________
> OAI-implementers mailing list
> List information, archives, preferences and to unsubscribe:
> http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers