[OAI-implementers] Problems with defining sets

Simeon Warner simeon@cs.cornell.edu
Wed, 16 Jan 2002 10:40:40 -0500 (EST)

On Wed, 16 Jan 2002, Akbar, Gul wrote:
> Would somebody be able to supply some information (or provide a link to some
> existing information) that explains how to set up sets. I have read the
> protocol document but it doesn't explain how sets are implemented.

The protocol document makes no comment about how sets might be
implemented by a data provider.

Within the protocol setSpecs are 'labels' which may be used to identify
groups of items within the repository. These labels may then be used to
restrict certain operations (ListIdentifiers and ListRecords) to those
items in a given set.

> As far as I can make out a set could just be a directory. However, if a
> record belongs in more than one set (eg the sets are grouped by institutions
> (as in the protocol) AND subject then there is the potential for records to
> be duplicated - making it difficult to ensure that all records about a
> particular resource are synchronised.

OAI-PMH does not define any 'subject' construct. One could implement that
with sets though.

Items may belong to zero, one or more sets. It would seem inadvisable
to use directories to internally represent sets if you want to allow items
to belong to more than one set unless it is a simple hierarchy. However,
that is an implementation decision and thus out of scope of OAI-PMH.
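
As an illustration only (this is outside the protocol, and the class name,
identifiers and setSpecs below are made up for the example), one way to let
an item belong to zero, one or more sets without duplicating it is to keep a
many-to-many mapping between item identifiers and setSpecs:

```python
from collections import defaultdict


class SetIndex:
    """Hypothetical many-to-many mapping between item identifiers and setSpecs."""

    def __init__(self):
        self._sets_of = defaultdict(set)   # identifier -> {setSpec, ...}
        self._items_in = defaultdict(set)  # setSpec -> {identifier, ...}

    def add(self, identifier, *set_specs):
        """Register an item; it may be given any number of setSpecs, including none."""
        self._sets_of[identifier].update(set_specs)
        for spec in set_specs:
            self._items_in[spec].add(identifier)

    def sets_of(self, identifier):
        """Return the setSpecs an item belongs to, sorted."""
        return sorted(self._sets_of.get(identifier, ()))

    def items_in(self, set_spec):
        """Return the identifiers of the items in one set, sorted."""
        return sorted(self._items_in.get(set_spec, ()))


# Made-up identifiers and setSpecs, for illustration only.
index = SetIndex()
index.add("oai:example.org:1", "institution:cornell", "subject:physics")
index.add("oai:example.org:2", "institution:cornell")
index.add("oai:example.org:3")  # belongs to no set
```

Each record is stored once and only the labels are shared, so there is
nothing to keep synchronised between copies.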

> The protocol states that the "repository's set hierarchy is represented in
> the protocol via setSpecs". However, it doesn't state how to define a
> setSpec. Any help on this topic would be greatly appreciated.

setSpec is defined at the top of section 2.5:

  setTag -- a non-space separated string of alphanumeric characters;
  setSpec -- a colon [:] separated list of setTags of each node on
  the path leading from a root element to the actual node;
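
A minimal sketch of that syntax in Python, assuming "alphanumeric" means
ASCII letters and digits (the function names are mine, not from the protocol
document):

```python
import re

# One setTag: alphanumeric characters only, no spaces.
# Assumption: "alphanumeric" is read as ASCII letters and digits.
SET_TAG = re.compile(r"[A-Za-z0-9]+")


def parse_set_spec(set_spec):
    """Split a setSpec into its setTags, raising ValueError if it is malformed."""
    tags = set_spec.split(":")
    for tag in tags:
        if not SET_TAG.fullmatch(tag):
            raise ValueError(f"invalid setTag {tag!r} in setSpec {set_spec!r}")
    return tags


def ancestors(set_spec):
    """Return the setSpec of every node on the path from the root, in order."""
    tags = parse_set_spec(set_spec)
    return [":".join(tags[:i]) for i in range(1, len(tags) + 1)]
```

For example, ancestors("physics:hep") gives ["physics", "physics:hep"], while
a setSpec containing a space or an empty setTag is rejected.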

This is really just a definition of syntax. "The actual meaning of a Set
or of the arrangement of Sets in a repository is not defined in the
protocol."

In my OAI tutorial (http://library.cern.ch/HEPLW/4/papers/3/) section 3.5
I wrote:

Sets are provided as an optional construct for grouping items to support
selective harvesting [protocol doc section 2.5]. It is not intended that
they should provide a mechanism by which a search is implemented, and
there is no controlled vocabulary for set names so automated
interpretation of set structure is not supported. It should be noted that
sets are optional both from the point of view of the data-provider - which
may or may not implement sets; and the service-provider - which may ignore
any set structure that is exposed. It is not clear whether sets will be
widely used and I shall not consider them further in this tutorial.
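
To show what selective harvesting by set looks like from the harvester's
side, here is a rough sketch that builds a ListRecords request restricted to
one set. The base URL and the setSpec are made up, and the helper name is
mine; verb, metadataPrefix and set are the request arguments themselves.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build a ListRecords request URL, optionally restricted to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Omitting the set argument asks for records regardless of set.
        params["set"] = set_spec
    # urlencode percent-escapes the colon in the setSpec as %3A.
    return base_url + "?" + urlencode(params)


# Hypothetical repository base URL and setSpec, for illustration only.
url = list_records_url("http://example.org/oai", "oai_dc", "physics:hep")
```

A service provider that ignores sets simply leaves set_spec out.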

Hope this helps,

> Thanks in advance,
> Gul Akbar.