[OAI-implementers] Re: SetSpec RegExp

Ed Sponsler eds@library.caltech.edu
Wed, 29 Aug 2001 09:57:34 -0700


How about generating the key based on the output of:

$ date +%s

(number of seconds since Jan. 1, 1970)?

And if the key needs to be globally unique, you could prefix it with the
repositoryID. This would keep the size manageable, keep the key completely
isolated from the user, and ensure its uniqueness.

Thus a hierarchy for the repository 'caltechASCI' may look like this:

SetName                           SetKey

Engineering                       caltechASCI:999103748
Engineering:Aeronautics           caltechASCI:999103748:999103786
Engineering:Applied Mathematics   caltechASCI:999103748:999103840
Physical Sciences & Mathematics   caltechASCI:999103868
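
A minimal Python sketch of this scheme, in case it helps. The function
name make_set_key and its parameters are my own invention for
illustration, not part of any existing tool. It assumes a child key is
formed by appending a new timestamp to its parent's key, as in the
table above. The timestamps are passed in explicitly here so the output
matches the table; normally you would let it default to the current
time.

```python
import time


def make_set_key(repository_id, parent_key=None, now=None):
    """Build a set key from epoch seconds (hypothetical helper).

    A top-level set gets '<repository_id>:<seconds>'; a child set gets
    '<parent_key>:<seconds>', where seconds is the creation time.
    """
    # Use the supplied timestamp if given, otherwise the current time.
    seconds = int(time.time() if now is None else now)
    prefix = repository_id if parent_key is None else parent_key
    return f"{prefix}:{seconds}"


engineering = make_set_key("caltechASCI", now=999103748)
aeronautics = make_set_key("caltechASCI", parent_key=engineering, now=999103786)
print(engineering)  # caltechASCI:999103748
print(aeronautics)  # caltechASCI:999103748:999103786
```

One thing to watch: two sets created under the same parent within the
same second would get the same key, so a real implementation would need
to check for that.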

=-=-=-==-==-
Ed Sponsler, Caltech Library System

> Date: Wed, 29 Aug 2001 02:09:11 +0100
> From: ePrints Support <support@eprints.org>
> To: simeon@cs.cornell.edu
> Cc: ePrints Support <support@eprints.org>,
>    OAI-implementers@oaisrv.nsdl.cornell.edu
> Subject: Re: [OAI-implementers] SetSpec RegExp
> 
> I was planning to make the eprints code able to use any
> field for OAI sets, not just "subjects" like it does now.
> 
> The intention is to be able to export, for example, each author
> as a set.
> 
> If you do set a maximum length, I'd hope it was pretty large,
> like over a "k". I'm not saying it needs to be, but arbitrary
> restrictions make me edgy.
> 
> Something like 4096 bytes (Is that the legal max for a URL?)
> would be more than enough. But I reckon more than enough is
> better than just enough. It is quite possible to imagine 
> someone using the MD5 of something as the set tags (or 
> whatever) and once you got 5 deep it would start to get 
> really long.
> 
> OK, I'm being overly paranoid, but I was brought up with
> people quoting the old "640k should be enough for anybody"
> story at me.
> 
> > On Tue, 28 Aug 2001, ePrints Support wrote:
> > > (if this message appears 3 times, sorry, I kept sending
> > > it from the wrong account)
> > >
> > > Argh. I've been working on a minor upgrade to eprints 1.1
> > > to bring it "up to code" with regard to OAI 1.1 and I just
> > > discovered that the SetSpec only allows a-zA-Z0-9 and : as
> > > a separator.
> > >
> > > Our standard default sets use '-' all over the place.
> > >
> > > I'm looking at encoding the setspecs as hex strings 0-9A-F
> > > so "A" is encoded as "41" etc. This way I can even use UTF-8
> > > which means I can do some very interesting things...
> > >
> > > This _will_ mean that people running eprints will have all
> > > their OAI setspecs change. But seeing as their current ones
> > > are illegal, that's not a big problem.
> > >
> > > A bigger problem is that where we currently have bio:bio-ani-behav
> > >
> > > we now have:
> > > 62696F:62696F2D616E692D6265686176
> > > which is less human-readable. Does that really matter as it's just
> > > a key?
> > >
> > > Comments please!
> > 
> > _______________________________________________
> > OAI-implementers mailing list
> > OAI-implementers@oaisrv.nsdl.cornell.edu
> > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers
> 
> -- 
> 
>  Christopher Gutteridge                   support@eprints.org 
>  ePrints Technical Support                +44 23 8059 4833
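
For comparison, the hex encoding proposed in the quoted message can be
sketched in a few lines of Python. This is only an illustration: the
function names encode_setspec and decode_setspec are hypothetical, and
it assumes each ':'-separated segment is encoded separately as the
uppercase hex of its UTF-8 bytes, which is consistent with the
bio:bio-ani-behav example quoted above.

```python
def encode_setspec(spec):
    """Hex-encode each ':'-separated segment of a setSpec (hypothetical helper)."""
    return ":".join(
        segment.encode("utf-8").hex().upper() for segment in spec.split(":")
    )


def decode_setspec(encoded):
    """Reverse encode_setspec, recovering the original segments."""
    return ":".join(
        bytes.fromhex(segment).decode("utf-8") for segment in encoded.split(":")
    )


encoded = encode_setspec("bio:bio-ani-behav")
print(encoded)                  # 62696F:62696F2D616E692D6265686176
print(decode_setspec(encoded))  # bio:bio-ani-behav
```

Every byte becomes two characters, so the encoded key is at least twice
as long as the original. The timestamp keys above stay at ten digits
per level regardless of how long the set name is.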