[OAI-implementers] User Specific Archive Access

Hussein Suleman hussein@cs.uct.ac.za
Wed, 23 Apr 2003 11:17:21 +0200


some late comments:

firstly, i think we should address a philosophical/theoretical issue 
that is hinted at by this problem. OAI-PMH is based on the premise that 
interoperability can best be promoted by shifting the "implementation 
burden" from the data providers to the service providers - making those 
who have a greater desire for interoperability pay the costs in terms of 
complexity. JSTOR appears to offer precisely the inverse scenario to the 
classical data/service provider split. if i am reading the thread 
correctly, JSTOR is the driving force behind this interoperability 
effort and, by my understanding, should therefore centrally handle the 
complexity and offer subscribing libraries the simplest possible 
interface (OAI-PMH?).

that said, in the ideal case, a subscribing institution should get a 
cohesive view of their subcollection, independently of other subscribers.

how could this work in practice?
- do you need virtual data providers? i am not sure this is necessary - 
you should be able to use IP-based or some other authentication to 
determine, transparently to the harvester, what data to make visible
- do you need to store additional data for each harvester? i hope not, 
as this will break some of the basic idempotence properties of OAI-PMH. 
if each record in your archive has "published" and "modified" dates, you 
could screen for accessible subsets on the basis of matching the 
published dates to subscription rules (on a per-access basis, of course), 
while allowing date-based harvesting on the basis of modification dates 
(with the provision that effective modification date = max (modification 
date, subscription date)) ... i hope this makes sense :)
- unsubscriptions are going to be tricky! if you expose metadata 
differently for different users, "deletions" may become a nightmare, so 
if possible i would suggest looking into not using the PMH's deletions.
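
the date logic in the list above can be sketched roughly as follows. this is only a minimal illustration under stated assumptions, not JSTOR's schema or any OAI-PMH library: the names Record, Subscription, effective_datestamp and list_records are hypothetical, and the subscription rule is assumed to be a simple fixed cutoff on the published date. visibility is computed on each request from the rules alone, so nothing is stored per harvester.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    identifier: str
    journal: str
    published: date  # publication date of the item
    modified: date   # last change to the metadata itself


@dataclass(frozen=True)
class Subscription:
    journal: str
    subscribed_on: date     # when this subscriber gained access
    published_before: date  # assumed rule: only items published before this date are visible


def effective_datestamp(record: Record, sub: Subscription) -> date:
    # modification = max(modification, subscription): a record can never look
    # older to a subscriber than the day they were first allowed to see it.
    return max(record.modified, sub.subscribed_on)


def list_records(
    records: Iterable[Record],
    subscriptions: Iterable[Subscription],
    from_date: Optional[date] = None,
    until_date: Optional[date] = None,
) -> List[Tuple[str, date]]:
    """Return (identifier, datestamp) pairs visible to one subscriber."""
    by_journal = {s.journal: s for s in subscriptions}
    visible = []
    for record in records:
        sub = by_journal.get(record.journal)
        # screen on the published date against the subscription rule
        if sub is None or record.published >= sub.published_before:
            continue
        stamp = effective_datestamp(record, sub)
        # date-based harvesting uses the effective datestamp
        if from_date is not None and stamp < from_date:
            continue
        if until_date is not None and stamp > until_date:
            continue
        visible.append((record.identifier, stamp))
    return visible
```

the result depends only on the record dates, the subscription rules and the request arguments, so repeating a request with the same arguments gives the same answer. unsubscriptions and deletions are deliberately left out of this sketch.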

in any event, i think it is doable with an appropriately structured 
database, with a not-too-complex set of subscription rules and without 
additional storage or per-harvester data.


Michael Krot wrote:
> Hi all,
> Thanks to all for your input - it is very nice to get worthwhile 
> feedback so quickly!  I realize that access restriction is beyond the 
> scope of the OAI standard, but it is unfortunately a messy part of life 
> over here. I'm not so interested in the how of access restriction, IP 
> recognition, controlling rights, etc. - these things I can handle.  The 
> interesting part to me is managing this huge metadata repository in such 
> a way as to provide the metadata I want a user to see given the 
> constraints of the OAI standard.
> I will try to address some of the questions you all have raised:
> 1)  Mr. Krichel made some comments about service providers and 
> subscriptions to JSTOR.  JSTOR actually has more user groups than just 
> service providers.
> We also deal directly with Libraries (who may want the metadata to 
> create their own search engines), Publishers (who will want the metadata 
> for an entire run of a journal and have no
> access restrictions at all), and other business partners who want access 
> to our metadata.
> Having such diverse groups with varying technical skills raises a number 
> of issues - among them is how we can get users only the metadata that 
> they want/need, and what the implications are with regard to OAI 
> selective harvesting rules.
> 2)  As far as sharing ALL our metadata - this would greatly simplify my 
> life in regards to this issue, but it is a business decision that is out 
> of my hands.  I would still restrict by Journal, so users would only get 
> metadata for those journals they subscribe to, but I would let them see 
> ALL content for that journal including content that is not yet available 
> on the public site (usually due to agreements with publishers).  These 
> records could be flagged as "not yet publicly available" and 
> consequently screened out by the end user.
> There are two major problems I see with this approach:
> a)  Metadata has some inherent value to it.  What's stopping someone 
> from providing links to other content providers using our metadata to 
> point to other providers?  Perhaps this question could be worked out in 
> a legal metadata sharing contract.    I said before this is a business 
> decision that is out of my hands for now...
> b)  Users may not want to screen out large chunks of content that they 
> can't yet see.  I'm already worried about the technical barriers that 
> using OAI may provide for some of our less technically inclined 
> partners, this might further complicate the process for them.
> Yeah, yeah...information wants to be free...I know that song and dance.  
> It certainly would make life easier, but I'm not sure it makes good 
> policy.
> 3)   Do I have this right that the "creation date" for a given object is 
> subjective?  That is, does the creation date refer to the date that this 
> object became available to repository for that particular user?  I'm 
> guessing yes.
> If this is the case, we can potentially do some behind-the-scenes 
> spoofing of the creation date to reflect the time that this record 
> became available to the user.  We would also have to spoof the modified 
> date in the same way, so that no record had a modified date older than 
> the creation date.    This would be a fairly complex process and would 
> require us to maintain information about what a user was able to see at 
> a given point in time.  It would also require us to gather data about 
> the record such as the published date (this is how we restrict access) 
> and the date the record was publicly released.  A difficult problem, 
> but not impossible.
> 4)  The virtual repository idea is interesting, but would likely be 
> unmanageable if we start getting large numbers of users when we are 
> dealing with millions of records.
> Thanks to you all - I really appreciate your help!
> Michael Krot
> Data Manager

hussein suleman ~ hussein@cs.uct.ac.za ~ http://www.husseinsspace.com