[OAI-implementers] RE: Change of identifiers

José Borbinha jlb at ist.utl.pt
Fri Dec 9 05:28:59 EST 2011


Hi Thomas
I’m afraid the issue you are raising is out of the scope of OAI, as IMHO it
is part of the “upper level” of the business.
 
In fact, in a properly designed data/entity architecture, records should
have two identifiers: one absolute, purely logical, and persistent (for
example an ISBN, DOI, National Bibliographic Number, etc., with no need to
be sequential), and possibly another, “technical” one (such as the index in
the database, which tends to be sequential, and OAI takes advantage of
that). Both can even be represented by elements in the data record.
 
 
The problem is that when people design systems they usually use only the
second, either out of ignorance or for simplicity. For example, if you use
logical identifiers, you must also have a mechanism to merge records when
you discover duplicates, and in that case you must keep the two previous
identifiers associated with the new record, etc.
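
To illustrate the merge mechanism just described, here is a minimal sketch
in Python. It is only one possible shape for it: the Record class, its field
names, and the urn:example identifiers are all made up for the illustration
and do not come from any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    logical_id: str    # persistent identifier (ISBN, DOI, ...); hypothetical field name
    technical_id: int  # e.g. a database index; free to change over time
    metadata: dict = field(default_factory=dict)
    former_ids: set = field(default_factory=set)  # logical ids absorbed by merges


def merge(a: Record, b: Record, new_logical_id: str, new_technical_id: int) -> Record:
    """Merge two duplicates into a new record that remembers both previous identifiers."""
    return Record(
        logical_id=new_logical_id,
        technical_id=new_technical_id,
        metadata={**b.metadata, **a.metadata},  # on conflict, values from `a` win
        former_ids=a.former_ids | b.former_ids | {a.logical_id, b.logical_id},
    )


def resolve(records, logical_id):
    """Find a record by its current logical identifier or by any former one."""
    for record in records:
        if logical_id == record.logical_id or logical_id in record.former_ids:
            return record
    return None


first = Record("urn:example:a", 1, {"title": "X"})
second = Record("urn:example:b", 2, {"title": "X", "year": "2011"})
merged = merge(first, second, "urn:example:c", 3)
```

With this, a lookup by either of the old identifiers still leads to the
merged record, which is what lets others keep their associations.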
 
In a properly designed system, if the OAI identifiers change, that should
be seen “at the OAI level” as a simple change in the record, so all the
records should be harvested again, and that is the end of the story at the
OAI level.

 
Then it would be up to the next level of the business to discover what
really changed in the record and how it affects the business of the service
provider that harvested it. If the data provider did it properly, by using
logical identifiers, the service providers can always associate the new
records with the old ones. If not, then I’m afraid the only options for the
service provider are either to throw away the old harvested records (which
can be nasty if the service provider has already used them to create any
kind of added value, such as associating them with records in other data
sets) or to develop its own heuristics to try to recover the ideal
scenario. For example, to deal with that in the REPOX framework, when a data
set is harvested and the data has no logical identifiers, we make it
possible for REPOX to generate those identifiers, based on simple
techniques that use the content of the record and that we assume will, in
most cases, generate the same identifier if the record has not changed in
some key identifiers. This is just a simple heuristic, not 100% effective,
but at least we try ;-)
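
To give an idea of what such a heuristic could look like, here is a rough
sketch in Python. It is not the REPOX implementation: the choice of key
fields, the normalization, the use of SHA-1, and the dictionary layout
(oai_id, logical_id) are all assumptions made for the example.

```python
import hashlib
import unicodedata

KEY_FIELDS = ("title", "creator", "date")  # assumed key fields; choose per data set


def normalize(value: str) -> str:
    """Lower-case, strip accents and collapse whitespace so trivial edits keep the same key."""
    decomposed = unicodedata.normalize("NFKD", value)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def logical_key(record: dict) -> str:
    """Use the record's own logical identifier if it has one, else derive one from content."""
    if record.get("logical_id"):
        return record["logical_id"]
    basis = "\x1f".join(normalize(record.get(name, "")) for name in KEY_FIELDS)
    return "content:" + hashlib.sha1(basis.encode("utf-8")).hexdigest()


def reassociate(old_records, new_records):
    """Map old OAI identifiers to new ones for records that share a logical key."""
    new_by_key = {logical_key(r): r["oai_id"] for r in new_records}
    return {
        r["oai_id"]: new_by_key[logical_key(r)]
        for r in old_records
        if logical_key(r) in new_by_key
    }


old = [{"oai_id": "oai:example.org:1", "title": "Göttingen  Studies",
        "creator": "A. Author", "date": "2010"}]
new = [{"oai_id": "oai:example.org:set-a/17", "title": "gottingen studies",
        "creator": "A. Author", "date": "2010", "description": "changed"}]
mapping = reassociate(old, new)
```

Two different records that happen to share the same key fields would
collide here, which is one reason why this can never be 100% effective.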
 
Best!
José Borbinha
 
 
 
-----Original Message-----
From: oai-implementers-bounces at openarchives.org
[mailto:oai-implementers-bounces at openarchives.org] On Behalf Of Fischer,
Thomas
Sent: 06 December 2011 15:21
To: oai-implementers at openarchives.org
Subject: [OAI-implementers] Change of identifiers
 
Hello,
 
I am wondering if there is a standard way to change OAI identifiers.
For example, we might restructure our collection, or will have to split some
items that were under one identifier into several ones. I could even imagine
something like "throw away everything from this set from this OAI Service
Provider and collect everything anew".
Is there a way to communicate this to my OAI clients?
 
Best regards
Thomas Fischer
 
--
Dr. Thomas Fischer
Research and Development Department (RDD)
Göttingen State and University Library
Georg-August-Universität Göttingen
37073 Goettingen
Germany
 
http://www.sub.uni-goettingen.de/
 
 
_______________________________________________
OAI-implementers mailing list
List information, archives, preferences and to unsubscribe:
http://www.openarchives.org/mailman/listinfo/oai-implementers
 
 