From stamer@uni-oldenburg.de Tue Apr 1 09:37:56 2003
From: stamer@uni-oldenburg.de (Heinrich Stamerjohanns)
Date: Tue, 1 Apr 2003 11:37:56 +0200 (CEST)
Subject: [OAI-implementers] oai response XML Schema checking
In-Reply-To: <20030331193059.GF10806@openlib.org>
Message-ID:

On Mon, 31 Mar 2003, Thomas Krichel wrote:
>
> Folks,
>
> I want to check my OAI repository regularly, by
> going through all the responses that it
> can generate and check them one by one. I had
> gotten Xerces java 1 to run, but I can no longer
> do it. I have played with the flags that they
> suggest in the documentation, and either it checks
> only the well-formedness of the response or it
> comes up with an error that says
>
> Document root element "{1}", must match DOCTYPE root "{0}".
>
> presumably suggesting that there is something
> wrong with the namespaces. Same thing with xerces-2_4_0.
> The command that I give is
>
> CLASSPATH=/home/oaiadm/java/xerces-2_4_0/xercesImpl.jar:/home/oaiadm/java/xerces-2_4_0/xercesSamples.jar:/home/oaiadm/java/xerces-2_4_0/xml-apis.jar:/home/oaiadm/java/xerces-2_4_0/xmlParserAPIs.jar:; export CLASSPATH ; java dom.Counter -v /var/tmp/Identifiers.amf.xml

Hi Thomas,

It is not a problem with your document; if I remember right, this happens because conflicting classes (I think from xalan or saxon) each feel responsible for doing the parsing.

Xerces worked for me again after I installed the newest Java from Sun and removed any unnecessary classes from CLASSPATH (maybe there are some old ones in jre/lib/ext?).

You can also download xsv directly and install it locally. It works very well.

Greetings, Heinrich

--
Dr. Heinrich Stamerjohanns Tel.
+49-441-798-4276
Institute for Science Networking stamer@uni-oldenburg.de
University of Oldenburg http://isn.uni-oldenburg.de/~stamer

From jozef@nl.adlibsoft.com Tue Apr 1 14:46:48 2003
From: jozef@nl.adlibsoft.com (Jozef Kruger)
Date: Tue, 1 Apr 2003 16:46:48 +0200
Subject: [OAI-implementers] Questions about wrong output due to terrible input
Message-ID: <4E232B133AC9F04BB194C2AE2024EF9205C200@saturnus.nl.adlibsoft.com>

Hi everyone,

just this week someone from my firm tested my OAI implementation and he sent me a report with the results.
What he did was give my program some terrible input with things like set=non_existing_made_up_set or from=very_illegal_date
The protocol isn't being very specific about the arguments that are returned in the header (in the request node I mean).
What I did was just return the arguments that mattered (for each verb) the way they came in, resulting in invalid output in these cases.

Should I check for each of these if they contain any illegal stuff? If so and I would skip any illegal ones, the output wouldn't match with the input anymore.
You could for example get error code="noRecordsMatch" due to an illegal date, but in the output you wouldn't see that date anymore.

I think the solution to this kind of problem would be a check before sending the request to the repository. But then again, you just might still be left with illegal input.. so omitting those things in the output looks like the only right solution.

Any thoughts on these matters?

cheers,
Jozef Kruger (Adlib Information Systems B.V. the Netherlands)
From simeon@cs.cornell.edu Tue Apr 1 15:15:18 2003
From: simeon@cs.cornell.edu (Simeon Warner)
Date: Tue, 1 Apr 2003 10:15:18 -0500 (EST)
Subject: [OAI-implementers] Questions about wrong output due to terrible input
In-Reply-To: <4E232B133AC9F04BB194C2AE2024EF9205C200@saturnus.nl.adlibsoft.com>
Message-ID:

On Tue, 1 Apr 2003, Jozef Kruger wrote:
> Hi everyone,
>
> just this week someone from my firm tested my OAI implementation and he
> sent me a report with the results.
> What he did was give my program some terrible input with things like
> set=non_existing_made_up_set or from=very_illegal_date
> The protocol isn't being very specific about the arguments that are
> returned in the header (in the request node I mean).
> What I did was just return the arguments that mattered (for each
> verb) the way they came in, resulting in invalid output in these cases.
>
> Should I check for each of these if they contain any illegal stuff? If
> so and I would skip any illegal ones, the output wouldn't match with the
> input anymore.

See section 3.2 in the protocol spec, bullet point 3.

You must include the arguments of the request as attributes of the request element of the response if the request did not generate an error.

You must NOT include the arguments of the request as attributes of the request element of the response if the request generated a badVerb or badArgument error.

(The spec does not appear to be specific on the case when there is an error or exception which is not badArgument/badVerb. However, that doesn't matter for the present discussion because in that case the arguments will be schema-valid.)

> You could for example get error code="noRecordsMatch" due to an illegal
> date, but in the output you wouldn't see that date anymore.
>
> I think the solution to this kind of problem would be a check before
> sending the request to the repository. But then again, you just might
> still be left with illegal input.. so omitting those things in the
> output looks like the only right solution.

The harvester SHOULD only send sensible parameters but the repository must be able to sensibly handle bad requests.

Cheers,
Simeon.

> Any thoughts on these matters?
>
> cheers,
> Jozef Kruger (Adlib Information Systems B.V. the Netherlands)
>

From heronation@pop.com.br Wed Apr 2 18:04:51 2003
From: heronation@pop.com.br (Luis Alberto de Quadros)
Date: Wed, 02 Apr 2003 15:04:51 -0300
Subject: [OAI-implementers] implement a repository
Message-ID: <20030402180452.26131.qmail@idhost>

Hi,
I'm starting with OAI-PMH. I have studied the docs on the OAI site a lot, but I didn't find any information about how to start implementing a repository.
I would like to know if somebody can tell me how to start and where I can get this information.
Thanks a lot!
Luis Quadros

--
POP. It doesn't even feel like free internet. Be POP too!
Visit http://www.pop.com.br/pop_discador.php and download the POPdiscador.

From hussein@cs.uct.ac.za Thu Apr 3 11:22:06 2003
From: hussein@cs.uct.ac.za (Hussein Suleman)
Date: Thu, 03 Apr 2003 13:22:06 +0200
Subject: [OAI-implementers] implement a repository
References: <20030402180452.26131.qmail@idhost>
Message-ID: <3E8C195E.800@cs.uct.ac.za>

hi Luis

if you already have a collection of metadata and/or data using digital library software such as EPrints, DSpace or Greenstone for management/access, then you must look at the documentation for that software for how to activate the OAI interface.
if you already have a collection of metadata and/or data, but the software to manage/access the collection is largely home-grown (e.g., a bunch of Web scripts/applications) or does not already support the OAI-PMH, then you or a developer of the software needs to write code to implement the OAI-PMH specification for your collection - the spec is on the OAI website and isn't too complicated to get up and running if you know a bit about Web applications. if you have the metadata and/or data but aren't using any software to manage it, then may i recommend looking at EPrints (www.eprints.org) and Greenstone (www.nzdl.org) as possible candidates. there are lots of other packages, but if you want a widely-used standalone package with OAI support, these may be a good starting point. alternatively, look at the OAI website and you will find little tools to make "mini-repositories", e.g., the "OAI Static Repository" for repositories that don't change over time and the "XML File-based Repository" for repositories where each metadata object is a single (usually XML) file. lastly, if you have neither the data nor the software already, i suggest that you determine the requirements for your repository, and then adopt an existing package (e.g., EPrints, Greenstone, DSpace, CDSWare, GaneshaDL, ...) to manage your digital library. ... now if you need help on how to write code for an OAI interface, i would suggest looking at the tutorials that are available on OAI-PMH. if you go through the OAI's website, there is an older article by Simeon Warner linked from the "Documents/OAI-related papers" page. (if someone has newer material, please post links). i also have slides from last year's JCDL tutorial on practical implementation of the protocol - these can be found at: http://www.dlib.vt.edu/projects/OAI/index.html#reports hope this info helps. ttfn, ----hussein ps should we have something like this - maybe a flowchart - in the FAQ? 
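
For the home-grown case described above, a first step is often a small dispatcher that checks the verb and builds the response envelope. The following is only a rough sketch under stated assumptions: `BASE_URL` and `respond` are hypothetical names, and the envelope leaves out `responseDate` and the XML namespaces, so it is not schema-valid as it stands. The handling of the request element follows section 3.2 of the protocol spec as discussed earlier in this thread: arguments are echoed as attributes only when the request did not produce a badVerb error.

```python
import xml.etree.ElementTree as ET

BASE_URL = "http://example.org/oai"  # hypothetical base URL, not from the thread

# The six verbs defined by OAI-PMH 2.0.
VERBS = {"Identify", "ListMetadataFormats", "ListSets",
         "ListIdentifiers", "ListRecords", "GetRecord"}


def respond(args):
    """Return a skeleton OAI-PMH response element for the request arguments."""
    root = ET.Element("OAI-PMH")
    request = ET.SubElement(root, "request")
    request.text = BASE_URL

    verb = args.get("verb")
    if verb not in VERBS:
        # badVerb: the request element must carry no attributes.
        error = ET.SubElement(root, "error", code="badVerb")
        error.text = "Illegal or missing verb"
        return root

    # No error so far: echo the request arguments as attributes.
    for key, value in args.items():
        request.set(key, value)
    ET.SubElement(root, verb)  # placeholder for the verb-specific payload
    return root


print(ET.tostring(respond({"verb": "Identify"}), encoding="unicode"))
print(ET.tostring(respond({"verb": "Nonsense"}), encoding="unicode"))
```

A real interface would also validate the remaining arguments per verb and return badArgument in the same attribute-free way.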
Luis Alberto de Quadros wrote: > Hi, > i'm starting with OAI-PMH. I studied a lot the docs existing in the > OAI's site, but i didn't found any information about how to start > implement a repository. > I would like to know if somebody can tell me how to start and where can > i get these informations. > Thanks a lot ! > Luis Quadros > > -- ===================================================================== hussein suleman ~ hussein@cs.uct.ac.za ~ http://www.husseinsspace.com ===================================================================== From tim@tim.brody.btinternet.co.uk Thu Apr 3 13:44:41 2003 From: tim@tim.brody.btinternet.co.uk (Tim Brody) Date: Thu, 03 Apr 2003 14:44:41 +0100 Subject: [OAI-implementers] implement a repository In-Reply-To: <3E8C195E.800@cs.uct.ac.za> References: <20030402180452.26131.qmail@idhost> <3E8C195E.800@cs.uct.ac.za> Message-ID: <3E8C3AC9.9020005@tim.brody.btinternet.co.uk> > ps should we have something like this - maybe a flowchart - in the FAQ? Set up a Wiki and it would be relatively easy to create as questions crop up, e.g.: Q1. Are you adding OAI-PMH support to an existing repository system, or creating a repository from scratch? Yes [Link to next q.] No [Link to next q.] ... As an aside, I found Simeon's tutorial very useful - perhaps he could re-visit it for OAI 2? All the best, Tim. > Luis Alberto de Quadros wrote: > >> Hi, >> i'm starting with OAI-PMH. I studied a lot the docs existing in the >> OAI's site, but i didn't found any information about how to start >> implement a repository. >> I would like to know if somebody can tell me how to start and where >> can i get these informations. >> Thanks a lot ! >> Luis Quadros >> >> > > From is@ime.usp.br Thu Apr 3 19:57:56 2003 From: is@ime.usp.br (Imre Simon) Date: Thu, 3 Apr 2003 16:57:56 -0300 Subject: [OAI-implementers] implement a repository Message-ID: <16012.37444.700259.848438@gargle.gargle.HOWL> We have a wiki site dedicated to eprints/OAI questions. 
It is almost empty right now but it is planned to be a companion site to an institutional eprints server we are installing (and which also is almost empty right now). This wiki is going to run in Portuguese mainly but if you think it might be helpful to develop the FAQ alluded to, please feel free to use it at will. Even if it is just for an experiment or as a temporary test site. The eprints server is here: http://eprints.ime.usp.br/ The companion wiki is here: http://www.arca.ime.usp.br/coruja/wiki/ If you think this is a good idea, please let me know. We will make the necessary changes to the Home Page so as to announce the FAQ in construction (in English). Suggestions are welcome. Cheers, Imre Simon http://www.ime.usp.br/~is/ > Date: Thu, 03 Apr 2003 14:44:41 +0100 > From: Tim Brody > To: Hussein Suleman > CC: oai-implementers@oaisrv.nsdl.cornell.edu > Subject: Re: [OAI-implementers] implement a repository > > > ps should we have something like this - maybe a flowchart - in the FAQ? > > Set up a Wiki and it would be relatively easy to create as questions > crop up, e.g.: > > Q1. Are you adding OAI-PMH support to an existing repository system, or > creating a repository from scratch? > > Yes [Link to next q.] > > No [Link to next q.] > > ... > > As an aside, I found Simeon's tutorial very useful - perhaps he could > re-visit it for OAI 2? > > All the best, > Tim. From pally_reddy@yahoo.com Mon Apr 7 21:35:57 2003 From: pally_reddy@yahoo.com (Venugopal Reddy Pally) Date: Mon, 7 Apr 2003 13:35:57 -0700 (PDT) Subject: [OAI-implementers] Beginner questions Message-ID: <20030407203557.8667.qmail@web40510.mail.yahoo.com> Hi all, These questions may be very basic but I am a beginner. So please bear with me. I am following oai_dc XML Schema for displaying metadata of any Record in GetRecord or ListRecords response. There is 'identifier' element both in header and metadata of Record. 
So what is the difference between identifier element in header and identifier element in metadata ? Also, what are the elements - Type, Format in metadata ? Thanks, Venu. __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com From tim@tim.brody.btinternet.co.uk Tue Apr 8 02:21:18 2003 From: tim@tim.brody.btinternet.co.uk (Tim Brody) Date: Tue, 08 Apr 2003 02:21:18 +0100 Subject: [OAI-implementers] Beginner questions In-Reply-To: <20030407203557.8667.qmail@web40510.mail.yahoo.com> References: <20030407203557.8667.qmail@web40510.mail.yahoo.com> Message-ID: <3E92240E.7090106@tim.brody.btinternet.co.uk> Identifier in the metadata is a Dublin Core identifier, identifier in the header is a unique identifier within your repository. Dublin Core is a list of unordered, repeatable string values (although you should use proper URIs, ISO dates etc. where applicable). See the Dublin Core docs for what Type, Format, etc. are: http://www.dublincore.org/documents/dces/ (Dublincore.org seems to be down atm so I can't check that's the description doc ...) e.g. Header.identifier = "oai:myrepos:uniqueid10" DC.identifier = "http://www.myrepos.com/item/10" DC.identifier = "oai:myrepos:unique10" (i.e. you could also include the OAI identifier in the metadata) DC.identifier = "New Order Journal, V202 30-40" DC.type = "Text" DC.format = "text/html" DC.date = "2002-04" and so on ... All the best, Tim. Venugopal Reddy Pally wrote: > Hi all, > These questions may be very basic but I am a > beginner. So please bear with me. > I am following oai_dc XML Schema for displaying > metadata of any Record in GetRecord or ListRecords > response. There is 'identifier' element both in header > and metadata of Record. So what is the difference > between identifier element in header and identifier > element in metadata ? Also, what are the elements - > Type, Format in metadata ? > Thanks, > Venu. 
> > __________________________________________________ > Do you Yahoo!? > Yahoo! Tax Center - File online, calculators, forms, and more > http://tax.yahoo.com > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > From a.powell@ukoln.ac.uk Tue Apr 8 11:19:46 2003 From: a.powell@ukoln.ac.uk (Andy Powell) Date: Tue, 8 Apr 2003 11:19:46 +0100 (BST) Subject: [OAI-implementers] Beginner questions In-Reply-To: <3E92240E.7090106@tim.brody.btinternet.co.uk> Message-ID: On Tue, 8 Apr 2003, Tim Brody wrote: > Identifier in the metadata is a Dublin Core identifier, identifier in > the header is a unique identifier within your repository. The identifier in the metadata is the identifier of the resource being described - it is often a URL for the resource. The identifier in the OAI header is the identifier of the OAI 'item' (essentially the identifier of the metadata record(s) about the resource). > Dublin Core is a list of unordered, repeatable string values (although > you should use proper URIs, ISO dates etc. where applicable). > > See the Dublin Core docs for what Type, Format, etc. are: > http://www.dublincore.org/documents/dces/ Note: the correct URL is http://dublincore.org/documents/dces/ dc:type indicates the 'genre' of the resource (text, image, ...). dc:format indicates the something about the 'digital manifestation' of the resource (e.g. the way it is encoded). I would recommend that dc:format should be a MIME type (for digital resources) and that you consider selecting the value of dc:type from the DCMIType vocabulary at http://dublincore.org/documents/dcmi-type-vocabulary/ Repeat dc:type to indicate other (more specific) genres if necessary. > (Dublincore.org seems to be down atm so I can't check that's the > description doc ...) > > e.g. 
> Header.identifier = "oai:myrepos:uniqueid10" > > DC.identifier = "http://www.myrepos.com/item/10" > DC.identifier = "oai:myrepos:unique10" (i.e. you could also include the > OAI identifier in the metadata) I tend to to disagree with this. The OAI identifier is *not* an identifier of the resource being described - it is an identifier of the OAI item - therefore it shouldn't be repeated in the DC description. Finally, if you are describing eprints, then you might want to take a look at http://www.rdn.ac.uk/projects/eprints-uk/docs/simpledc-guidelines/ for some DC usage guidelines. Andy. > DC.identifier = "New Order Journal, V202 30-40" > DC.type = "Text" > DC.format = "text/html" > DC.date = "2002-04" > and so on ... > > All the best, > Tim. > > Venugopal Reddy Pally wrote: > > Hi all, > > These questions may be very basic but I am a > > beginner. So please bear with me. > > I am following oai_dc XML Schema for displaying > > metadata of any Record in GetRecord or ListRecords > > response. There is 'identifier' element both in header > > and metadata of Record. So what is the difference > > between identifier element in header and identifier > > element in metadata ? Also, what are the elements - > > Type, Format in metadata ? > > Thanks, > > Venu. > > > > __________________________________________________ > > Do you Yahoo!? > > Yahoo! 
Tax Center - File online, calculators, forms, and more > > http://tax.yahoo.com > > _______________________________________________ > > OAI-implementers mailing list > > OAI-implementers@oaisrv.nsdl.cornell.edu > > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > > > > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > Andy -- Distributed Systems, UKOLN, University of Bath, Bath, BA2 7AY, UK http://www.ukoln.ac.uk/ukoln/staff/a.powell +44 1225 383933 Resource Discovery Network http://www.rdn.ac.uk/ From tim@tim.brody.btinternet.co.uk Tue Apr 8 14:06:32 2003 From: tim@tim.brody.btinternet.co.uk (Tim Brody) Date: Tue, 8 Apr 2003 14:06:32 +0100 Subject: [OAI-implementers] Beginner questions References: Message-ID: <000701c2fdcf$ad04dd70$14414e98@Shrek> ----- Original Message ----- From: "Andy Powell" > On Tue, 8 Apr 2003, Tim Brody wrote: > > > DC.identifier = "http://www.myrepos.com/item/10" > > DC.identifier = "oai:myrepos:unique10" (i.e. you could also include the > > OAI identifier in the metadata) > > I tend to to disagree with this. The OAI identifier is *not* an > identifier of the resource being described - it is an identifier of the > OAI item - therefore it shouldn't be repeated in the DC description. At the risk of being pedantic :-) This is referring to the oai-identifier scheme, rather than the more general OAI identifier (unique within a repository and a URI). "An item is conceptually a container that stores or dynamically generates metadata about a single resource in multiple formats", so it follows that the identifier for the OAI item uniquely identifies the resource, which fulfills the DC definition: "An unambiguous reference to the resource within a given [OAI] context." What is not defined is how to resolve an oai-identifier to the resource, but DC doesn't require that. All the best, Tim. 
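
To make the two identifiers concrete, here is a minimal sketch of a record built with Python's standard library, reusing the example values from this thread. `build_record` is a hypothetical helper and the datestamp value is an assumption; the namespace URIs are the usual ones for oai_dc and Dublin Core. The sketch puts only the resource URL in dc:identifier, which is one of the two readings debated here; repeating the OAI identifier in the metadata would simply be one more dc:identifier element.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_record(item_id, resource_url, datestamp):
    """Build a bare record: item identifier in the header, resource identifier in the metadata."""
    record = ET.Element("record")

    # Header: information about the OAI item (the metadata record).
    header = ET.SubElement(record, "header")
    ET.SubElement(header, "identifier").text = item_id
    ET.SubElement(header, "datestamp").text = datestamp

    # Metadata: information about the resource being described.
    metadata = ET.SubElement(record, "metadata")
    dc = ET.SubElement(metadata, "{%s}dc" % OAI_DC_NS)
    ET.SubElement(dc, "{%s}identifier" % DC_NS).text = resource_url
    ET.SubElement(dc, "{%s}type" % DC_NS).text = "Text"
    ET.SubElement(dc, "{%s}format" % DC_NS).text = "text/html"
    return record


record = build_record("oai:myrepos:uniqueid10",
                      "http://www.myrepos.com/item/10",
                      "2003-04-08")  # assumed datestamp
print(ET.tostring(record, encoding="unicode"))
```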
From a.powell@ukoln.ac.uk Tue Apr 8 15:03:48 2003 From: a.powell@ukoln.ac.uk (Andy Powell) Date: Tue, 8 Apr 2003 15:03:48 +0100 (GMT Standard Time) Subject: [OAI-implementers] Beginner questions In-Reply-To: <000701c2fdcf$ad04dd70$14414e98@Shrek> References: <000701c2fdcf$ad04dd70$14414e98@Shrek> Message-ID: On Tue, 8 Apr 2003, Tim Brody wrote: > ----- Original Message ----- > From: "Andy Powell" > > On Tue, 8 Apr 2003, Tim Brody wrote: > > > > > DC.identifier = "http://www.myrepos.com/item/10" > > > DC.identifier = "oai:myrepos:unique10" (i.e. you could also include the > > > OAI identifier in the metadata) > > > > I tend to to disagree with this. The OAI identifier is *not* an > > identifier of the resource being described - it is an identifier of the > > OAI item - therefore it shouldn't be repeated in the DC description. > > At the risk of being pedantic :-) > > This is referring to the oai-identifier scheme, rather than the more general > OAI identifier (unique within a repository and a URI). > > "An item is conceptually a container that stores or dynamically generates > metadata about a single resource in multiple formats", so it follows that > the identifier for the OAI item uniquely identifies the resource, which > fulfills the DC definition: > "An unambiguous reference to the resource within a given [OAI] context." Well, I still tend to disagree :-). I assume that you are refering to http://www.openarchives.org/OAI/openarchivesprotocol.html#Item and below... I quote: Note that the identifier described here [the item identifier] is not that of a resource . The nature of a resource identifier is outside the scope of the OAI-PMH. To facilitate access to the resource associated with harvested metadata, repositories should use an element in metadata records to establish a linkage between the record (and the identifier of its item) and the identifier (URL, URN, DOI, etc.) of the associated resource. 
The mandatory Dublin Core format provides the identifier element that should be used for this purpose. The item identifier is not the same as the resource identifier - because the item is not the same as the resource. Andy. > What is not defined is how to resolve an oai-identifier to the resource, but > DC doesn't require that. > > All the best, > Tim. > > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > From Naomi@cs.cornell.edu Tue Apr 8 16:12:20 2003 From: Naomi@cs.cornell.edu (Naomi Dushay) Date: Tue, 8 Apr 2003 11:12:20 -0400 Subject: [OAI-implementers] Beginner questions Message-ID: One of the ways we think about the two different identifiers is this: The DC identifier is an identifier for the RESOURCE. In fact, our rule of thumb is that any information within the metadata tag is about the RESOURCE. The OAI identifier is an identifier for METADATA for the resource. And again, information in the header tag is about METADATA for the RESOURCE. We tend to think of the about tag as being about the METADATA as well. - Naomi Dushay National Science Digital Library From pally_reddy@yahoo.com Tue Apr 8 19:17:09 2003 From: pally_reddy@yahoo.com (Venugopal R Pally) Date: Tue, 8 Apr 2003 11:17:09 -0700 (PDT) Subject: [OAI-implementers] Beginner questions In-Reply-To: Message-ID: <20030408181709.21809.qmail@web40513.mail.yahoo.com> Thanks. This has made my questions cleared. could you please inform me how 'METADATA' and 'RESOURCE' are being used by harvestors or by any user ? Since we are getting records by ListRecords, GetRecord, how ListIdentifiers is used ? Thanks, Venu. --- Naomi Dushay wrote: > One of the ways we think about the two different > identifiers is this: > > > The DC identifier is an identifier for the RESOURCE. > In fact, our rule > of thumb is that any information within the metadata > tag is about the > RESOURCE. 
> > The OAI identifier is an identifier for METADATA for > the resource. And > again, information in the header tag is about > METADATA for the RESOURCE. > We tend to think of the about tag as being about the > METADATA as well. > > > - Naomi Dushay > National Science Digital Library > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com From Naomi@cs.cornell.edu Tue Apr 8 20:25:57 2003 From: Naomi@cs.cornell.edu (Naomi Dushay) Date: Tue, 8 Apr 2003 15:25:57 -0400 Subject: FW: [OAI-implementers] Beginner questions Message-ID: -----Original Message----- From: Venugopal R Pally [mailto:pally_reddy@yahoo.com] Sent: Tuesday, April 08, 2003 1:14 PM I'm forwarding on Venu's next questions -- anyone want to take a crack at them? - Naomi To: Naomi Dushay Subject: RE: [OAI-implementers] Beginner questions Thanks. This has made my questions cleared. could you please inform me how 'METADATA' and 'RESOURCE' are being used by harvestors or by any user ? Since we are getting records by ListRecords, GetRecord, how ListIdentifiers is used ? Thanks, Venu. --- Naomi Dushay wrote: > One of the ways we think about the two different > identifiers is this: > > > The DC identifier is an identifier for the RESOURCE. > In fact, our rule > of thumb is that any information within the metadata > tag is about the > RESOURCE. > > The OAI identifier is an identifier for METADATA for > the resource. And > again, information in the header tag is about > METADATA for the RESOURCE. > We tend to think of the about tag as being about the > METADATA as well. 
> > > - Naomi Dushay > National Science Digital Library > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers __________________________________________________ Do you Yahoo!? Yahoo! Tax Center - File online, calculators, forms, and more http://tax.yahoo.com From hussein@cs.uct.ac.za Tue Apr 8 22:56:35 2003 From: hussein@cs.uct.ac.za (Hussein Suleman) Date: Tue, 08 Apr 2003 23:56:35 +0200 Subject: [OAI-implementers] Beginner questions References: <20030408181709.21809.qmail@web40513.mail.yahoo.com> Message-ID: <3E934593.5080406@cs.uct.ac.za> hi Venu i am not sure about your first question - are you looking for usage statistics for oai-harvested data ? if so, every now and then service providers publish whatever they do know on this list or oai-general (sometimes as pointers to publications). there is no global tracking for usage of harvested data. the trivial answer is that for most service providers, 100% of accesses is to oai-harvested "metadata" since they get their data strictly from data providers. i dont know if anybody tracks "resource" usage however. for the second question, there are two ways to harvest: either by using listrecords or by using listidentifiers and getrecords (there are probably numerous articles/papers that discuss these). it can be argued that the former is all we need, but listidentifiers has been used by some harvesters for periodic consistency checks - to confirm existence of records without retrieving them. ... if you provide more information on your requests, you will probably get more directed responses ... ttfn, ----hussein Venugopal R Pally wrote: > Thanks. This has made my questions cleared. could you > please inform me how 'METADATA' and 'RESOURCE' are > being used by harvestors or by any user ? Since we are > getting records by ListRecords, GetRecord, how > ListIdentifiers is used ? 
> Thanks, > Venu. > > --- Naomi Dushay wrote: > >>One of the ways we think about the two different >>identifiers is this: >> >> >>The DC identifier is an identifier for the RESOURCE. >> In fact, our rule >>of thumb is that any information within the metadata >>tag is about the >>RESOURCE. >> >>The OAI identifier is an identifier for METADATA for >>the resource. And >>again, information in the header tag is about >>METADATA for the RESOURCE. >>We tend to think of the about tag as being about the >>METADATA as well. >> >> >>- Naomi Dushay >>National Science Digital Library >>_______________________________________________ >>OAI-implementers mailing list >>OAI-implementers@oaisrv.nsdl.cornell.edu >> > > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > > > __________________________________________________ > Do you Yahoo!? > Yahoo! Tax Center - File online, calculators, forms, and more > http://tax.yahoo.com > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers -- ===================================================================== hussein suleman ~ hussein@cs.uct.ac.za ~ http://www.husseinsspace.com ===================================================================== From caar@loc.gov Wed Apr 9 15:31:27 2003 From: caar@loc.gov (Caroline Arms) Date: Wed, 9 Apr 2003 10:31:27 -0400 (EDT) Subject: [OAI-implementers] Beginner questions In-Reply-To: <20030408181709.21809.qmail@web40513.mail.yahoo.com> Message-ID: One of the advantages of OAI-PMH from the point of view of some data providers is that they don't need to know exactly HOW their metadata is being used. However, I do check a few particular sites. 
OAIster http://oaister.umdl.umich.edu/o/oaister/ and the UIUC Cultural Heritage Repository http://oai.grainger.uiuc.edu/oai/search harvest the metadata records for digitized historical materials that the Library of Congress (LC) makes available. A Sheet Music Consortium hosted at UCLA http://digital.library.ucla.edu/sheetmusic/ harvests LC's set of records for digitized sheet music. These service providers all harvest and index the records and link to the resource as mounted at LC.

The OAI protocol does not itself support the harvesting of the underlying resources, and whether or not it is permitted will depend on the repository. LC has one arrangement with another institution that involves harvesting the records and then using the records to facilitate harvesting of resources outside the protocol.

On possible uses for the ListIdentifiers request:

1. It can be used to count records, which can be useful in scheduling harvesting of the records themselves.

2. It proves useful for LC as a data provider for verifying batch additions or set membership.

3. I had always assumed that very selective harvesters might harvest a set of records, filter the set in some way, and retain a list of identifiers for the items in the subset. This list could be used to control re-harvesting. Whether any current harvesters are doing this I don't know.

I hope that helps.

Caroline Arms caar@loc.gov
Library of Congress

On Tue, 8 Apr 2003, Venugopal R Pally wrote:
> Thanks. This has made my questions cleared. could you
> please inform me how 'METADATA' and 'RESOURCE' are
> being used by harvestors or by any user ? Since we are
> getting records by ListRecords, GetRecord, how
> ListIdentifiers is used ?
> Thanks,
> Venu.
>
> --- Naomi Dushay wrote:
> > One of the ways we think about the two different
> > identifiers is this:
> >
> >
> > The DC identifier is an identifier for the RESOURCE.
> > In fact, our rule > > of thumb is that any information within the metadata > > tag is about the > > RESOURCE. > > > > The OAI identifier is an identifier for METADATA for > > the resource. And > > again, information in the header tag is about > > METADATA for the RESOURCE. > > We tend to think of the about tag as being about the > > METADATA as well. > > > > > > - Naomi Dushay > > National Science Digital Library > > _______________________________________________ > > OAI-implementers mailing list > > OAI-implementers@oaisrv.nsdl.cornell.edu > > > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > > > __________________________________________________ > Do you Yahoo!? > Yahoo! Tax Center - File online, calculators, forms, and more > http://tax.yahoo.com > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > From pally_reddy@yahoo.com Mon Apr 14 16:50:21 2003 From: pally_reddy@yahoo.com (Venugopal R Pally) Date: Mon, 14 Apr 2003 08:50:21 -0700 (PDT) Subject: [OAI-implementers] OAI Resource Message-ID: <20030414155021.37958.qmail@web40502.mail.yahoo.com> Hi all, The OAI says that 'resource' is the object or stuff that metadata is about. So, can resources include multiple types ? For example, in our case, I identified research projects as resources. But later I found that harvestors would like to search our archive based on certain other things like Author, his Papers etc. This would mean I should consider Authors, Paper titles also as resources along with research projects. So, when a harvestor asks for ListIdentifiers, can I display all of these (Research Projects, Authors, Paper Titles) ? Or should I use different metadataPrefix for different resources ? Thanks, Venu. __________________________________________________ Do you Yahoo!? Yahoo! 
Tax Center - File online, calculators, forms, and more http://tax.yahoo.com From jyoung@oclc.org Mon Apr 14 18:11:47 2003 From: jyoung@oclc.org (Young,Jeff) Date: Mon, 14 Apr 2003 13:11:47 -0400 Subject: [OAI-implementers] OAI Resource Message-ID: I'd say the answer is no, you don't want to do that. OAI isn't a search protocol, it's a simple harvesting protocol. If you really do need to search your database by these fields you will need to use a different protocol such a Z39.50 or SRU/SRW and use it to index those fields from your research project records. Also keep in mind that the main reason people make your metadata records available via OAI is so others (aka service providers) can make them useful and searchable in this way. Basically, it sounds like you want more functionality than OAI alone provides. Check out EPrints or DSpace if you need a more complete archiving solution. Jeff > -----Original Message----- > From: Venugopal R Pally [mailto:pally_reddy@yahoo.com] > Sent: Monday, April 14, 2003 11:50 AM > To: oai-implementers@oaisrv.nsdl.cornell.edu > Subject: [OAI-implementers] OAI Resource > > > Hi all, > The OAI says that 'resource' is the object or stuff > that metadata is about. So, can resources include > multiple types ? For example, in our case, I > identified research projects as resources. But later I > found that harvestors would like to search our archive > based on certain other things like Author, his Papers > etc. This would mean I should consider Authors, Paper > titles also as resources along with research projects. > So, when a harvestor asks for ListIdentifiers, can I > display all of these (Research Projects, Authors, > Paper Titles) ? Or should I use different > metadataPrefix for different resources ? > Thanks, > Venu. > > __________________________________________________ > Do you Yahoo!? > Yahoo! 
Tax Center - File online, calculators, forms, and more > http://tax.yahoo.com > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > From simeon@cs.cornell.edu Mon Apr 14 18:35:08 2003 From: simeon@cs.cornell.edu (Simeon Warner) Date: Mon, 14 Apr 2003 13:35:08 -0400 (EDT) Subject: [OAI-implementers] OAI Resource In-Reply-To: Message-ID: I agree with Jeff and feel that overloading the selective harvesting mechanisms (sets, metadata formats) with search functionality is not the best way to approach these issues. You should either use a protocol that supports remote search, or provide that functionality at the service layer (think of the OAI repository as one layer down). Cheers, Simeon. On Mon, 14 Apr 2003, Young,Jeff wrote: > I'd say the answer is no, you don't want to do that. OAI isn't a search > protocol, it's a simple harvesting protocol. If you really do need to search > your database by these fields you will need to use a different protocol such > a Z39.50 or SRU/SRW and use it to index those fields from your research > project records. Also keep in mind that the main reason people make your > metadata records available via OAI is so others (aka service providers) can > make them useful and searchable in this way. > > Basically, it sounds like you want more functionality than OAI alone > provides. Check out EPrints or DSpace if you need a more complete archiving > solution. > > Jeff > > > -----Original Message----- > > From: Venugopal R Pally [mailto:pally_reddy@yahoo.com] > > Sent: Monday, April 14, 2003 11:50 AM > > To: oai-implementers@oaisrv.nsdl.cornell.edu > > Subject: [OAI-implementers] OAI Resource > > > > > > Hi all, > > The OAI says that 'resource' is the object or stuff > > that metadata is about. So, can resources include > > multiple types ? 
For example, in our case, I > > identified research projects as resources. But later I > > found that harvestors would like to search our archive > > based on certain other things like Author, his Papers > > etc. This would mean I should consider Authors, Paper > > titles also as resources along with research projects. > > So, when a harvestor asks for ListIdentifiers, can I > > display all of these (Research Projects, Authors, > > Paper Titles) ? Or should I use different > > metadataPrefix for different resources ? > > Thanks, > > Venu. > > > > __________________________________________________ > > Do you Yahoo!? > > Yahoo! Tax Center - File online, calculators, forms, and more > > http://tax.yahoo.com > > _______________________________________________ > > OAI-implementers mailing list > > OAI-implementers@oaisrv.nsdl.cornell.edu > > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > > > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > From pally_reddy@yahoo.com Mon Apr 14 19:34:32 2003 From: pally_reddy@yahoo.com (Venugopal R Pally) Date: Mon, 14 Apr 2003 11:34:32 -0700 (PDT) Subject: [OAI-implementers] OAI Resource In-Reply-To: Message-ID: <20030414183432.41888.qmail@web40511.mail.yahoo.com> Thank you. As you said, Could you inform me how I can provide this at the service layer ? I have already implemented the OAI considering these research projects as Resources. But it would be of good use to my organization if I can extend it to considering certain other things as Resources. My initial idea was to use the same oai_dc metadataformat as schema for all these resources except that I will use only some of those elements in metadata of these different resources. For example, I need creator element of oai_dc for project but I dont need that element for Author etc. 
This way I would omit certain elements for these resources. Please let me know whether this is practical.
Thanks,
Venu.

From jyoung@oclc.org Mon Apr 14 20:11:26 2003
From: jyoung@oclc.org (Young,Jeff)
Date: Mon, 14 Apr 2003 15:11:26 -0400
Subject: [OAI-implementers] OAI Resource
Message-ID:

I don't see how titles deserve to be separate resources, but I can sympathize with your desire to store authors as resources. For example, I have an old copy of the LC Name Authority File available that is accessible via OAI GetRecord verbs (e.g. http://alcme.oclc.org/laf/servlet/OAIHandler?verb=GetRecord&metadataPrefix=marcxml&identifier=oai:laf.oclc.org/LCCN/n78-95332). So, you can retrieve any record in the file by substituting the LCCN for that person at the end of the URL.
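The substitution described above can be sketched in Python. This is only an illustration: the function name get_record_url is hypothetical, the identifier pattern is inferred from the single example URL in the message, and the endpoint may no longer be live.

```python
from urllib.parse import urlencode

# Base URL copied from the example in the message above.
BASE_URL = "http://alcme.oclc.org/laf/servlet/OAIHandler"


def get_record_url(lccn: str, metadata_prefix: str = "marcxml") -> str:
    """Build a GetRecord request URL for one name authority record."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        # Assumption: every identifier follows the pattern of the one example.
        "identifier": f"oai:laf.oclc.org/LCCN/{lccn}",
    }
    # Leave ':' and '/' unescaped so the result matches the quoted URL.
    return f"{BASE_URL}?{urlencode(params, safe=':/')}"


print(get_record_url("n78-95332"))
```

Passing a different metadata_prefix (for example "oai_dc") gives the Dublin Core form of the same request.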
The biggest problem with this from OAI's point of view is that you can't honestly represent these records in Dublin Core (e.g. http://alcme.oclc.org/laf/servlet/OAIHandler?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:laf.oclc.org/LCCN/n78-95332). Is "William Shakespeare" the dc.creator? The dc.title? Dublin Core is a bibliographic metadata format, and people just aren't bibliographic items. On the other hand, I don't claim that this repository is OAI compliant. It's just a convenient way to make the MARC21 XML data available to both browsers and automated processes.

If you're really intent on creating records for people, you might consider doing something similar. Then, in your research records, you can create links from the dc.creator/dc.contributor/dc.publisher, etc, to these records via the available URL.

This brings up another problem, though. There is no place in the Dublin Core schema to put these URLs. For example,

<dc:creator>Shakespeare, William,--1564-1616</dc:creator>

To get around this, the ETDMS format, for example, extends the Dublin Core schema to include a resource attribute.

<dc:creator resource="...">Shakespeare, William...</dc:creator>

If you store your research project records this way, you can always dumb them down to Dublin Core by omitting the URL.

If you do decide to store records for people, I'd suggest that there's no good reason to mix them in with your research paper records. Also keep in mind that various groups are dealing with schemes that will associate people with URIs, so in the long term, you may want to pick a solution that will allow you to utilize these services when they become available.

Jeff

From jyoung@oclc.org Mon Apr 14 20:35:39 2003
From: jyoung@oclc.org (Young,Jeff)
Date: Mon, 14 Apr 2003 15:35:39 -0400
Subject: [OAI-implementers] OAI Resource
Message-ID:

Venu,

I guess I misunderstood about making "titles" a separate resource. After re-reading your note, I see that you're referring to storing information about research papers in addition to research projects. This would be a perfectly reasonable thing to do. You could differentiate the two by putting them in separate sets. Pick and choose Dublin Core elements according to the qualities for each type. If Dublin Core cramps your needs, you can always support a richer alternative format such as MARC in addition to DC. I think my earlier comments regarding people still apply, though.

Jeff

From pally_reddy@yahoo.com Mon Apr 14 21:53:48 2003
From: pally_reddy@yahoo.com (Venugopal R Pally)
Date: Mon, 14 Apr 2003 13:53:48 -0700 (PDT)
Subject: [OAI-implementers] OAI Resource
In-Reply-To:
Message-ID: <20030414205348.57429.qmail@web40512.mail.yahoo.com>

Jeff,
Thank you for your responses. Currently, the set hierarchy in our repository consists of different research SUBJECTS. So, is it possible, I mean practical with respect to OAI, for me to include paper titles and authors as OAI sets (in the set hierarchy) along with research subjects?
Thanks,
Venu.
From caar@loc.gov Mon Apr 14 22:31:25 2003
From: caar@loc.gov (Caroline Arms)
Date: Mon, 14 Apr 2003 17:31:25 -0400 (EDT)
Subject: [OAI-implementers] OAI Resource
In-Reply-To:
Message-ID:

Venu,

I agree with the earlier respondents. OAI-PMH is a mechanism for exchanging (but not searching) metadata. If your local application needs to hold and support searching for information about people, that is likely to be outside OAI-PMH entirely. However, if you are also looking to exchange metadata about people among applications/services, you may be able to use OAI-PMH.

Useful metadata about people (whether you call them authors, agents, parties, or whatever) is going to be different from useful metadata about document-like information resources. Even though the DCMI now says, 'Here an information resource is defined to be "anything that has identity"', the original elements (used for the OAI mandatory set) were definitely developed for "document-like" objects. Squeezing information about people into an unqualified Dublin Core record is unlikely to be useful.
As Jeff points out, since OAI-PMH allows you to use other metadata formats, you can use it to exchange records that describe people if the parties involved in the exchange can agree on a format. The mandatory DC record can be minimal, its only useful purpose being as a conduit to a "full" record in a more appropriate schema.

Apart from MARC Name Authority Records in the marc21 "slim" schema, I am not familiar with an XML Schema in common use for describing people. I just found http://www.numerata.com/vcardschema.htm but vCard may not have the elements that are of interest in your application. There are at least two more activities that I can think of that are looking into records for people. However, neither has reached the stage of having a schema, as far as I know.

1. DCMI Agents Working Group http://www.dublincore.org/groups/agents/
"Agents" include Creator/Contributor (and possibly Publisher) from the primary DC Element Set.

2. InterParty http://www.interparty.org/
The InterParty project is funded under the European Commission's Information Society Technologies Programme (IST), to design and specify a network to support interoperability of party identification (for both natural and corporate names) across different domains. InterParty builds on the work of the <indecs> project, one of whose deliverables was a specification for a Directory of Parties [http://www.indecs.org/pdf/DirectoryofParties.pdf]. InterParty is not proposed as a replacement for existing schemes for the identification of participants in the intellectual property domain (e.g. national library name authority files or systems oriented towards the needs of rights licensing) but as a means of effecting their interoperation. http://www.interparty.org/

If you really are looking to exchange records about people, perhaps others on the mailing list know of projects involving appropriate schemas or element sets.
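To illustrate the earlier point that the mandatory DC record can be minimal, here is a rough Python sketch of an oai_dc payload that carries little more than a pointer to a fuller record. The function name minimal_person_dc and the example URL are hypothetical, and putting the person's name in dc:title is only one possible choice, not something the thread settles.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the oai_dc metadata format.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def minimal_person_dc(name: str, full_record_url: str) -> str:
    """Return a two-element oai_dc record acting as a conduit to a full record."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    # Assumption: the name goes in dc:title; the thread leaves this open.
    ET.SubElement(root, f"{{{DC_NS}}}title").text = name
    # dc:identifier points at the richer record in another schema.
    ET.SubElement(root, f"{{{DC_NS}}}identifier").text = full_record_url
    return ET.tostring(root, encoding="unicode")


# Hypothetical URL standing in for the location of the full record.
print(minimal_person_dc("Shakespeare, William", "http://example.org/people/shakespeare"))
```

A harvester that understands the richer format would request that metadataPrefix directly; the DC record is only there to satisfy the protocol's mandatory format.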
Caroline Arms caar@loc.gov
Office of Strategic Initiatives
Library of Congress
== Opinions expressed are my own. ==
> > > > > From jyoung@oclc.org Tue Apr 15 14:02:35 2003 From: jyoung@oclc.org (Young,Jeff) Date: Tue, 15 Apr 2003 09:02:35 -0400 Subject: [OAI-implementers] OAI Resource Message-ID: Venu, Yes and no. Yes in the sense that you could create sets hierarchies that look like this: Subject:History:European:Middle+Ages Name:Shakespeare,+William,--1564-1616 Title:A+is+for+Apple No in the sense that no one will ever use your sets if you do this. Take the "Name" hierarchy, for example. Even if someone did want to harvest only those records written by William Shakespeare, they would still have to match the set string exactly (i.e. "Name:Shakespeare,+William,--1564-1616") or no records will result. If your setSpec has two dashes separating the name from the date, one dash in the request won't cut it. The user doesn't know the birth and death date? Sorry. The same goes for the "Title" hierarchy. Your target audience might get some value from the "Subject" hierarchy, but only if it's not too complicated. In general, I wouldn't create OAI sets unless the folks harvesting it demanded them. Why make it hard on yourself if nobody cares? Other than a simple set of subjects, I only see two other sets that OAI clients are likely to care about: Type:Projects and Type:Papers (or something similar). If you want your users to be able to do useful searches by title, subject, or author, though, don't expect OAI sets to satisfy them. Find some other way to give them this capability such as Z39.50 or SRU/SRW or EPrints or DSpace. Jeff > -----Original Message----- > From: Venugopal R Pally [mailto:pally_reddy@yahoo.com] > Sent: Monday, April 14, 2003 4:54 PM > To: Young,Jeff; Simeon Warner; > oai-implementers@oaisrv.nsdl.cornell.edu > Subject: RE: [OAI-implementers] OAI Resource > > > Jeff, > Thank you for your responses. > Currently, the set hierarchy in our repository > consists of different research SUBJECTS. 
So, Is it > possible, I mean practical with respect to OAI, for me > to include Paper titles, Authors also as OAI Sets (set > hierarchy) along with research subjects ? > Thanks, > Venu. > > > --- "Young,Jeff" wrote: > > Venu, > > > > I guess I misunderstood about making "titles" a > > separate resource. After > > re-reading your note, I see that you're referring to > > storing information > > about research papers in addition to research > > projects. This would be a > > perfectly reasonable thing to do. You could > > differentiate the two by putting > > them in separate sets. Pick and choose Dublin Core > > elements according to the > > qualities for each type. If Dublin Core cramps your > > needs, you can always > > support a richer alternative format such as MARC in > > addition to DC. I think > > my earlier comments regarding people still apply, > > though. > > > > Jeff > > > __________________________________________________ > Do you Yahoo!? > Yahoo! Tax Center - File online, calculators, forms, and more > http://tax.yahoo.com > From deridder@cs.utk.edu Tue Apr 15 15:10:55 2003 From: deridder@cs.utk.edu (deridder) Date: Tue, 15 Apr 2003 10:10:55 -0400 (EDT) Subject: [OAI-implementers] dlxs implementation? Message-ID: Is anyone out there using the University of Michigan's DLXS broker for their OAI repository implementation? And/or their OAIster harvester? We are already using their DLXS software for our digital library searching and display, and are considering switching to their OAI support as well, since it's part of the upgrade anyway. I'd like to hear about your experiences if you have done this also. Thanks!! Jody DeRidder IT Administrator II Digital Library Center 647 John C. Hodges Library University of Tennessee Knoxville, TN 37996 Phone: (865) 974-4796 Email: deridder@aztec.lib.utk.edu From khage@umich.edu Tue Apr 15 15:43:02 2003 From: khage@umich.edu (Kat Hagedorn) Date: Tue, 15 Apr 2003 10:43:02 -0400 Subject: [OAI-implementers] dlxs implementation? 
In-Reply-To: Message-ID: <8F59AA84-6F50-11D7-AFB6-0003934CA344@umich.edu> I just wanted to mention that the harvester that OAIster uses was developed by UIUC. You can see/download their software at: http://oai.grainger.uiuc.edu/harvester.htm - Kat On Tuesday, Apr 15, 2003, at 10:10 America/Detroit, deridder wrote: > > Is anyone out there using the University of Michigan's DLXS broker > for > their OAI repository implementation? And/or their OAIster harvester? > We are already using their DLXS software for our digital library > searching > and display, and are considering switching to their OAI support as > well, > since it's part of the upgrade anyway. I'd like to hear about your > experiences if you have done this also. > > Thanks!! > > > Jody DeRidder > IT Administrator II > Digital Library Center > 647 John C. Hodges Library > University of Tennessee > Knoxville, TN 37996 > > Phone: (865) 974-4796 > Email: deridder@aztec.lib.utk.edu > > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > ------------------- Kat Hagedorn OAIster/Metadata Harvesting Librarian DLXS Bibliographic Class Coordinator Digital Library Production Service University of Michigan http://www.oaister.org/ http://www.dlxs.org/ email: khage@umich.edu phone: 734-615-7618 From robert.tansley@hp.com Thu Apr 17 16:30:36 2003 From: robert.tansley@hp.com (Tansley, Robert) Date: Thu, 17 Apr 2003 08:30:36 -0700 Subject: [OAI-implementers] DSpace OAI-PMH Support Testers Message-ID: <40700B4C02ABD5119F00009027876644EA3863@hplex1.hpl.hp.com> Hi all, Hopefully, we've sorted out all the OAI-PMH issues with DSpace for the next release of the system. The Unicode problems should be fixed, and we now support resumption tokens. 
We've put up a beta version, and would really appreciate it if one or more people could have a go at harvesting it to make sure any problems are ironed out. The data is not 'real' data so you probably won't want it to show in your end user's searches! http://hpds3.mit.edu/oai/ Please reply to me directly if you find any errors or problems. If you find that everything works fine I'd like to know too! Thanks all, Robert Tansley / Hewlett-Packard Laboratories / (+1) 617 551 7624 From krot@umich.edu Tue Apr 22 13:40:56 2003 From: krot@umich.edu (Michael Krot) Date: Tue, 22 Apr 2003 08:40:56 -0400 Subject: [OAI-implementers] User Specific Archive Access In-Reply-To: <200301041701.h04H13p01588@nsdlib.nsdl.cornell.edu> References: <200301041701.h04H13p01588@nsdlib.nsdl.cornell.edu> Message-ID: <3EA53858.1010301@umich.edu> Hi. I am in the beginning stages of architecting an implementation of OAI that will provide user-specific access to our archives and was wondering if anyone knew of any other projects that have tackled this issue. As far as background information, we are an archive with about 400 journal titles and millions of records. Users will only be able to harvest metadata for those journals that they have subscribed to - that should be a problem I can handle. Things get tricky when we start to look at what is available WITHIN the journal. Some users will have access to the entire journal from start to finish, other users will only have access to records up until a certain year (usually 5 years before the present year). This isn't so hard to manage either, until you start to think about a user asking for new or changed content. What is "new" or changed will be different for each harvester, depending on their access rights. One possible solution to this problem is to simply give them the metadata and flag it as being unavailable in some way, but this is not an ideal solution. 
An ideal solution is to restrict access to the metadata until such time as they are permitted to harvest it. Well, I don't expect anyone else to have encountered this specific problem, but if anyone has experimented with limiting access to an archive for specific users or user-groups, I would love to know about it. Thanks! Michael Krot Data Manager JSTOR From krichel@openlib.org Tue Apr 22 14:25:29 2003 From: krichel@openlib.org (Thomas Krichel) Date: Tue, 22 Apr 2003 08:25:29 -0500 Subject: [OAI-implementers] User Specific Archive Access In-Reply-To: <3EA53858.1010301@umich.edu> References: <200301041701.h04H13p01588@nsdlib.nsdl.cornell.edu> <3EA53858.1010301@umich.edu> Message-ID: <20030422132529.GG31079@openlib.org> Michael Krot writes > Users will only be able to harvest metadata for those journals that > they have subscribed to - that should be a problem I can handle. > Things get tricky when we start to look at what is available WITHIN > the journal. Some users will have access to the entire journal from > start to finish, other users will only have access to records up > until a certain year (usually 5 years before the present year). In the OAI model, users do not harvest. Service providers harvest. Users use service providers. Service providers (or intermediate collections such as RePEc) will need to have the whole set of metadata, irrespective of end users, because if they have no subscription, they have no data to start to provide a service with. Service providers will have to warn users that content access may depend on them having a subscription. BTW, RePEc already includes JSTOR links. Richer metadata from JSTOR would be much welcome. In particular, we would be interested in full-text links for various types of full text, and in classification data, and precise citation data, as precise as you have captured it.
Cheers, Thomas Krichel mailto:krichel@openlib.org http://openlib.org/home/krichel RePEc:per:1965-06-05:thomas_krichel From tim@tim.brody.btinternet.co.uk Tue Apr 22 15:01:34 2003 From: tim@tim.brody.btinternet.co.uk (Tim Brody) Date: Tue, 22 Apr 2003 15:01:34 +0100 Subject: [OAI-implementers] User Specific Archive Access References: <200301041701.h04H13p01588@nsdlib.nsdl.cornell.edu> <3EA53858.1010301@umich.edu> Message-ID: <015101c308d7$af869710$14414e98@Shrek> Hi, If I understand your problem as follows (e.g.): An article in Journal "XML Matters" appears in 2003. User "John" has complete access to the journal and harvests the article in 2003. User "Daisy" has access only after one year, so in 2003 is harvesting articles from 2002. But when Daisy asks for new articles in 2004 she won't get the article from 2003 because she is using the datestamp ">=2004". I think the solution to this is to change the datestamp in the user's OAI request to be datestamp-(user's period before access), e.g. In 2003 Daisy requests all records >=2003 - (1 year delay) = return all records >=2002 Before any records are returned to the user you will need to check their rights to that record (in a journal they are subscribed to, date of the article appropriate to their subscription). I've written this as I've worked it out, so please excuse any mistakes! All the best, Tim. ----- Original Message ----- From: "Michael Krot" To: Sent: Tuesday, April 22, 2003 1:40 PM Subject: [OAI-implementers] User Specific Archive Access > Hi. > > I am in the beginning stages of architecting an implementation of OAI > that will provide user-specific access to our archives and was wondering > if anyone knew of any other projects that have tackled this issue. > > As far as background information, we are an archive with about 400 > journal titles and millions of records. Users will only be able to > harvest metadata for those journals that they have subscribed to - that > should be a problem I can handle. 
Things get tricky when we start to > look at what is available WITHIN the journal. Some users will have > access to the entire journal from start to finish, other users will only > have access to records up until a certain year (usually 5 years before > the present year). This isn't so hard to manage either, until you start > to think about a user asking for new or changed content. What is "new" > or changed will be different for each harvester, depending on their > access rights. > > One possible solution to this problem is to simply give them the > metadata and flag it as being unavailable in some way, but this is not > an ideal solution. An ideal solution is to restrict access to the > metadata until such time as they are permitted to harvest it. > > Well, I don't expect anyone else to have encountered this specific > problem, but if anyone has experimented with limiting access to an > archive for specific users or user-groups, I would love to know about it. > > Thanks! > Michael Krot > Data Manager > JSTOR > > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers From stamer@uni-oldenburg.de Tue Apr 22 15:51:05 2003 From: stamer@uni-oldenburg.de (Heinrich Stamerjohanns) Date: Tue, 22 Apr 2003 16:51:05 +0200 (CEST) Subject: [OAI-implementers] User Specific Archive Access In-Reply-To: <3EA53858.1010301@umich.edu> Message-ID: On Tue, 22 Apr 2003, Michael Krot wrote: > access to the entire journal from start to finish, other users will only > have access to records up until a certain year (usually 5 years before > the present year). This isn't so hard to manage either, until you start > to think about a user asking for new or changed content. What is "new" > or changed will be different for each harvester, depending on their > access rights. 
To handle access rights for the content stuff is probably beyond the scope of metadata harvesting (see also below) > > One possible solution to this problem is to simply give them the > metadata and flag it as being unavailable in some way, but this is not > an ideal solution. An ideal solution is to restrict access to the > metadata until such time as they are permitted to harvest it. I do not think this is an optimal solution. Nobody will know that there is some important information out there, but one is just unable to access the content. Nobody will ask you then how this information might become accessible. You should share the metadata, but possibly restrict access to content. > Well, I don't expect anyone else to have encountered this specific > problem, but if anyone has experimented with limiting access to an > archive for specific users or user-groups, I would love to know about it. You can easily restrict access to your Data-Provider on HTTP level, by using e.g. IP restriction-based access. You can also restrict the use by a password (.htaccess for Apache..). The URL is then e.g. http://user:password@www.myarchive.org/oai?verb=.... in order to access the Data-Provider. Heinrich -- Dr. Heinrich Stamerjohanns Tel. +49-441-798-4276 Institute for Science Networking stamer@uni-oldenburg.de University of Oldenburg http://isn.uni-oldenburg.de/~stamer From liu_x@lanl.gov Tue Apr 22 16:05:42 2003 From: liu_x@lanl.gov (Xiaoming Liu) Date: 22 Apr 2003 09:05:42 -0600 Subject: [OAI-implementers] User Specific Archive Access In-Reply-To: <3EA53858.1010301@umich.edu> References: <200301041701.h04H13p01588@nsdlib.nsdl.cornell.edu> <3EA53858.1010301@umich.edu> Message-ID: <1051023942.31817.18.camel@intrepid> On Tue, 2003-04-22 at 06:40, Michael Krot wrote: > > As far as background information, we are an archive with about 400 > journal titles and millions of records. 
Users will only be able to > harvest metadata for those journals that they have subscribed to - that > should be a problem I can handle. Things get tricky when we start to > look at what is available WITHIN the journal. Some users will have > access to the entire journal from start to finish, other users will only > have access to records up until a certain year (usually 5 years before > the present year). This isn't so hard to manage either, until you start > to think about a user asking for new or changed content. What is "new" > or changed will be different for each harvester, depending on their > access rights. One possible solution is to create virtual repository for each user (the institutional subscriber), such as http://an.org/{userid}/oai, since automatic URL mapping can be used, this task may not be very hard. For each user you may have a profile which maintains access right. xiaoming liu > > One possible solution to this problem is to simply give them the > metadata and flag it as being unavailable in some way, but this is not > an ideal solution. An ideal solution is to restrict access to the > metadata until such time as they are permitted to harvest it. > > Well, I don't expect anyone else to have encountered this specific > problem, but if anyone has experimented with limiting access to an > archive for specific users or user-groups, I would love to know about it. > > Thanks! > Michael Krot > Data Manager > JSTOR > > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > From francois@fsconsult.com Tue Apr 22 17:46:07 2003 From: francois@fsconsult.com (Fran=?ISO-8859-1?B?5w==?=ois Schiettecatte) Date: Tue, 22 Apr 2003 12:46:07 -0400 Subject: [OAI-implementers] User Specific Archive Access In-Reply-To: <3EA53858.1010301@umich.edu> Message-ID: Michael There are two sides to this problem. 
The first side is access control: - You can use IP based restrictions, and map those IP addresses to a set of rights. This has the drawback that the harvesters could not be switched from one machine to another without an intervention on your part. - You can use standard user name/password (using .htaccess) and map those user names to a set of rights. This is a little more flexible but a little more insecure. - You could use user name/password over an SSL connection, again mapping those user names to a set of rights. The second side is below: On 4/22/03 8:40 AM, "Michael Krot" wrote: > Hi. > > I am in the beginning stages of architecting an implementation of OAI > that will provide user-specific access to our archives and was wondering > if anyone knew of any other projects that have tackled this issue. > > As far as background information, we are an archive with about 400 > journal titles and millions of records. Users will only be able to > harvest metadata for those journals that they have subscribed to - that > should be a problem I can handle. Things get tricky when we start to > look at what is available WITHIN the journal. Some users will have > access to the entire journal from start to finish, other users will only > have access to records up until a certain year (usually 5 years before > the present year). This isn't so hard to manage either, until you start > to think about a user asking for new or changed content. What is "new" > or changed will be different for each harvester, depending on their > access rights. > > One possible solution to this problem is to simply give them the > metadata and flag it as being unavailable in some way, but this is not > an ideal solution. An ideal solution is to restrict access to the > metadata until such time as they are permitted to harvest it. Personally I would not expose metadata to which the users have no right to unless they can easily get those rights, ie buy the article. 
I have been very frustrated in the past when using systems, only to be told that I could not have the article and there was no way for me to buy it. But if there is a simple way for users to buy an article on your system, you might want to consider making all the metadata available to give users a choice in search service for searching the material. This would be of interest to me as I run the myOAI search service (http://www.myoai.com/). In the past I built a system called ScienceServer which had a subscription component which did just that. Each user/institution was assigned a set of subscriptions and all they saw was what they paid to access. This allowed a consortium to maintain a single collection, serving individual members of that consortium a virtual collection of the journals they had subscribed to. This is complex but quite doable, and if you are looking for a consultant to help you with this, I would be more than happy to do so. > Well, I don't expect anyone else to have encountered this specific > problem, but if anyone has experimented with limiting access to an > archive for specific users or user-groups, I would love to know about it. > > Thanks! > Michael Krot > Data Manager > JSTOR > > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > ======================================================================== François Schiettecatte FS Consulting, Inc.
Phone : (978) 594-5089 35 Washington Square North, # 2, Cell : (617) 909-2504 Salem, MA, 01970 Email : francois@fsconsult.com URL : http://www.fsconsult.com/ ======================================================================== From ramon@ibict.br Tue Apr 22 17:52:06 2003 From: ramon@ibict.br (Ramon Martins Sodoma da Fonseca) Date: Tue, 22 Apr 2003 13:52:06 -0300 Subject: [OAI-implementers] set spect error Message-ID: <3DB871E13177D61198850000E222F2EB24748E@ebano2.ibict.br> Dear fellow implementers.. we have been working on the protocol and doing the fine tuning (I hope) we had the setSpec work, but when I click on the set link on the OAI Explorer, which I believe shows all the records of that set, it shows me the following error " Parsed Output XML Schema Validation Error ! [Error] fileS0qUiQ:10:20: The content of element type "ListIdentifiers" must match "(header+,resumptionToken?)". /tmp/fileS0qUiQ: 1106 ms (5 elems, 8 attrs, 9 spaces, 77 chars) XML Output 2003-04-22T16:43:58Z http://www.ibict.br/cgi-bin/ibict/oai.pl cience!!!oai_dc!3 " this is how it shows when asking for the listSets verb: " List of Sets Click on the link to list the contents Ciência e Informação <> set description: dc: description: Teses e Dissertações Eletrônicas sobre ciência e ciência da informação Request : http://www.ibict.br/cgi-bin/ibict/oai.pl, verb=ListSets Response Date : 2003-04-22T16:41:01Z " this is the base url for testing: http://www.ibict.br/cgi-bin/ibict/oai.pl any input is welcome... thank you ............................................................................ ................................................. Ramón Martins S. da Fonseca Desenvolvimento Web IBICT - Instituto Brasileiro de Informação em Ciência e Tecnologia +55 61 217 6443 / 6347 ............................................................................ ................................................. 
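A note on the ListIdentifiers validation error reported above: the schema requires at least one header inside ListIdentifiers, so one possible cause (an assumption, since the raw response is not preserved here) is a set request that matched no records but still produced a ListIdentifiers element. OAI-PMH 2.0 defines the noRecordsMatch error code for an empty result. The following is only a minimal Python sketch of that branch, not the oai.pl code; the function name and the sample identifier are hypothetical, and XML namespaces and the surrounding OAI-PMH envelope are left out.

```python
import xml.etree.ElementTree as ET


def list_identifiers_body(headers, resumption_token=None):
    """Build the verb-specific part of a ListIdentifiers response.

    headers: list of (identifier, datestamp, [setSpec, ...]) tuples.
    Returns an <error code="noRecordsMatch"> element when nothing matched,
    because an empty <ListIdentifiers> does not satisfy (header+, resumptionToken?).
    """
    if not headers:
        err = ET.Element("error", {"code": "noRecordsMatch"})
        err.text = "No records match the given arguments."
        return err

    body = ET.Element("ListIdentifiers")
    for identifier, datestamp, set_specs in headers:
        header = ET.SubElement(body, "header")
        ET.SubElement(header, "identifier").text = identifier
        ET.SubElement(header, "datestamp").text = datestamp
        for spec in set_specs:
            ET.SubElement(header, "setSpec").text = spec

    # The token, if any, comes after all headers.
    if resumption_token is not None:
        ET.SubElement(body, "resumptionToken").text = resumption_token
    return body


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    sample = [("oai:example.org:1", "2003-04-22", ["cience"])]
    print(ET.tostring(list_identifiers_body(sample), encoding="unicode"))
    print(ET.tostring(list_identifiers_body([]), encoding="unicode"))
```

If the set really does contain records, the same check still helps to locate the fault: it separates "the query returned nothing" from "the headers were never written out".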
From krot@umich.edu Tue Apr 22 18:41:34 2003 From: krot@umich.edu (Michael Krot) Date: Tue, 22 Apr 2003 13:41:34 -0400 Subject: [OAI-implementers] User Specific Archive Access In-Reply-To: <1051023942.31817.18.camel@intrepid> References: <200301041701.h04H13p01588@nsdlib.nsdl.cornell.edu> <3EA53858.1010301@umich.edu> <1051023942.31817.18.camel@intrepid> Message-ID: <3EA57ECE.6030002@umich.edu> Hi all, Thanks to all for your input - it is very nice to get worthwhile feedback so quickly! I realize that access restriction is beyond the scope of the OAI standard, but it is unfortunately a messy part of life over here. I'm not so interested in the how of access restriction, ip recognition, controlling rights, etc. - these things I can handle. The interesting part to me is managing this huge metadata repository in such a way as to provide the metadata I want a user to see given the constraints of the OAI standard. I will try to address some of the questions you all have raised: 1) Mr. Krichel made some comments about service providers and subscriptions to JSTOR. JSTOR actually has more user groups than just service providers. We also deal directly with Libraries (who may want the metadata to create their own search engines), Publishers (who will want the metadata for an entire run of a journal and have no access restrictions at all), and other business partners who want access to our metadata. Having such diverse groups with varying technical skills raises a number of issues - among them is how we can get each user only the metadata that they want/need and what the implications are in regards to OAI selective harvesting rules. 2) As far as sharing ALL our metadata - this would greatly simplify my life in regards to this issue, but it is a business decision that is out of my hands.
I would still restrict by Journal, so users would only get metadata for those journals they subscribe to, but I would let them see ALL content for that journal including content that is not yet available on the public site (usually due to agreements with publishers). These records could be flagged as "not yet publicly available" and consequently screened out by the end user. There are two major problems I see with this approach: a) Metadata has some inherent value to it. What's stopping someone from providing links to other content providers using our metadata to point to other providers? Perhaps this question could be worked out in a legal metadata sharing contract. I said before this is a business decision that is out of my hands for now... b) Users may not want to screen out large chunks of content that they can't yet see. I'm already worried about the technical barriers that using OAI may provide for some of our less technically inclined partners, this might further complicate the process for them. Yeah, yeah...information wants to be free...I know that song and dance. It certainly would make life easier, but I'm not sure it makes good policy. 3) Do I have this right that the "creation date" for a given object is subjective? That is, does the creation date refer to the date that this object became available to the repository for that particular user? I'm guessing yes. If this is the case, we can potentially do some behind-the-scenes spoofing of the creation date to reflect the time that this record became available to the user. We would also have to spoof the modified date in the same way, so that no record had a modified date older than the creation date. This would be a fairly complex process and would require us to maintain information about what a user was able to see at a given point in time. It would also require us to gather data about the record such as the published date (this is how we restrict access) and the date the record was publicly released.
A difficult problem, but not impossible. 4) The virtual repository idea is interesting, but would likely be unmanageable if we start getting large numbers of users when we are dealing with millions of records. Thanks to you all - I really appreciate your help! Michael Krot Data Manager JSTOR From GT.Peterson@att.net Wed Apr 23 06:07:53 2003 From: GT.Peterson@att.net (GT.Peterson@att.net) Date: Wed, 23 Apr 2003 05:07:53 +0000 Subject: [OAI-implementers] (no subject) Message-ID: <200304230507.h3N57v314688@nsdlib.nsdl.cornell.edu> remove from mailing list GT.Peterson@att.net From hussein@cs.uct.ac.za Wed Apr 23 10:17:21 2003 From: hussein@cs.uct.ac.za (Hussein Suleman) Date: Wed, 23 Apr 2003 11:17:21 +0200 Subject: [OAI-implementers] User Specific Archive Access References: <200301041701.h04H13p01588@nsdlib.nsdl.cornell.edu> <3EA53858.1010301@umich.edu> <1051023942.31817.18.camel@intrepid> <3EA57ECE.6030002@umich.edu> Message-ID: <3EA65A21.7030100@cs.uct.ac.za> hi some late comments: firstly, i think we should address a philosophical/theoretical issue that is hinted at by this problem. OAI-PMH is based on the premise that interoperability can best be promoted by shifting the "implementation burden" from the data providers to the service providers - making those who have a greater desire for interoperability pay the costs in terms of complexity. JSTOR appears to offer precisely the inverse scenario to the classical data/service provider split. if i am reading the thread correctly, JSTOR is the driving force behind this interoperability effort and, by my understanding, should therefore centrally handle the complexity and offer subscribing libraries the simplest possible interface (OAI-PMH?). that said, in the ideal case, a subscribing institution should get a cohesive view of their subcollection, independently of other subscribers. how could this work in practice? - do you need virtual data providers?
i am not sure this is necessary - you should be able to use IP- or some other authentication and determine what data to make visible transparently

- do you need to store additional data for each harvester?
i hope not, as this will break some of the basic idempotence properties of OAI-PMH. if each record in your archive has "published" and "modified" dates, you could screen for accessible subsets on the basis of matching the published dates to subscription rules (on a per access basis of course), while allowing date-based harvesting on the basis of modification dates (with the provision that modification = max (modification, subscription)) ... i hope this makes sense :)

- unsubscriptions are going to be tricky!
if you expose metadata differently for different users, "deletions" may become a nightmare, so if possible i would suggest looking into not using the PMH's deletions feature.

in any event, i think it is doable with an appropriately structured database, with a not-too-complex set of subscription rules and without additional storage or per-harvester data.

ttfn,
----hussein

Michael Krot wrote:
> Hi all,
>
> Thanks to all for your input - it is very nice to get worthwhile
> feedback so quickly! I realize that access restriction is beyond the
> scope of the OAI standard, but it is unfortunately a messy part of life
> over here. I'm not so interested in the how of access restriction, ip
> recognition, controlling rights, etc. - these things I can handle. The
> interesting part to me is managing this huge metadata repository in such
> a way as to provide the metadata I want a user to see given the
> constraints of the OAI standard.
>
> I will try to address some of the questions you all have raised:
>
> 1) Mr. Krichel made some comments about service providers and
> subscriptions to JSTOR. JSTOR actually has more user groups than just
> service providers.
> We also deal directly with Libraries (who may want the metadata to > create their own search engines), Publishers (who will want the metadata > for an entire run of a journal and have no > access restrictions at all), and other business partners who want access > to our metadata. > > Having such diverse groups with varying technical skills raises a number > of issues - among them is how can we get the user the metadata that only > they want/need and what are the implications in regards to OAI selective > harvesting rules. > > 2) As far as sharing ALL our metadata - this would greatly simplify my > life in regards to this issue, but it is a business decision that is out > of my hands. I would still restrict by Journal, so users would only get > metadata for those journals they subscribe to, but I would let them see > ALL content for that journal including content that is not yet available > on the public site (usually due to agreements with publishers). These > records could be flagged as "not yet publically available" and > consequently screened out by the end user. > There are two major problems I see with this approach: > > a) Metadata has some inherent value to it. What's stopping someone > from providing links to other content providers using our metadata to > point to other providers? Perhaps this question could be worked out in > a legal metadata sharing contract. I said before this is a business > decision that is out of my hands for now... > b) Users may not want to screen out large chunks of content that they > can't yet see. I'm already worried about the technical barriers that > using OAI may provide for some of our less technically inclined > partners, this might further complicate the process for them. > > Yeah, yeah...information wants to be free...I know that song and dance. > It certainly would make life easier, but I'm not sure it makes good > policy., > > 3) Do I have this right that the "creation date" for a given object is > subjective? 
That is, does the creation date refer to the date that this
> object became available to the repository for that particular user? I'm
> guessing yes.
> If this is the case, we can potentially do some behind-the-scenes
> spoofing of the creation date to reflect the time that this record
> became available to the user. We would also have to spoof the modified
> date in the same way, so that no record had a modified date older than
> the creation date. This would be a fairly complex process and would
> require us to maintain information about what a user was able to see at
> a given point in time. It would also require us to gather data about
> the record, such as the published date (this is how we restrict access)
> and the date the record was publicly released. A difficult problem,
> but not impossible.
>
> 4) The virtual repository idea is interesting, but would likely be
> unmanageable if we start getting large numbers of users when we are dealing
> with millions of records.
>
> Thanks to you all - I really appreciate your help!
>
> Michael Krot
> Data Manager
> JSTOR
>
>
>
> _______________________________________________
> OAI-implementers mailing list
> OAI-implementers@oaisrv.nsdl.cornell.edu
> http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers

--
=====================================================================
hussein suleman ~ hussein@cs.uct.ac.za ~ http://www.husseinsspace.com
=====================================================================

From tim@tim.brody.btinternet.co.uk Wed Apr 23 16:40:41 2003
From: tim@tim.brody.btinternet.co.uk (Tim Brody)
Date: Wed, 23 Apr 2003 16:40:41 +0100
Subject: [OAI-implementers] set spect error
References: <3DB871E13177D61198850000E222F2EB24748E@ebano2.ibict.br>
Message-ID: <024901c309ae$b22da080$14414e98@Shrek>

You aren't returning any records for your set:
http://www.ibict.br/cgi-bin/ibict/oai.pl?verb=ListIdentifiers&set=cience&metadataPrefix=oai_dc

Only a resumptionToken.
The clue is in "(header+,resumptionToken?)", which is regular expression speak for match header 1 or more times, match resumption token 0 or 1 times.

You should be returning a noRecordsMatch error if there are no records that match the OAI query.

To sum up: Every OAI response should either be an error, a complete list, or a partial list (records+resumption token).

All the best,
Tim.

----- Original Message -----
From: "Ramon Martins Sodoma da Fonseca"
To: "Oai-Implementers (E-mail)"
Sent: Tuesday, April 22, 2003 5:52 PM
Subject: [OAI-implementers] set spect error

Dear fellow implementers..

we have been working on the protocol and doing the fine tuning (I hope)

we had the setSpec work, but when I click on the set link on the OAI Explorer, which I believe shows all the records of that set, it shows me the following error

"
Parsed Output
XML Schema Validation Error !
[Error] fileS0qUiQ:10:20: The content of element type "ListIdentifiers" must match "(header+,resumptionToken?)".
/tmp/fileS0qUiQ: 1106 ms (5 elems, 8 attrs, 9 spaces, 77 chars)

XML Output
2003-04-22T16:43:58Z
http://www.ibict.br/cgi-bin/ibict/oai.pl
cience!!!oai_dc!3
"

this is how it shows when asking for the listSets verb:
"
List of Sets
Click on the link to list the contents
Ciência e Informação <>

set description:
dc:
description: Teses e Dissertações Eletrônicas sobre ciência e ciência da informação

Request : http://www.ibict.br/cgi-bin/ibict/oai.pl, verb=ListSets
Response Date : 2003-04-22T16:41:01Z
"

this is the base url for testing:
http://www.ibict.br/cgi-bin/ibict/oai.pl

any input is welcome...

thank you
............................................................................
.................................................
Ramón Martins S. da Fonseca
Desenvolvimento Web
IBICT - Instituto Brasileiro de Informação em Ciência e Tecnologia
+55 61 217 6443 / 6347
............................................................................
.................................................
_______________________________________________
OAI-implementers mailing list
OAI-implementers@oaisrv.nsdl.cornell.edu
http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers

From ramon@ibict.br Wed Apr 23 16:47:43 2003
From: ramon@ibict.br (Ramon Martins Sodoma da Fonseca)
Date: Wed, 23 Apr 2003 12:47:43 -0300
Subject: RES: [OAI-implementers] set spect error
Message-ID: <3DB871E13177D61198850000E222F2EB247492@ebano2.ibict.br>

I noticed that, but I really don't know what could be wrong.

My guess is that even though we implemented a table with the sets, and each record has a set, they are not implemented correctly...

thanks for everything..

............................................................................
...
Ramón Martins S. da Fonseca
Desenvolvimento Web
IBICT - Instituto Brasileiro de Informação em Ciência e Tecnologia
+55 61 217 6443 / 6347
............................................................................
...

-----Original Message-----
From: Tim Brody [mailto:tim@tim.brody.btinternet.co.uk]
Sent: Wednesday, April 23, 2003 12:41
To: Ramon Martins Sodoma da Fonseca; Oai-Implementers (E-mail)
Subject: Re: [OAI-implementers] set spect error

You aren't returning any records for your set:
http://www.ibict.br/cgi-bin/ibict/oai.pl?verb=ListIdentifiers&set=cience&metadataPrefix=oai_dc

Only a resumptionToken. The clue is in "(header+,resumptionToken?)", which is regular expression speak for match header 1 or more times, match resumption token 0 or 1 times.

You should be returning a noRecordsMatch error if there are no records that match the OAI query.

To sum up: Every OAI response should either be an error, a complete list, or a partial list (records+resumption token).

All the best,
Tim.

----- Original Message -----
From: "Ramon Martins Sodoma da Fonseca"
To: "Oai-Implementers (E-mail)"
Sent: Tuesday, April 22, 2003 5:52 PM
Subject: [OAI-implementers] set spect error

Dear fellow implementers..
we have been working on the protocol and doing the fine tuning (I hope) we had the setSpec work, but when I click on the set link on the OAI Explorer, which I believe shows all the records of that set, it shows me the following error " Parsed Output XML Schema Validation Error ! [Error] fileS0qUiQ:10:20: The content of element type "ListIdentifiers" must match "(header+,resumptionToken?)". /tmp/fileS0qUiQ: 1106 ms (5 elems, 8 attrs, 9 spaces, 77 chars) XML Output 2003-04-22T16:43:58Z http://www.ibict.br/cgi-bin/ibict/oai.pl cience!!!oai_dc!3 " this is how it shows when asking for the listSets verb: " List of Sets Click on the link to list the contents Ciência e Informação <> set description: dc: description: Teses e Dissertações Eletrônicas sobre ciência e ciência da informação Request : http://www.ibict.br/cgi-bin/ibict/oai.pl, verb=ListSets Response Date : 2003-04-22T16:41:01Z " this is the base url for testing: http://www.ibict.br/cgi-bin/ibict/oai.pl any input is welcome... thank you ............................................................................ ................................................. Ramón Martins S. da Fonseca Desenvolvimento Web IBICT - Instituto Brasileiro de Informação em Ciência e Tecnologia +55 61 217 6443 / 6347 ............................................................................ ................................................. _______________________________________________ OAI-implementers mailing list OAI-implementers@oaisrv.nsdl.cornell.edu http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers From simeon@cs.cornell.edu Wed Apr 23 16:54:34 2003 From: simeon@cs.cornell.edu (Simeon Warner) Date: Wed, 23 Apr 2003 11:54:34 -0400 (EDT) Subject: RES: [OAI-implementers] set spect error In-Reply-To: <3DB871E13177D61198850000E222F2EB247492@ebano2.ibict.br> Message-ID: It is legal to have a set which contains no items. It is just that (as Tim pointed out) the response in this case must be a noRecordsMatch error. 
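
As a rough illustration of the rule Tim and Simeon describe, the sketch below builds a ListIdentifiers response and falls back to a noRecordsMatch error when the query matches nothing. It is a minimal Python sketch, not code from any repository in this thread: the function name, the (identifier, datestamp) tuple shape and the example.org URL are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def list_identifiers_response(base_url, response_date, headers, resumption_token=None):
    """Build a ListIdentifiers response, or a noRecordsMatch error if nothing matched.

    `headers` is a list of (identifier, datestamp) tuples; that shape is an
    assumption made for this sketch.
    """
    root = ET.Element("OAI-PMH", {"xmlns": OAI_NS})
    ET.SubElement(root, "responseDate").text = response_date
    request = ET.SubElement(root, "request", {"verb": "ListIdentifiers"})
    request.text = base_url

    if not headers:
        # The schema requires (header+, resumptionToken?), so an empty match
        # is reported as an error instead of an empty ListIdentifiers element.
        error = ET.SubElement(root, "error", {"code": "noRecordsMatch"})
        error.text = "No records match the request"
        return root

    listing = ET.SubElement(root, "ListIdentifiers")
    for identifier, datestamp in headers:
        header = ET.SubElement(listing, "header")
        ET.SubElement(header, "identifier").text = identifier
        ET.SubElement(header, "datestamp").text = datestamp
    if resumption_token is not None:
        # Optional, and only ever emitted after at least one header.
        ET.SubElement(listing, "resumptionToken").text = resumption_token
    return root


# Hypothetical usage: an empty set yields the error form.
empty = list_identifiers_response("http://example.org/oai", "2003-04-23T00:00:00Z", [])
print(ET.tostring(empty, encoding="unicode"))
```

The point of branching before the ListIdentifiers element is created is that a response holding only a resumptionToken can never be produced.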
-- Simeon On Wed, 23 Apr 2003, Ramon Martins Sodoma da Fonseca wrote: > I noticed that, but I really don´t know what could be wrong. > > My guess is that even though we implemented a table with the sets, and each > record has a set, they are not implemented correctly... > > thanks for everything.. > > ............................................................................ > ... > Ramón Martins S. da Fonseca > Desenvolvimento Web > IBICT - Instituto Brasileiro de Informação em Ciência e Tecnologia > +55 61 217 6443 / 6347 > ............................................................................ > ... > > > -----Mensagem original----- > De: Tim Brody [mailto:tim@tim.brody.btinternet.co.uk] > Enviada em: quarta-feira, 23 de abril de 2003 12:41 > Para: Ramon Martins Sodoma da Fonseca; Oai-Implementers (E-mail) > Assunto: Re: [OAI-implementers] set spect error > > > You aren't returning any records for your set: > http://www.ibict.br/cgi-bin/ibict/oai.pl?verb=ListIdentifiers&set=cience&met > adataPrefix=oai_dc > > Only a resumptionToken. The clue is in "(header+,resumptionToken?)", which > is regular expression speak for match header 1 or more times, match > resumption token 0 or 1 times. > > You should be returning a noRecordsMatch error if there are no records that > match the OAI query. > > To sum-up: Every OAI response should either be an error, a complete list, or > a partial list (records+resumption token). > > All the best, > Tim. > > ----- Original Message ----- > From: "Ramon Martins Sodoma da Fonseca" > To: "Oai-Implementers (E-mail)" > Sent: Tuesday, April 22, 2003 5:52 PM > Subject: [OAI-implementers] set spect error > > > Dear fellow implementers.. > > we have been working on the protocol and doing the fine tuning (I hope) > > we had the setSpec work, but when I click on the set link on the OAI > Explorer, which I believe shows all the records of that set, it shows me the > following error > > " > Parsed Output > XML Schema Validation Error ! 
> [Error] fileS0qUiQ:10:20: The content of element type "ListIdentifiers" must > match "(header+,resumptionToken?)". > /tmp/fileS0qUiQ: 1106 ms (5 elems, 8 attrs, 9 spaces, 77 chars) > > XML Output > > > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" > xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ > http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"> > > 2003-04-22T16:43:58Z > set="cience">http://www.ibict.br/cgi-bin/ibict/oai.pl > > > completeListSize="6">cience!!!oai_dc!3 > > > > > " > > this is how it shows when asking for the listSets verb: > " > List of Sets > Click on the link to list the contents > Ciência e Informação <> > > set description: > dc: > description: Teses e Dissertações Eletrônicas sobre ciência e ciência > da informação > > Request : http://www.ibict.br/cgi-bin/ibict/oai.pl, verb=ListSets > Response Date : 2003-04-22T16:41:01Z > " > > this is the base url for testing: > http://www.ibict.br/cgi-bin/ibict/oai.pl > > any input is welcome... > > thank you > ............................................................................ > ................................................. > Ramón Martins S. da Fonseca > Desenvolvimento Web > IBICT - Instituto Brasileiro de Informação em Ciência e Tecnologia > +55 61 217 6443 / 6347 > ............................................................................ > ................................................. 
>
> _______________________________________________
> OAI-implementers mailing list
> OAI-implementers@oaisrv.nsdl.cornell.edu
> http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers
> _______________________________________________
> OAI-implementers mailing list
> List information, archives, preferences and to unsubscribe:
> http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers
>

From jozef@nl.adlibsoft.com Tue Apr 29 12:46:23 2003
From: jozef@nl.adlibsoft.com (Jozef Kruger)
Date: Tue, 29 Apr 2003 13:46:23 +0200
Subject: [OAI-implementers] Namespaces in elements
Message-ID: <4E232B133AC9F04BB194C2AE2024EF9205C20C@saturnus.nl.adlibsoft.com>

Hi everybody,

I just implemented the output of my OAI server by transforming our own XML format to, for example, Dublin Core (each supported output format having its own .xsl stylesheet). However, in the transformed XML, MSXML has replicated the Dublin Core namespace for each dc element:
Walangara
Where it was:
Walangara

My question is, is this a problem?
Hussein's repository explorer does NOT complain about this, so that made me feel a little confident.

Cheers,
Jozef Kruger (Adlib Information Systems B.V. the Netherlands)

From jozef@nl.adlibsoft.com Tue Apr 29 13:37:40 2003
From: jozef@nl.adlibsoft.com (Jozef Kruger)
Date: Tue, 29 Apr 2003 14:37:40 +0200
Subject: [OAI-implementers] Namespaces in elements
Message-ID: <4E232B133AC9F04BB194C2AE2024EF9280E496@saturnus.nl.adlibsoft.com>

Hi,

> It might be a problem. Some validating parsers (older xerces)
> didn't pick up namespaces unless they were on the root element.
> Send us your base URL and we'll check it out
> with the Open Archives validator.

Thanks, here's my url:
http://demo.adlibsoft.com/oai-scripts/wwwopac.exe

If it turns out to be a problem, does anyone know of a way to get around this?
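
One way to check whether the repeated declarations change what a namespace-aware parser sees is to parse both forms and compare the expanded element names. The sketch below does that in Python with the standard library; the record wrapper and the dc:title and dc:type elements are hypothetical stand-ins, not the elements from the original output.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

# Namespace declared once on the container element (hypothetical wrapper).
declared_once = (
    f'<record xmlns:dc="{DC}">'
    "<dc:title>Walangara</dc:title><dc:type>text</dc:type>"
    "</record>"
)

# Same namespace re-declared on every child, as described for the MSXML output.
repeated = (
    "<record>"
    f'<dc:title xmlns:dc="{DC}">Walangara</dc:title>'
    f'<dc:type xmlns:dc="{DC}">text</dc:type>'
    "</record>"
)


def qualified_names(xml_text):
    """Return (expanded tag name, text) for each child of the root element."""
    return [(child.tag, child.text) for child in ET.fromstring(xml_text)]


# Both documents expand to the same {namespace}localname pairs.
print(qualified_names(declared_once) == qualified_names(repeated))  # prints True
```

This only shows that the two forms are equivalent to a namespace-aware parser; it says nothing about older validators that expect declarations on the root element.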
Cheers, Jozef Kruger From jyoung@oclc.org Tue Apr 29 14:33:59 2003 From: jyoung@oclc.org (Young,Jeff) Date: Tue, 29 Apr 2003 09:33:59 -0400 Subject: [OAI-implementers] Namespaces in elements Message-ID: Jozef, I've encountered the same problem. If I remember the incantation properly, you need to re-specify the dc namespace in the template as you are doing the transform. For example: start creating your OAI wrapper here close your OAI wrapper here etc. I think that was the trick, but my memory may be faulty. Jeff > -----Original Message----- > From: Jozef Kruger [mailto:jozef@nl.adlibsoft.com] > Sent: Tuesday, April 29, 2003 7:46 AM > To: oai-implementers@oaisrv.nsdl.cornell.edu > Subject: [OAI-implementers] Namespaces in elements > > > Hi everybody, > > I just implemented the output of my oai server by transforming our own > xml format to for example dublin core (each supported output format > having it's own .xsl stylesheet), however, in the transformed > xml MSXML > has replicated the dublin core namespace for each dc element: > xmlns:dc="http://purl.org/dc/elements/1.1/">Walangara > Where it was: > Walangara > > My question is, is this a problem? > Hussein's repository explorer does NOT complain about this, > so that made > me feel a little confident. > > Cheers, > Jozef Kruger (Adlib Information Systems B.V. 
the Netherlands) > _______________________________________________ > OAI-implementers mailing list > List information, archives, preferences and to unsubscribe: > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > From hussein@cs.uct.ac.za Tue Apr 29 15:12:03 2003 From: hussein@cs.uct.ac.za (Hussein Suleman) Date: Tue, 29 Apr 2003 16:12:03 +0200 Subject: [OAI-implementers] Namespaces in elements References: <4E232B133AC9F04BB194C2AE2024EF9205C20C@saturnus.nl.adlibsoft.com> Message-ID: <3EAE8833.5090508@cs.uct.ac.za> hi (disclaimer: i use xsltproc and not MSXML so this 'may' not work) after looking at Jeff's comments, i wonder if maybe you did not declare both the "oai_dc" (from OAI) and "dc" (from DC) namespaces. for example, when i write stylesheets to transform into dc, i usually start with: then the actual transform starts as follows: ... hope this is useful. if you want to see some similar stylesheets, there are a few like this included as examples in the XMLFile package (linked off the OAI tools page). ttfn, ----hussein Jozef Kruger wrote: > Hi everybody, > > I just implemented the output of my oai server by transforming our own > xml format to for example dublin core (each supported output format > having it's own .xsl stylesheet), however, in the transformed xml MSXML > has replicated the dublin core namespace for each dc element: > xmlns:dc="http://purl.org/dc/elements/1.1/">Walangara > Where it was: > Walangara > > My question is, is this a problem? > Hussein's repository explorer does NOT complain about this, so that made > me feel a little confident. > > Cheers, > Jozef Kruger (Adlib Information Systems B.V. 
the Netherlands) > _______________________________________________ > OAI-implementers mailing list > List information, archives, preferences and to unsubscribe: > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > -- ===================================================================== hussein suleman ~ hussein@cs.uct.ac.za ~ http://www.husseinsspace.com ===================================================================== From jozef@nl.adlibsoft.com Tue Apr 29 15:30:54 2003 From: jozef@nl.adlibsoft.com (Jozef Kruger) Date: Tue, 29 Apr 2003 16:30:54 +0200 Subject: [OAI-implementers] Namespaces in elements Message-ID: <4E232B133AC9F04BB194C2AE2024EF9280E49B@saturnus.nl.adlibsoft.com> Hello Hussein and others, > after looking at Jeff's comments, i wonder if maybe you did > not declare both the "oai_dc" (from OAI) and "dc" (from DC) namespaces. Thanks for your tip too, it doesn't work in MSXML, it prints all namespaces that are "active", so adding the oai_dc namespace to the top only made things worse (that namespace was also added to each element). Just to let you know :) cheers, Jozef Kruger From thabing@uiuc.edu Tue Apr 29 16:14:04 2003 From: thabing@uiuc.edu (Thomas G. Habing) Date: Tue, 29 Apr 2003 10:14:04 -0500 Subject: [OAI-implementers] Namespaces in elements In-Reply-To: <4E232B133AC9F04BB194C2AE2024EF9205C20C@saturnus.nl.adlibsoft.com> References: <4E232B133AC9F04BB194C2AE2024EF9205C20C@saturnus.nl.adlibsoft.com> Message-ID: <3EAE96BC.7020104@uiuc.edu> Jozef Kruger wrote: > Hi everybody, > > I just implemented the output of my oai server by transforming our own > xml format to for example dublin core (each supported output format > having it's own .xsl stylesheet), however, in the transformed xml MSXML > has replicated the dublin core namespace for each dc element: > xmlns:dc="http://purl.org/dc/elements/1.1/">Walangara > Where it was: > Walangara > > My question is, is this a problem? 
> Hussein's repository explorer does NOT complain about this, so that made
> me feel a little confident.
>
> Cheers,
> Jozef Kruger (Adlib Information Systems B.V. the Netherlands)
> _______________________________________________
> OAI-implementers mailing list
> List information, archives, preferences and to unsubscribe:
> http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers
>
>

Hi Jozef,

Technically, from an XML standpoint, it is not a problem, but it is a terrible waste of bytes. I think I've run into this problem with MSXML previously also. If I remember correctly, I solved it by using different ways to create the output elements. In XSLT you can create output elements either by inserting the element literally into the XSLT, such as the oai_dc:dc element below:

Or, by using the xsl:element tag, as shown:

If I remember correctly, by mixing and matching which technique I used to generate the output elements, I was able to eliminate the extraneous namespace declarations. Try using the literal technique for the root element, but the xsl:element technique for the child elements. If after some experimentation this doesn't seem to work either, I could go and maybe find some of my old stylesheets to see exactly what I did.

Regards,
Tom

--
Thomas Habing
Research Programmer, Digital Library Projects
University of Illinois at Urbana-Champaign
155 Grainger Engineering Library Information Center, MC-274
thabing@uiuc.edu, (217) 244-4425
http://dli.grainger.uiuc.edu

From simeon@cs.cornell.edu Wed Apr 30 00:04:13 2003
From: simeon@cs.cornell.edu (Simeon Warner)
Date: Tue, 29 Apr 2003 19:04:13 -0400 (EDT)
Subject: [OAI-implementers] OAI Metadata Harvesting Workshop at JCDL - online participant information
Message-ID:

(apologies for duplicate posting)

I have started to collect participants' position statements and topics for discussion at the forthcoming JCDL workshop (31May2003).
They are available from: http://www.cs.cornell.edu/people/simeon/workshops/JCDL2003/positions.html Cheers, Simeon. Workshop information: http://www.cs.cornell.edu/people/simeon/workshops/JCDL2003/index.html Workshop and conference registration: http://www.rice.edu/jcdl03/registration.html From lisrr@ukoln.ac.uk Tue Apr 15 13:27:08 2003 From: lisrr@ukoln.ac.uk (Rosemary Russell) Date: Tue, 15 Apr 2003 13:27:08 +0100 (BST) Subject: [OAI-implementers] OAI Resource (fwd) In-Reply-To: Message-ID: Since I'm not a list member, a colleague copied me the email below discussing metadata for people... I recently completed a review of metadata schemas for describing people, for the UK PORTAL project, which is available from: http://www.fair-portal.hull.ac.uk/deliverables.html It includes vCard (as mentioned by Caroline below) amongst others. Rosemary > ---------- Forwarded message ---------- > Date: Mon, 14 Apr 2003 17:31:25 -0400 (EDT) > From: Caroline Arms > To: 'Venugopal R Pally' > Cc: "Young,Jeff" , Simeon Warner , > oai-implementers@oaisrv.nsdl.cornell.edu > Subject: RE: [OAI-implementers] OAI Resource > > > Venu, > > I agree with the earlier respondents. OAI-PMH is a mechanism for > exchanging (but not searching) metadata. If your local application needs > to hold and support searching for information about people, that is likely > to be outside OAI-PMH entirely. However, if you are also looking to > exchange metadata about people among applications/services, you may be > able to use OAI-PMH. > > Useful metadata about people (whether you call them authors, agents, > parties, or whatever) is going to be different from useful metadata about > document-like information resources. Even though the DCMI now says that, > 'Here an information resource is defined to be "anything that has > identity".' the original elements (used for the OAI mandatory set) were > definitely developed for "document-like" objects. 
Squeezing information > about people into an unqualified Dublin Core record is unlikely to be > useful. > > As Jeff points out, since OAI-PMH allows you to use other metadata > formats, you can use it to exchange records that describe people if the > parties involved in the exchange can agree on a format. The mandatory DC > record can be minimal, its only useful purpose being as a conduit to a > "full" record in a more appropriate schema. > > Apart from MARC Name Authority Records in the marc21 "slim" schema, I am > not familiar with an XML Schema in common use for describing people. I > just found > http://www.numerata.com/vcardschema.htm > but vCard may not have the elements that are of interest in your > application. > > There are at least two more activities that I can think of that are > looking into records for people. However, neither has reached the stage > of having a schema, as far as I know. > > 1. DCMI Agents Working Group > http://www.dublincore.org/groups/agents/ > "Agents" include Creator/Contributor (and possibly Publisher) from the > primary DC Element Set. > > 2. InterParty > http://www.interparty.org/ > The InterParty project is funded under the European Commission's > Information Society Technologies Programme (IST), to design and specify a > network to support interoperability of party identification (for both > natural and corporate names) across different domains. InterParty builds > on the work of the project, one of whose deliverables was a > specification for a Directory of Parties > [http://www.indecs.org/pdf/DirectoryofParties.pdf]. InterParty is not > proposed as a replacement for existing schemes for the identification of > participants in the intellectual property domain (e.g. national library > name authority files or systems oriented towards the needs of rights > licensing) but as a means of effecting their interoperation. 
> http://www.interparty.org/ > > If you really are looking to exchange records about people, perhaps others > on the mailing list know of projects involving appropriate schemas or > element sets. > > Caroline Arms caar@loc.gov > Office of Strategic Initiatives > Library of Congress > == > Opinions expressed are my own. > == > > > On Mon, 14 Apr 2003, Young,Jeff wrote: > > > I don't see how titles deserve to be separate resources, but I can > > sympathize with your desire to store authors as resources. For example, I > > have an old copy of the LC Name Authority File available that is accessible > > via OAI GetRecord verbs (e.g. > > http://alcme.oclc.org/laf/servlet/OAIHandler?verb=GetRecord&metadataPrefix=m > > arcxml&identifier=oai:laf.oclc.org/LCCN/n78-95332). So, you can retrieve any > > record in the file by substituting the LCCN for that person at the end of > > the URL. > > > > The biggest problem with this from OAI's point of view is that you can't > > honestly represent these records in Dublin Core (e.g. > > http://alcme.oclc.org/laf/servlet/OAIHandler?verb=GetRecord&metadataPrefix=o > > ai_dc&identifier=oai:laf.oclc.org/LCCN/n78-95332). Is "William Shakespeare" > > the dc.creator? The dc.title? Dublin Core is a bibliographic metadata > > format, and people just aren't bibliographic items. On the other hand, I > > don't claim that this repository is OAI compliant. It's just a convenient > > way to make the MARC21 XML data available to both browsers and automated > > processes. > > > > If you're really intent on creating records for people, you might consider > > doing something similar. Then, in your research records, you can create > > links from the dc.creator/dc.contributor/dc.publisher, etc, to these records > > via the available URL. > > > > This brings up another problem, though. There is no place in the Dublin Core > > schema to put these URLs. 
For example, > > > > Shakespeare, William,--1564-1616 > > > > To get around this, the ETDMS format, for example, extends the Dublin Core > > schema to include a resource attribute. > > > > Shakespeare, William... > > > > If you store your research project records this way, you can always dumb > > them down to Dublin Core by omitting the URL. > > > > If you do decide to store records for people, I'd suggest that there's no > > good reason to mix them in with your research paper records. Also keep in > > mind that various groups are dealing with schemes that will associate people > > with URIs, so in the long term, you may want to pick a solution that will > > allow you to utilize these services when they become available. > > > > Jeff > > > > > -----Original Message----- > > > From: Venugopal R Pally [mailto:pally_reddy@yahoo.com] > > > Sent: Monday, April 14, 2003 2:35 PM > > > To: Simeon Warner; oai-implementers@oaisrv.nsdl.cornell.edu > > > Subject: RE: [OAI-implementers] OAI Resource > > > > > > > > > Thank you. As you said, Could you inform me how I can > > > provide this at the service layer ? I have already > > > implemented the OAI considering these research > > > projects as Resources. But it would be of good use to > > > my organization if I can extend it to considering > > > certain other things as Resources. My initial idea was > > > to use the same oai_dc metadataformat as schema for > > > all these resources except that I will use only some > > > of those elements in metadata of these different > > > resources. For example, I need creator element of > > > oai_dc for project but I dont need that element for > > > Author etc. This way I would omit certain elements for > > > these resources. Please suggest me if this is > > > practical. > > > Thanks, > > > Venu. 
> > > > > > --- Simeon Warner wrote: > > > > > > > > I agree with Jeff and feel that overloading the > > > > selective harvesting > > > > mechanisms (sets, metadata formats) with search > > > > functionality is not the > > > > best way to approach these issues. You should either > > > > use a protocol that > > > > supports remote search, or provide that > > > > functionality at the service layer > > > > (think of the OAI repository as one layer down). > > > > > > > > Cheers, > > > > Simeon. > > > > > > > > On Mon, 14 Apr 2003, Young,Jeff wrote: > > > > > I'd say the answer is no, you don't want to do > > > > that. OAI isn't a search > > > > > protocol, it's a simple harvesting protocol. If > > > > you really do need to search > > > > > your database by these fields you will need to use > > > > a different protocol such > > > > > a Z39.50 or SRU/SRW and use it to index those > > > > fields from your research > > > > > project records. Also keep in mind that the main > > > > reason people make your > > > > > metadata records available via OAI is so others > > > > (aka service providers) can > > > > > make them useful and searchable in this way. > > > > > > > > > > Basically, it sounds like you want more > > > > functionality than OAI alone > > > > > provides. Check out EPrints or DSpace if you need > > > > a more complete archiving > > > > > solution. > > > > > > > > > > Jeff > > > > > > > > > > > -----Original Message----- > > > > > > From: Venugopal R Pally > > > > [mailto:pally_reddy@yahoo.com] > > > > > > Sent: Monday, April 14, 2003 11:50 AM > > > > > > To: oai-implementers@oaisrv.nsdl.cornell.edu > > > > > > Subject: [OAI-implementers] OAI Resource > > > > > > > > > > > > > > > > > > Hi all, > > > > > > The OAI says that 'resource' is the object or > > > > stuff > > > > > > that metadata is about. So, can resources > > > > include > > > > > > multiple types ? For example, in our case, I > > > > > > identified research projects as resources. 
But > > > > later I > > > > > > found that harvestors would like to search our > > > > archive > > > > > > based on certain other things like Author, his > > > > Papers > > > > > > etc. This would mean I should consider Authors, > > > > Paper > > > > > > titles also as resources along with research > > > > projects. > > > > > > So, when a harvestor asks for ListIdentifiers, > > > > can I > > > > > > display all of these (Research Projects, > > > > Authors, > > > > > > Paper Titles) ? Or should I use different > > > > > > metadataPrefix for different resources ? > > > > > > Thanks, > > > > > > Venu. > > > > > > > > _______________________________________________ > OAI-implementers mailing list > OAI-implementers@oaisrv.nsdl.cornell.edu > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers > > Rosemary Russell UKOLN, University of Bath, Bath BA2 7AY Tel: +44 20 8318 5576 r.russell@ukoln.ac.uk http://www.ukoln.ac.uk