[OAI-implementers] RE: [Dspace-tech] OAI validation - withdrawn items, new dc types

Tansley, Robert robert.tansley at hp.com
Tue Nov 8 20:06:04 EST 2005


Hi Liam,

Actually it's a weird corner case where it's not clear to me what the
'right' thing to do is.  It could be that the repository explorer is
wrong, or DSpace.

Basically
http://demo.openrepository.com/demo-oai/request?verb=ListIdentifiers&metadataPrefix=oai_dc
asks for all records which have oai_dc metadata.  You
*have* to ask for a particular metadata prefix with ListIdentifiers.
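
To make the metadataPrefix requirement concrete, here is a minimal Python sketch of building such a request URL. The helper name list_identifiers_url is made up for illustration and is not part of DSpace or any OAI library. The only exception to the rule is a follow-up request carrying a resumptionToken, which the protocol treats as an exclusive argument.

```python
from urllib.parse import urlencode


def list_identifiers_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build a ListIdentifiers request URL (hypothetical helper)."""
    if resumption_token is not None:
        # resumptionToken is exclusive: it must not be combined with metadataPrefix.
        params = {"verb": "ListIdentifiers", "resumptionToken": resumption_token}
    elif metadata_prefix:
        params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    else:
        # An initial ListIdentifiers request without a prefix is a badArgument.
        raise ValueError("ListIdentifiers requires a metadataPrefix")
    return base_url + "?" + urlencode(params)


url = list_identifiers_url(
    "http://demo.openrepository.com/demo-oai/request", "oai_dc"
)
```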

But when you actually ask for the oai_dc metadata on a deleted item,
DSpace reports there is no oai_dc metadata for that item.  Because there
isn't.  Because it's been deleted.

However, if DSpace simply didn't include <header status="deleted"> for
that item in the ListIdentifiers response, harvesters would never know
the record had actually been deleted.
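
As a rough illustration of the harvester side, the Python sketch below splits the headers of a ListIdentifiers response into live and deleted identifiers, so that a client could avoid asking ListMetadataFormats about a deleted item. The function name split_headers and the sample document are assumptions for illustration; the second header's datestamp is made up, and this is not how the repository explorer actually works.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def split_headers(list_identifiers_xml):
    """Return (live_ids, deleted_ids) from a ListIdentifiers response."""
    root = ET.fromstring(list_identifiers_xml)
    live, deleted = [], []
    for header in root.iter(OAI_NS + "header"):
        identifier = header.findtext(OAI_NS + "identifier")
        # A deleted record is flagged by status="deleted" on its header.
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            live.append(identifier)
    return live, deleted


# Illustrative response; the second datestamp is invented for the example.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header status="deleted">
      <identifier>oai:demo.openrepository.com:123456789/9</identifier>
      <datestamp>2005-05-19T10:38:31Z</datestamp>
    </header>
    <header>
      <identifier>oai:demo.openrepository.com:123456789/224</identifier>
      <datestamp>2005-05-19T10:38:31Z</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""

live_ids, deleted_ids = split_headers(SAMPLE)
```

A harvester following this approach would remove the identifiers in deleted_ids from its local copy and issue further per-item requests only for those in live_ids.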

DSpace could simply claim to have oai_dc metadata for deleted items in a
ListMetadataFormats response, but this doesn't seem right.

Alternatively, it could give an 'idDoesNotExist' error code, but this
doesn't feel right either.  ListMetadataFormats doesn't provide any
means to indicate something like status="deleted".

I've CC'd this to oai-implementers, in case anyone there can give us
some pointers -- what's the appropriate behaviour?

 Robert TANSLEY / HP Labs / MIT Visiting Researcher
 http://www.hpl.hp.com/personal/Robert_Tansley/

> -----Original Message-----
> From: dspace-tech-admin at lists.sourceforge.net 
> [mailto:dspace-tech-admin at lists.sourceforge.net] On Behalf Of 
> Liam Lynch
> Sent: 03 November 2005 11:26
> To: Dspace-Tech (E-mail)
> Subject: [Dspace-tech] OAI validation - withdrawn items, new dc types
> 
> Hi all -
> 
> Just testing out our OAI-PMH capabilities on a demo 
> repository using the OAI repository explorer (i.e. using this 
> http://re.cs.uct.ac.za/ ) and I have a couple of validation 
> errors. I've searched back in the lists and haven't found 
> anything obvious about these issues, so wondering if anyone 
> can help ....
> 
> One problem seems to relate to this feature of DSpace -
> 
> "DSpace's OAI service does support the exposing of deletion 
> information for withdrawn items, but not for items that are 
> 'expunged' (see above 
> <http://www.dspace.org/technology/system-docs/functional.html#deletions>)."
> 
> If you look at this xml from a ListIdentifiers request, 
> you'll notice how the first item is deleted (i.e. withdrawn) -
> http://demo.openrepository.com/demo-oai/request?verb=ListIdentifiers&metadataPrefix=oai_dc
> 
> i.e. this bit -
> 
> <header status="deleted">
> <identifier>oai:demo.openrepository.com:123456789/9</identifier>
> <datestamp>2005-05-19T10:38:31Z</datestamp>
> </header>
> 
> The OAI explorer test uses this particular item for a 
> ListMetadataFormats request - and it doesn't like what it gets back -
> 
> (22) Testing : ListMetadataFormats (identifier)
> URL : 
> http://demo.openrepository.com/demo-oai/request?verb=ListMetadataFormats&identifier=oai:demo.openrepository.com:123456789/9
> ------ Start of XML Response ------
> <?xml version="1.0" encoding="UTF-8" ?>
> <OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>     xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
>     http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
> <responseDate>2005-11-03T14:28:13Z</responseDate>
> <request identifier="oai:demo.openrepository.com:123456789/9"
>     verb="ListMetadataFormats">http://demo.openrepository.com/demo-oai/request</request>
> <error code="noMetadataFormats">There are no metadata formats available for the specified item.</error>
> </OAI-PMH>
> ------- End of XML Response -------
> Test Result : FAIL!
> **** [ERROR] Error tag found but not expected : noMetadataFormats
> 
> 
> So what do you reckon - is it that it doesn't understand what 
> status="deleted" means, and that if it did it wouldn't try 
> this test? Is this a problem with the xml DSpace produces or 
> more with the OAI explorer testing utility?
> 
> 
> The next problem relates to these new DC types we've added - 
> basically, it doesn't like 'em (see below). And there's no 
> reason why it should, I guess - it's got its own schema, it 
> knows what it likes. So as we've customised our repository to 
> add some dc types, we need to change the OAIDCCrosswalk class 
> to not put these in the oai_dc XML, right?  That's fine, but 
> I guess it would be better if only dc fields that would be 
> accepted by the OAI_DC schema are actually put in generally - 
> i.e. if it only puts in elements that are in [title, creator, 
> subject, description, publisher, contributor, date, type, 
> format, identifier, source, language, relation, coverage, rights]. If 
> so no extra effort would be needed when new dc types are 
> added.  Would it make sense to change OAIDCCrosswalk to do this?
> 
> Any thoughts much appreciated.
> 
> Cheers,
> Liam
> 
> validation message -
> 
> (41) Testing : GetRecord (identifier, oai_dc)
> URL : 
> http://demo.openrepository.com/demo-oai/request?verb=GetRecord&identifier=oai:demo.openrepository.com:123456789/224&metadataPrefix=oai_dc
> ------ Response from Xerces Schema Validation ------
> [Error] re.0NiDLV:1:3495: cvc-complex-type.2.4.a: Invalid 
> content was found starting with element 'dc:entrez'. One of 
> '{"http://purl.org/dc/elements/1.1/":title, 
> "http://purl.org/dc/elements/1.1/":creator, 
> "http://purl.org/dc/elements/1.1/":subject, 
> "http://purl.org/dc/elements/1.1/":description, 
> "http://purl.org/dc/elements/1.1/":publisher, 
> "http://purl.org/dc/elements/1.1/":contributor, 
> "http://purl.org/dc/elements/1.1/":date, 
> "http://purl.org/dc/elements/1.1/":type, 
> "http://purl.org/dc/elements/1.1/":format, 
> "http://purl.org/dc/elements/1.1/":identifier, 
> "http://purl.org/dc/elements/1.1/":source, 
> "http://purl.org/dc/elements/1.1/":language, 
> "http://purl.org/dc/elements/1.1/":relation, 
> "http://purl.org/dc/elements/1.1/":coverage, 
> "http://purl.org/dc/elements/1.1/":rights}' is expected.
> /tmp/re.0NiDLV: 777;11;0 ms (35 elems, 10 attrs, 0 spaces, 2388 chars)
> ------- End of Xerces Schema Validation Report  -------
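
The whitelist idea proposed above can be sketched in a few lines. This is a Python illustration of the filtering logic only, not DSpace's OAIDCCrosswalk (which is a Java class). The function name filter_for_oai_dc and the sample field values are made up; the element names are the fifteen listed in the Xerces message.

```python
# The fifteen unqualified Dublin Core elements that the oai_dc schema accepts,
# as listed in the Xerces validation message.
OAI_DC_ELEMENTS = frozenset([
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
])


def filter_for_oai_dc(fields):
    """Keep only the (element, value) pairs whose element oai_dc allows."""
    return [(element, value) for element, value in fields
            if element in OAI_DC_ELEMENTS]


# Illustrative item metadata; 'entrez' is the locally added element
# that caused the validation error.
item_fields = [
    ("title", "An example title"),
    ("entrez", "12345"),
    ("language", "en"),
]

exported = filter_for_oai_dc(item_fields)
```

With a filter like this, locally added elements such as entrez are silently dropped from the oai_dc output, so adding new DC types would need no further change to the crosswalk.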


