[UPS] RE: Further issues with Dienst software and Open Archives

Carl Lagoze lagoze@cs.cornell.edu
Tue, 23 May 2000 05:53:48 -0400


Hi Rob,  I believe David Fielding is looking into this?  Let me know if not.

Looks like we'll have some interesting talks in San Antonio next week.

Regards,

Carl

> -----Original Message-----
> From: Robert Tansley [mailto:rht96r@ecs.soton.ac.uk]
> Sent: Friday, May 19, 2000 11:05 AM
> To: lagoze@CS.Cornell.EDU; help@ncstrl.org
> Cc: OA discussion list
> Subject: Further issues with Dienst software and Open Archives
> 
> 
> Hi, an even more technical mail this time. (I hope these are appropriate
> addresses to send this to; if not, please point me in the right
> direction.)
> 
> Working on adding Open Archives support to CogPrints, I've discovered an
> issue with the latest version of CGI.pm and the dienst code. As of version
> 2.64, the CGI.pm module by default uses "new style" URLs, in which the
> keyword/value query string is passed to the application delimited by
> semicolons ";" instead of ampersands "&". E.g. if I send a request:
> 
> Dienst/Repository/4.0/List-Contents?file-after=2000-01-01&meta-format=oams
> 
> The dienst software, from CGI::query_string(), receives the arguments as:
> 
> file-after=2000-01-01;meta-format=oams
> 
> so parsing this string later on produces duff results. (IMO it's a rather
> dodgy practice for the CGI.pm team to change default behaviour like this
> in a .01 revision.) This particular problem can be worked round by
> changing the line where CGI.pm is included in dienst_src/Main/dienst.pl
> from:
> 
> use CGI qw(:standard);
> 
> to:
> 
> use CGI qw(:standard -oldstyle_urls);
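To make the failure concrete, here is a minimal sketch in Python (not the dienst Perl code) of a parser that treats only "&" as a delimiter, which is what the behaviour described above suggests dienst does. The helper name split_on_ampersand is hypothetical.

```python
def split_on_ampersand(query):
    """Split a query string into a dict, treating only '&' as a delimiter."""
    pairs = {}
    for part in query.split("&"):
        # Everything after the first '=' is taken as the value.
        key, _, value = part.partition("=")
        pairs[key] = value
    return pairs


# Old-style string, delimited by '&': two separate arguments come out.
old_style = split_on_ampersand("file-after=2000-01-01&meta-format=oams")

# New-style string, delimited by ';': the second argument is swallowed
# into the value of the first one.
new_style = split_on_ampersand("file-after=2000-01-01;meta-format=oams")
```

With the new-style string the parser sees a single argument, file-after, whose value still contains ";meta-format=oams".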
> 
> However, this introduces another problem when using partitionspecs. If I
> send a request like:
> 
> Dienst/Repository/4.0/List-Contents
>      ?partitionspec=physics;hep&file-after=2000-01-01
> 
> CGI.pm now gives the query string to dienst as:
> 
> partitionspec=physics&hep&file-after=2000-01-01
> 
> so it's changing the semicolon in the partitionspec into an &. I tried
> URL-encoding the ; (which sounds like good practice anyway) but CGI
> doesn't decode the ; in this case, so dienst gets:
> 
> partitionspec=physics%3Bhep&file-after=2000-01-01
> 
> I can quite easily fix it so the CogPrints code can decode the string,
> but with interoperability it takes two to tango; anything making a Dienst
> request to CogPrints will have to know to encode the ';'. In the Dienst
> protocol specification (either
> http://www.cs.cornell.edu/cdlrg/dienst/protocols/DienstProtocol.htm or
> http://www.cs.cornell.edu/cdlrg/dienst/protocols/OpenArchivesDienst.htm),
> the example List-Contents request doesn't seem to have an encoded ';',
> even though earlier in the document ';' is listed as a character that
> requires encoding. So what is the policy on this? Should the ';' be
> encoded, in which case the specification document needs to be amended to
> reflect this, or should it be left unencoded, in which case the dienst
> code needs changing if it is to work with recent versions of CGI.pm?
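If the answer is that the ';' should be encoded, both ends have to agree on it. A minimal sketch of that round trip in Python, assuming the receiver splits on "&" first and only then decodes each key and value; build_query and parse_query are hypothetical helpers, not part of dienst or CogPrints.

```python
from urllib.parse import quote, unquote


def build_query(params):
    """Client side: percent-encode each value, then join with '&'."""
    # safe='' means ';' (and every other reserved character) is encoded.
    return "&".join(f"{key}={quote(value, safe='')}" for key, value in params)


def parse_query(query):
    """Server side: split on '&' first, then decode each key and value."""
    result = {}
    for part in query.split("&"):
        key, _, value = part.partition("=")
        result[unquote(key)] = unquote(value)
    return result


request = build_query([("partitionspec", "physics;hep"),
                       ("file-after", "2000-01-01")])
arguments = parse_query(request)
```

Because decoding happens after the split, the ';' inside the partitionspec value can never be mistaken for an argument delimiter.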

> I also note that in the examples in both Dienst protocol specification
> documents, the Disseminate verb:
> 
> Dienst/Repository/1.0/Disseminate/handlecorp/970101/%23oams/xml
> 
> doesn't require the encoding of the / in the full ID "handlecorp/970101",
> but does require it for the # ("%23oams"). (I even came across the big
> kludge in the dienst code to handle this case!) Requiring some special
> characters to be encoded but others to be left unencoded seems to be an
> inconsistency in the protocol that needs clearing up.
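The two possible conventions for the Disseminate path can be compared side by side. This is a sketch in Python only; disseminate_path and the argument names part and fmt are hypothetical labels for the segments in the example above, not terms taken from the specification.

```python
from urllib.parse import quote

PREFIX = "Dienst/Repository/1.0/Disseminate"


def disseminate_path(full_id, part, fmt, safe):
    """Build a Disseminate path, leaving the characters in `safe` unencoded."""
    segments = [quote(segment, safe=safe) for segment in (full_id, part, fmt)]
    return "/".join([PREFIX] + segments)


# Convention in the specification's example: '/' left alone, '#' encoded.
spec_style = disseminate_path("handlecorp/970101", "#oams", "xml", safe="/")

# Uniform convention: every reserved character inside a segment is encoded.
uniform_style = disseminate_path("handlecorp/970101", "#oams", "xml", safe="")
```

One practical difference between the two characters: an unencoded '#' starts a URL fragment, which a client normally does not send to the server at all, whereas an unencoded '/' is an ordinary path separator. That may be why the example treats them differently, but the specification would still need to say so.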

> R
> 
> -- 
>  Robert Tansley                    Tel: +44 (0) 23 80594492
>  Multimedia Research Group         Fax: +44 (0) 23 80592865
>  Electronics & Computer Science    http://www.ecs.soton.ac.uk/~rht96r/
>  University of Southampton
>  Southampton SO17 1BJ, UK