[OAI-implementers] Questions about wrong output due to terrible input

Jozef Kruger jozef@nl.adlibsoft.com
Tue, 1 Apr 2003 16:46:48 +0200


Hi everyone,

Just this week someone from my firm tested my OAI implementation and
sent me a report with the results.
He gave my program some terrible input, with things like
set=non_existing_made_up_set or from=very_illegal_date.
The protocol isn't very specific about which arguments should be
returned in the header (in the request node, I mean).
What I did was return the arguments that mattered (for each
verb) just the way they came in, resulting in invalid output in these cases.
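
To make the problem concrete, here is a minimal Python sketch of the "echo everything as it came in" approach. It is illustrative only and not the implementation discussed here; the function name echo_request and the base URL are made up. The result is well-formed XML, but the from attribute carries a value that is not a date, which is presumably what makes the response fail validation.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical base URL, used only for illustration.
BASE_URL = "http://example.org/oai"

def echo_request(args):
    """Build the <request> element by copying every argument verbatim."""
    # quoteattr() escapes the value and wraps it in quotes, so the XML stays
    # well-formed even for hostile input; it does not make the value legal.
    attrs = " ".join(f"{name}={quoteattr(value)}" for name, value in args.items())
    return f"<request {attrs}>{BASE_URL}</request>"

print(echo_request({"verb": "ListRecords", "from": "very_illegal_date"}))
# <request verb="ListRecords" from="very_illegal_date">http://example.org/oai</request>
```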

Should I check each of these for illegal content? If I do
and skip the illegal ones, the output won't match the
input anymore.
You could, for example, get error code="noRecordsMatch" due to an illegal
date, but in the output you wouldn't see that date anymore.

I think the solution to this kind of problem would be a check before
sending the request to the repository. But then again, you might
still be left with illegal input, so omitting those things from the
output looks like the only right solution.
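
A minimal sketch of that "check first, omit what is illegal" idea, again in Python and purely illustrative. The names is_legal and build_request and the base URL are hypothetical, and the date check assumes the two forms YYYY-MM-DD and YYYY-MM-DDThh:mm:ssZ. It only tests the shape of the value, not whether the date exists on the calendar, and it leaves all other arguments unchecked.

```python
import re
from xml.sax.saxutils import quoteattr

# Hypothetical base URL, used only for illustration.
BASE_URL = "http://example.org/oai"

# Assumed legal date shapes: YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}Z)?")

def is_legal(name, value):
    """Shape check only; arguments other than from/until pass unchecked."""
    if name in ("from", "until"):
        return DATE_RE.fullmatch(value) is not None
    return True

def build_request(args):
    """Return the <request> element with illegal arguments left out,
    plus the sorted names of the arguments that were rejected."""
    legal = {name: value for name, value in args.items() if is_legal(name, value)}
    rejected = sorted(set(args) - set(legal))
    attrs = "".join(f" {name}={quoteattr(value)}" for name, value in legal.items())
    return f"<request{attrs}>{BASE_URL}</request>", rejected

xml, rejected = build_request(
    {"verb": "ListRecords", "metadataPrefix": "oai_dc", "from": "very_illegal_date"}
)
print(xml)       # the request element without the from attribute
print(rejected)  # ['from']
```

The list of rejected names is returned separately so the caller can still report which arguments were dropped; which error code to send for them is a separate decision.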

Any thoughts on these matters?

Cheers,
Jozef Kruger (Adlib Information Systems B.V. the Netherlands)
