[OAI-implementers] OAI-PMH & SOAP

Tim Brody tim@tim.brody.btinternet.co.uk
Mon, 4 Feb 2002 11:26:50 -0000


Hi,

(I agree with you that the current resumptionToken method is inelegant - it
is the most complex part to implement on the repository-side)

Your suggestion still doesn't protect the client, as the server could send
back as many records as it likes (if the server can send fewer than the
maximum number of records, what's to say a badly behaved server won't send
back many more than that?).

In the end it comes down to either ensuring the server is well-behaved (in
the current OAI sense, manually checking for resumptionTokens), or putting
some protection into the XML parsing/SAX layer.
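
As a rough illustration of the second option, here is a minimal sketch of a
client that counts records while stream-parsing and gives up once a limit is
exceeded. The element names (ListRecords, record, identifier) are simplified
and namespace-free, and read_identifiers, TooManyRecords and the sample
response are hypothetical names made up for this example, not part of any
OAI library.

```python
import io
import xml.etree.ElementTree as ET


class TooManyRecords(Exception):
    """Raised when a response carries more records than the client allows."""


def read_identifiers(stream, max_records):
    """Stream-parse a response, stopping once max_records is exceeded."""
    identifiers = []
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag != "record":
            continue
        if len(identifiers) >= max_records:
            # Stop reading; the rest of the response is never parsed.
            raise TooManyRecords(f"more than {max_records} records received")
        identifiers.append(elem.findtext("identifier"))
        elem.clear()  # drop the record's children once they have been read
    return identifiers


# Hypothetical, namespace-free response for illustration only.
SAMPLE = b"<ListRecords>" + b"".join(
    b"<record><identifier>oai:example:%d</identifier></record>" % i
    for i in range(3)
) + b"</ListRecords>"

print(read_identifiers(io.BytesIO(SAMPLE), max_records=5))
```

Note that the guard can only abort the response; it cannot ask the server to
resume from where the limit was hit.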

I would be interested to know how resumptionTokens can be avoided, as a
resumptionToken is not only flow control (which can be replaced by
start-maxrows requests) but also state information (i.e. which records are
to be returned). If different sections of the same query are requested
without state being maintained, surely there is a risk that some records
may be missed in the overlap? (Or are you presuming that all records are
added, and returned, sequentially?)
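
To make the concern concrete, here is a small simulation of stateless
start-maxrows paging over a result set that changes between requests. It is
only a sketch under assumed conditions: the repository is modelled as a list
of identifiers returned in sorted order, and fetch_page and harvest are
hypothetical helpers, not anything defined by OAI.

```python
def fetch_page(records, start, max_rows):
    """Stateless request: return up to max_rows records from offset start."""
    return records[start:start + max_rows]


def harvest(snapshots, max_rows):
    """Page through the repository; snapshots[i] is its state at request i."""
    seen = []
    start = 0
    for repo in snapshots:
        seen.extend(fetch_page(sorted(repo), start, max_rows))
        start += max_rows
    return seen


# Record "b" is added after the first request has been answered.
snapshots = [
    ["a", "c", "d", "e"],
    ["a", "b", "c", "d", "e"],
    ["a", "b", "c", "d", "e"],
]
print(harvest(snapshots, max_rows=2))  # ['a', 'c', 'c', 'd', 'e']
```

Here "b" is never harvested and "c" is returned twice. If new records were
only ever appended at the end of the ordering, the offsets would stay valid.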

All the best,
Tim Brody

----- Original Message -----
From: "Walter Underwood" <wunder@inktomi.com>
To: <oai-implementers@oaisrv.nsdl.cornell.edu>
Sent: Monday, February 04, 2002 4:21 AM
Subject: Re: [OAI-implementers] OAI-PMH & SOAP


> --On Sunday, February 3, 2002 7:20 PM -0500 Hussein Suleman
> <hussein@vt.edu> wrote:
> >
> > [....] the client, on the other hand, can choose to terminate a
> > connection if the server sends too much data (the client is therefore
> > safe). in fact the client should always be safe by virtue of the fact
> > that the protocol is almost stateless and strictly client-initiated.
>
> First, "almost stateless" is the same as "stateful".
>
> Second, the guts of the XML parser or reader is a terrible place
> to implement a limit on the number of records a protocol response
> should return. And aborting the session is a poor way to limit the
> number of records. It turns a limit into an unrecoverable error.
>
> Layered protocol design is over twenty years old. This is pretty
> serious layer-crossing.
>
> I did think of this solution when I first brought this up, but
> that is not a way to get interoperability. It is a defense against
> malicious servers, which is important, but a different problem.
>
> Other protocols send a request with the start number and the
> number to return, and the reply has up to that number of records.
> Clear and stateless, with no worries about what to do when the
> resumption token times out because the browser comes back after
> an hour or a day. Or never comes back at all because they crashed
> or got bored. The resumption token implies that state will be
> kept when it is not needed, and it requires a periodic cleanup
> to get rid of abandoned state. Extra complexity which is really
> unnecessary.
>
> wunder
> --
> Walter R. Underwood
> Senior Staff Engineer
> Inktomi Enterprise Search
> http://search.inktomi.com/
> _______________________________________________
> OAI-implementers mailing list
> OAI-implementers@oaisrv.nsdl.cornell.edu
> http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers