[OAI-implementers] OAI-PMH & SOAP

zubair@cs.odu.edu
Sun, 3 Feb 2002 10:16:11 -0500

I must admit that some of the points Walter raises make sense to me,
including his points about the process. I am also excited to learn that
there is some interest in OAI in the commercial sector, which adds
credibility to our mission. My personal opinion is that in the future
(I hope there is one!) we should have some representation from industry
(people like Walter) on the technical committee, and the workings of the
technical committee should be more open.


From:     Walter Underwood <wunder@inktomi.com>
Sent by:  oai-implementers-admin@oaisrv.nsdl.cornell.edu
Date:     02/02/2002 07:34 PM
To:       <oai-implementers@oaisrv.nsdl.cornell.edu>
cc:
Subject:  Re: [OAI-implementers] OAI-PMH & SOAP

--On Saturday, February 2, 2002 4:45 PM -0500 Hussein Suleman
<hussein@vt.edu> wrote:
> - OAI-PMH is not for everyone ... if we generalize it to serve the needs
> of every community it will not be as useful for the purposes for which
> it was intended originally (namely, high-quality metadata transfer among
> digital library systems)

I want that library information to be easily available to all,
not just people willing to run a library-only protocol. I'm not
trying to make libraries different, or change the OAI goal.

With a SOAP protocol, any scripted web page can make a call to OAI.
Servers like DP9 and the repository explorer become very easy to
write. A professor's list of publications could be built from
the eprint data.
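
As a rough illustration of the point about scripted pages, here is a minimal
sketch of the request such a script might build. The envelope namespace is the
standard SOAP 1.1 one; everything else (the `urn:example:oai-soap` namespace,
the element names, and the `build_list_records_request` helper) is a
hypothetical stand-in, since no SOAP binding for OAI has been defined.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace for an OAI SOAP binding (assumption, not a real spec).
OAI_NS = "urn:example:oai-soap"


def build_list_records_request(metadata_prefix, set_spec=None):
    """Return a SOAP 1.1 envelope (as text) for a hypothetical ListRecords call."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{OAI_NS}}}ListRecords")
    # Argument names mirror the existing OAI request arguments.
    ET.SubElement(call, f"{{{OAI_NS}}}metadataPrefix").text = metadata_prefix
    if set_spec is not None:
        ET.SubElement(call, f"{{{OAI_NS}}}set").text = set_spec
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(build_list_records_request("oai_dc", set_spec="physics"))
```

A page script would POST this text to the repository and pick the records out
of the response body; any SOAP toolkit would generate the same envelope from
an interface description.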

Many of our customers are libraries, or have libraries of valuable
docs. Pharmaceutical and financial companies would love to have a
protocol like this. Customers regularly ask us how to deal with
metadata stored separately from documents.

> - the primary users of OAI will not be "harvesters" (in the crawler
> ... OAI is specifically NOT trying to create a better Google ... OAI-PMH
> is aimed at high-quality metadata transfer among managed digital

Well, Inktomi is a better Google, but that is a different issue.

PMH seems aimed at batch transfers between WAIS/Harvest-style systems.
Modern spiders stopped doing that five years ago. We know a lot more
now. Modern spiders do incremental fetches, adaptive revisits, duplicate
detection, authentication, session cookies, etc.

We can share that experience.

For example, the current approach to lists (ListRecords) allows
a big server to accidentally mount a denial-of-service attack on a
client. All it has to do is return 1 million records of 1 KB each
(about 1 GB in a single response), and watch the client die. That is bad.

In a safe list protocol, the client requests a number of results,
and the server is allowed to return fewer. That way, both sides are
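
The bounded-list idea described above can be sketched as follows. This is an
illustration under stated assumptions, not part of OAI-PMH: the names
`list_records`, `harvest`, `Page`, and the `SERVER_MAX_PAGE` cap are all
hypothetical, and an in-memory list stands in for the repository.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional

SERVER_MAX_PAGE = 100  # assumed server-side cap on records per response


@dataclass
class Page:
    records: List[str]
    next_offset: Optional[int]  # None once the list is exhausted


def list_records(store: List[str], offset: int, requested: int) -> Page:
    """Server side: return at most `requested` records, possibly fewer."""
    if requested < 1:
        raise ValueError("client must request at least one record")
    limit = min(requested, SERVER_MAX_PAGE)  # server may lower the count
    chunk = store[offset:offset + limit]
    end = offset + len(chunk)
    return Page(chunk, end if end < len(store) else None)


def harvest(store: List[str], page_size: int) -> Iterator[str]:
    """Client side: never accept more than `page_size` records per call."""
    offset: Optional[int] = 0
    while offset is not None:
        page = list_records(store, offset, page_size)
        if len(page.records) > page_size:
            raise RuntimeError("server returned more than was requested")
        yield from page.records
        offset = page.next_offset
```

With this shape the client bounds its own memory use through `page_size`, and
the server bounds its work per request through its own cap, so neither side
can be handed an arbitrarily large response.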

> ... i cannot say "aye" or "nay" to SOAP until i have tested it and i
> think it is reasonable to expect the same of everyone else.

Or maybe not reasonable. The ayes include: Microsoft, IBM, Sun, Apple,
Oracle, HP, Compaq, SAP, IONA, and so on. The Apache project has two
free implementations.

OAI is already using an XML RPC. Switching from a non-standard XML
RPC to a standard one should be an obvious decision.

Frankly, the only drawback to SOAP is that the interface definition
language, WSDL, is really ugly.

But go ahead and read the SOAP spec. It is rather clear and short
as these things go:


Walter R. Underwood
Senior Staff Engineer
Inktomi Enterprise Search