These are excellent suggestions, and ones that I'm sure Xiaoming Liu can
easily add.

But since you're on the line, I have some questions for you ;-)

1.  Do you have an official or personal opinion that you can share about
OAI & spidering?  

2.  DP9 is great for spiders that don't know any better, but what are the
chances of "OAI-aware" spiders?  Or is that such a special case that it's
not worth accounting for?

Specifically, I maintain http://naca.larc.nasa.gov/.  Spiders are
frequently churning around in the tens of thousands of possible pages
there.  Of course, this is a good substitute:


but even better would be a spider that knew to use:




On Thu, 24 Jan 2002, Walter Underwood wrote:

> As a spider engineer, I'd like to suggest an improvement to DP9.
> I'm sending this to the whole OAI list partly to introduce myself,
> and partly because it is an interesting omission in DP9.
> DP9 should use HTML metadata standards to present the Dublin Core
> metadata. Right now, it prettyprints the info, but that is not
> useful for a spider. 
> In addition to the pretty representation, the generated HTML should
> include meta tags for each DC element. I'd recommend also using
> native HTML/HTTP standards for a couple of the elements:
>    dc.title:Hamlet --> <title>Hamlet</title>
>    dc.language:en  --> <meta http-equiv="content-language" content="en">
> Our engine (Inktomi Enterprise Search) will use that metadata for
> the information presented in the results page. In addition, the
> engine can be configured to use DC.identifier as the URL which is
> presented with the results.
> Finally, if there are browsable index pages with links to the 
> generated GetRecord pages, those should probably include a
> noindex robots meta tag. Lists of URLs are usually not very
> useful search results. They are excellent roots (start pages)
> for spidering, though.
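
To make the mapping Walter describes concrete, here is a minimal sketch
of how a gateway might emit those head tags. It is not DP9's actual code:
the function name dc_head_tags, the "DC.<element>" meta naming, and the
"noindex,follow" value for index pages are all my assumptions.

```python
from html import escape


def dc_head_tags(record, noindex=False):
    """Render Dublin Core fields as HTML <head> tags (hypothetical helper).

    `record` maps DC element names (e.g. "title") to a list of values.
    """
    lines = []
    title_done = False
    for element, values in record.items():
        for value in values:
            text = escape(value, quote=True)
            # Native HTML equivalents suggested in the quoted message.
            if element == "title" and not title_done:
                lines.append(f"<title>{text}</title>")
                title_done = True
            elif element == "language":
                lines.append(
                    f'<meta http-equiv="content-language" content="{text}">'
                )
            # Assumed convention: one DC.* meta tag per element value.
            name = escape(element, quote=True)
            lines.append(f'<meta name="DC.{name}" content="{text}">')
    if noindex:
        # For browsable index pages: keep out of results, still follow links.
        lines.append('<meta name="robots" content="noindex,follow">')
    return "\n".join(lines)


if __name__ == "__main__":
    print(dc_head_tags({"title": ["Hamlet"], "language": ["en"]}))
```

A generated GetRecord page would call it with noindex left off; a
browsable index page would pass noindex=True so that it still serves as
a start page for spidering.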
