[OAI-implementers] New OAI crawl results in progress...
Sat, 2 Mar 2002 15:07:22 +1100
I have started up my new OAI crawler which does a few more sanity
checks on data formats etc., automatically disables crawls from
bad sites, and so on. The results page has moved (but is linked
from the old manually generated page). The new page is:
If your site has been disabled, it's because my end thinks you did
something wrong. If you click on the link for your repository name,
it will take you to a log of status messages and errors.
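
To make the disable-and-log behaviour concrete, here is a rough sketch of
how a crawler might track it. This is not the actual crawler code: the
SiteStatus class, the record() method and the max_failures threshold are
all made up for illustration, and the real rules for disabling a site may
well be different.

```python
from dataclasses import dataclass, field


@dataclass
class SiteStatus:
    """Per-repository crawl state (hypothetical, for illustration only)."""
    name: str
    max_failures: int = 3          # assumed threshold, not the real one
    failures: int = 0              # consecutive failed checks so far
    enabled: bool = True
    log: list = field(default_factory=list)

    def record(self, ok, message):
        """Log one check result and disable the site after repeated errors."""
        self.log.append(("OK" if ok else "ERROR", message))
        if ok:
            # A good response resets the run of consecutive failures.
            self.failures = 0
            return
        self.failures += 1
        if self.enabled and self.failures >= self.max_failures:
            self.enabled = False
            self.log.append(("STATUS", "crawling disabled"))


if __name__ == "__main__":
    site = SiteStatus("example-repo", max_failures=2)
    site.record(False, "response was not well-formed XML")
    site.record(False, "datestamp not in the expected format")
    for level, text in site.log:
        print(level, text)
```

The log list plays the role of the per-repository page of status messages
and errors, and a disabled site is simply one whose enabled flag is false.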
At present, I regenerate the HTML pages using a script I run by
hand so the results might not be 100% up to date when you view
them. But they should be pretty accurate.
I am now doing things like correctly supporting aggregators
(sites where records coming back do not belong to that site).
For example, I think the 'anu' site returns 'caltechEERL'
records - even though I don't have any details about such a site.
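
One way to spot an aggregator is sketched below. It assumes the record
identifiers follow the optional 'oai:<repository-id>:<local-id>' naming
convention, which not every site uses, so treat it as an illustration
rather than what my crawler actually does. The function names and the
sample identifiers are invented for the example.

```python
from collections import Counter


def repository_of(identifier):
    """Return the repository part of 'oai:<repo>:<local-id>', else None."""
    parts = identifier.split(":", 2)
    if len(parts) == 3 and parts[0] == "oai" and parts[1] and parts[2]:
        return parts[1]
    return None  # identifier does not follow the assumed convention


def count_record_origins(identifiers):
    """Count how many records claim to come from each repository."""
    return Counter(repository_of(i) for i in identifiers)


def foreign_repositories(site_id, identifiers):
    """List repository ids seen in the records that are not the crawled site."""
    origins = count_record_origins(identifiers)
    return sorted(r for r in origins if r is not None and r != site_id)


if __name__ == "__main__":
    # Made-up identifiers, only to show the shape of the check.
    harvested = [
        "oai:anu:record-1",
        "oai:caltechEERL:record-a",
        "oai:caltechEERL:record-b",
    ]
    print(foreign_repositories("anu", harvested))
```

A non-empty result suggests the site is acting as an aggregator, so the
foreign records should be filed under the repository they name rather than
under the site they were harvested from.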
After the crawler is a bit more robust, I may move the whole database
outside our firewall so others can query it.
Alan Kent (mailto:firstname.lastname@example.org, http://www.mds.rmit.edu.au)
Postal: Multimedia Database Systems, RMIT, GPO Box 2476V, Melbourne 3001.
Where: RMIT MDS, Bld 91, Level 3, 110 Victoria St, Carlton 3053, VIC Australia.
Phone: +61 3 9925 4114 Reception: +61 3 9925 4099 Fax: +61 3 9925 4098