[OAI-general] Search Engine Coverage of the OAI-PMH Corpus

Frank McCown fmccown at cs.odu.edu
Wed Mar 8 12:16:09 EST 2006


We have just published an article that many of you may find interesting:

Frank McCown, Xiaoming Liu, Michael L. Nelson, and Mohammed Zubair. 
Search Engine Coverage of the OAI-PMH Corpus. IEEE Internet Computing, 
March/April 2006, Vol. 10, No. 2, pp. 66-73.

http://doi.ieeecomputersociety.org/10.1109/MIC.2006.41

You may access the technical report at

http://library.lanl.gov/cgi-bin/getfile?LA-UR-05-9158.pdf


Abstract:

Having indexed much of the "surface" Web, search engines are now using 
various approaches to index the "deep" Web. At the same time, 
institutional repositories and digital libraries are adopting the Open 
Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to expose 
their holdings. The authors harvested nearly 10 million records from 
OAI-PMH repositories. From these records, they extracted 3.3 million 
unique resource URLs and then conducted searches on samples from this 
collection to determine how much of the OAI-PMH corpus the three major 
search engines have indexed.


-- 
Frank McCown
Old Dominion University
http://www.cs.odu.edu/~fmccown/


