[OAI-implementers] TLDP document repository

Emma Jane Hogbin emmajane@xtrinsic.com
Fri, 23 Jan 2004 11:53:08 -0500


Hi,

I'm a volunteer with The Linux Documentation Project <www.tldp.org>. We
currently host roughly 200 books and articles about Linux, many of which
are used as classroom textbooks and by system administrators, regular
people, etc. :) In some cases the LDP documents are the official
documentation for specific open source projects. The documents are stored
in DocBook XML, DocBook SGML, or LinuxDoc (which is SGML).
The LDP publishes PS, text, HTML and PDF versions of the documents --
the source XML/SGML files are also available to anyone who would like them.

I think it would make sense for the LDP to submit its repository to the
OAI. The first step will be to get our meta-data in order (and make sure
all of the documents validate, which they /should/). The following is the
proposed list of elements (from the DocBook DTD) which will be required
for all publications:

- title
- authorgroup or author (or corpauthor for organizations)
- pubdate in the format of YYYY-MM-DD (ISO 8601, the ISO standard for dates)
- revhistory including at least one revision with: 
	<revision>
	<revnumber></revnumber>
	<date></date>
	<authorinitials></authorinitials>
	<revremark></revremark>
	</revision>
- legalnotice and/or license (License is REQUIRED and must be one of GFDL,
  Creative Commons, or LDP License)
- email where the author can be reached.
- abstract
- copyright notice
- acknowledgements (optional)
- other credits (optional)
- disclaimer (optional)
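
As a rough illustration of the checklist above, here is a minimal Python
sketch that reports which required elements are missing from a document's
meta-data block. It assumes the meta-data lives in an `articleinfo` element
with no XML namespace (DocBook 4.x style). The function name
`missing_metadata` and the sample document are made up for the example, and
`corpauthor` is used as the DocBook element name for a corporate author. It
does not replace validating against the DTD.

```python
import re
import xml.etree.ElementTree as ET

# Required elements from the list above. Each entry is a tuple of
# acceptable alternatives; any one of them satisfies the requirement.
REQUIRED = [
    ("title",),
    ("authorgroup", "author", "corpauthor"),
    ("pubdate",),
    ("revhistory",),
    ("legalnotice", "license"),
    ("email",),
    ("abstract",),
    ("copyright",),
]
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def missing_metadata(docbook_xml):
    """Return a list of problems found in the <articleinfo> block."""
    root = ET.fromstring(docbook_xml)
    info = root.find("articleinfo")  # assumption: direct child of the root
    if info is None:
        return ["articleinfo"]
    # Collect every element name at any depth, since e.g. <email> is
    # usually nested inside <author>.
    present = {el.tag for el in info.iter()}
    problems = ["/".join(alts) for alts in REQUIRED if not present & set(alts)]
    pubdate = info.findtext("pubdate")
    if pubdate is not None and not ISO_DATE.match(pubdate.strip()):
        problems.append("pubdate not YYYY-MM-DD")
    if "revhistory" in present and info.find("revhistory/revision") is None:
        problems.append("revhistory has no revision")
    return problems


SAMPLE = """<article>
  <articleinfo>
    <title>Example HOWTO</title>
    <author><firstname>A</firstname><surname>Writer</surname>
      <affiliation><address><email>writer@example.org</email></address></affiliation>
    </author>
    <pubdate>2004-01-23</pubdate>
    <abstract><para>A short summary.</para></abstract>
  </articleinfo>
</article>"""

print(missing_metadata(SAMPLE))
```

The sample deliberately leaves out the revision history, license and
copyright, so those are the items the check should report.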

What other meta-data would we need to provide? I'm assuming most of this
can be mapped to Dublin Core, but I'm not sure whether I'm missing any
Dublin Core requirements.
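
To make the pairing concrete, here is a rough Python sketch of one possible
mapping from the elements above onto unqualified Dublin Core, in the `oai_dc`
format that OAI-PMH repositories are expected to support (all of its
elements are optional and repeatable). The mapping table is only a guess,
not an agreed crosswalk, and `to_dublin_core` is a hypothetical helper. The
two namespace URIs are the standard ones for `oai_dc` and Dublin Core 1.1.

```python
import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)

# DocBook element -> Dublin Core element (an assumed crosswalk).
CROSSWALK = {
    "title": "title",
    "author": "creator",
    "corpauthor": "creator",
    "pubdate": "date",
    "abstract": "description",
    "legalnotice": "rights",
    "copyright": "rights",
}


def to_dublin_core(info):
    """Build an <oai_dc:dc> element from a DocBook <articleinfo> element."""
    # Unpack <authorgroup> so each author becomes its own dc:creator.
    children = []
    for child in info:
        children.extend(child if child.tag == "authorgroup" else [child])

    record = ET.Element(f"{{{OAI_DC}}}dc")
    for child in children:
        dc_name = CROSSWALK.get(child.tag)
        if dc_name is None:
            continue  # no Dublin Core counterpart in this sketch
        # Flatten nested markup to plain text and normalize whitespace.
        text = " ".join(" ".join(child.itertext()).split())
        if text:
            ET.SubElement(record, f"{{{DC}}}{dc_name}").text = text
    return record


info = ET.fromstring(
    "<articleinfo>"
    "<title>Example HOWTO</title>"
    "<author><firstname>A</firstname><surname>Writer</surname></author>"
    "<pubdate>2004-01-23</pubdate>"
    "<abstract><para>A short summary.</para></abstract>"
    "</articleinfo>"
)
record = to_dublin_core(info)
print(ET.tostring(record, encoding="unicode"))
```

A real record would presumably also want a dc:identifier (the document's
URL), dc:format and dc:language, none of which come from the DocBook
elements listed above.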

Also, would we have a harvester crawl the site, or would we provide a
single XML file with a summary of all the docs in the collection?
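
For reference, as far as I can tell from the OAI-PMH 2.0 specification, a
harvester does not crawl HTML pages: it sends HTTP requests with a `verb`
argument to a single base URL on the repository, and the repository answers
with XML. The Python sketch below only builds such request URLs. The base
URL is a made-up placeholder, not an actual LDP endpoint, and `oai_request`
is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.example.org/oai"  # placeholder, not a real endpoint


def oai_request(verb, **args):
    """Return the URL for an OAI-PMH request with the given verb and arguments."""
    query = {"verb": verb}
    query.update(args)
    return BASE_URL + "?" + urlencode(query)


# Ask the repository to describe itself.
print(oai_request("Identify"))
# Ask for all records in unqualified Dublin Core.
print(oai_request("ListRecords", metadataPrefix="oai_dc"))
# Ask only for records changed since a date ("from" is a Python keyword,
# so it is passed through a dict).
print(oai_request("ListRecords", metadataPrefix="oai_dc", **{"from": "2004-01-01"}))
```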

Thanks!
emma

-- 
Emma Jane Hogbin
[[ 416 417 2868 ][ www.xtrinsic.com ]]