[OAI-implementers] Re. Metadata Language Confusion

Jonathan Tregear jtregear@salud.unm.edu
Fri, 13 Jul 2001 01:22:38 -0600


Thanks for your response. The metadata itself is in all three languages; sorry if I wasn't clear. I had not considered the "about" section of the record. I had been very focused on the metadata section and on trying to come up with a solution that worked in there. The protocol documentation for the "about" section says that it is user defined, and I wasn't aware of its use for the dc language tag. That will probably work, but I am not sure how I would handle the normative version for distribution that you refer to.

One of the goals of the repository is to enable discovery in any of the three languages. If I only disseminate one of the versions, then I may as well not repose (is that the correct term?) the other two versions as far as searching in those languages is concerned. Secondly, the criteria for determining which version to disseminate might also be a problem. Your idea of using the version that corresponds to the language of the original document seems natural, but another criterion might just as easily be which language would provide the most coverage for the largest base of potential searchers. English being the predominant language of scientific discourse, that would probably mean disseminating the English version.

It also occurred to me that, assuming I mark up the records in all three languages, I am essentially tripling the size of my metadata record for any article no matter what scheme I use. Other than the language of the metadata, the information in two of the records is essentially redundant. In a world of perfect machine translation the whole problem should probably be handled at the service provider level. Records would be disseminated in their original language, and service providers would handle the translation of queries from any language into the language of the metadata record, as well as translating the returned records into the language of the searcher. I don't think I will be able to wait for that solution, though.

Finally, I had seen the "xml:lang" syntax in some of the Dublin Core RDF stuff I read, but I couldn't figure out how to use it in the context of OAI. I will take a look at your stuff tomorrow as soon as I get into work.
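
For what it is worth, here is a rough sketch of how I imagine "xml:lang" could be attached to repeated Dublin Core elements inside a single metadata block. It is only an illustration under several assumptions: the container element and its namespace are hypothetical placeholders (the real ones depend on the protocol version and metadata format), the language codes "es" and "pt" and all of the sample values are made up, and I have not checked that the Dublin Core schema used with OAI actually permits the attribute.

```python
import xml.etree.ElementTree as ET

# Real Dublin Core element namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"
# Hypothetical container namespace, for illustration only.
CONTAINER_NS = "http://example.org/hypothetical-dc-container/"
# ElementTree's spelling of the reserved xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

ET.register_namespace("dc", DC_NS)
ET.register_namespace("ex", CONTAINER_NS)


def build_metadata(titles, subjects):
    """Build one metadata block with a language-tagged copy of each field.

    titles:   dict mapping a language code to a title string
    subjects: dict mapping a language code to a list of subject terms
    """
    root = ET.Element(f"{{{CONTAINER_NS}}}dc")
    for lang, text in titles.items():
        title = ET.SubElement(root, f"{{{DC_NS}}}title")
        title.set(XML_LANG, lang)
        title.text = text
    for lang, terms in subjects.items():
        for term in terms:
            subject = ET.SubElement(root, f"{{{DC_NS}}}subject")
            subject.set(XML_LANG, lang)
            subject.text = term
    return root


def values_in(root, element, lang):
    """Return the text of every dc:<element> tagged with the given language."""
    return [
        el.text
        for el in root.findall(f"{{{DC_NS}}}{element}")
        if el.get(XML_LANG) == lang
    ]


# Made-up sample data; "es" and "pt" are assumed language codes.
record = build_metadata(
    titles={
        "en": "Community health survey",
        "es": "Encuesta de salud comunitaria",
        "pt": "Pesquisa de saúde comunitária",
    },
    subjects={
        "en": ["public health"],
        "es": ["salud pública"],
        "pt": ["saúde pública"],
    },
)

print(ET.tostring(record, encoding="unicode"))
print(values_in(record, "title", "es"))
```

Done this way, there would be one record per article rather than three, with only the translatable fields repeated, and a service provider that cares about language could filter on the attribute. Whether harvesters would actually honor it is a separate question.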


Jonathan Tregear
Analyst, Health Sciences Center
University of New Mexico