[OAI-implementers] Re. Metadata Language Confusion

Michael L. Nelson mln@ils.unc.edu
Wed, 18 Jul 2001 10:50:36 -0400 (EDT)


On Fri, 13 Jul 2001, Jonathan Tregear wrote:

> Hussein,
> 
> Thanks for your response. The metadata itself is in all three
> languages, 

Jonathan:

sorry to jump in late in this thread...  one possible solution that I have
not seen listed (if I'm understanding the problem correctly) is to define
sets by metadata language, and let the harvester determine which
language it would like.

since it appears that each article manifests itself as 3 separate records,
that might be easier than trying to keep them all bundled together.
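
a rough sketch of what the harvester side of that could look like: it
builds a ListRecords request restricted to one language set.  the base URL
and the "lang:xx" setSpec naming are assumptions made up for illustration;
nothing in the thread or the protocol fixes them.

```python
from urllib.parse import urlencode

# Hypothetical repository base URL (not from the thread).
BASE_URL = "http://repository.example.org/oai"


def list_records_url(language, metadata_prefix="oai_dc"):
    """Build a ListRecords request limited to one language set.

    The "lang:<code>" setSpec convention is an assumption; a repository
    is free to name its sets however it likes.
    """
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": "lang:" + language,
    }
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # A harvester that only wants Spanish metadata asks for that set.
    print(list_records_url("es"))
```

a harvester that wants all three languages would simply issue one request
per set, or omit the set argument altogether.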

or... (off the top of my head) even if the records are stored all rolled
together...  you might still support set-based language dissemination, and
discard the unrequested languages in your responses.  if the harvester
doesn't want language X, it might appreciate the bandwidth savings...
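
a minimal sketch of that second idea, assuming the repository keeps one
bundled record per article with each field keyed by language code.  the
storage layout, the "lang:xx" setSpec form, the sample values and the
language codes are all hypothetical, not a description of the actual
repository.

```python
# Hypothetical bundled store: one record per article, each descriptive
# field holding a value per language code (codes chosen for illustration).
RECORDS = [
    {
        "identifier": "oai:example:1",
        "title": {
            "en": "Heart disease",
            "es": "Enfermedad cardiaca",
            "pt": "Doença cardíaca",
        },
    },
]


def disseminate(record, set_spec=None):
    """Return a copy of the record holding only the requested language.

    set_spec is assumed to look like "lang:es"; with no set requested,
    the full bundled record is returned unchanged.
    """
    if set_spec is None:
        return record
    language = set_spec.split(":", 1)[1]
    trimmed = {"identifier": record["identifier"]}
    for field, values in record.items():
        # Only language-keyed fields are filtered; unrequested
        # languages are dropped from the response.
        if isinstance(values, dict) and language in values:
            trimmed[field] = {language: values[language]}
    return trimmed


if __name__ == "__main__":
    print(disseminate(RECORDS[0], "lang:es"))
```

the stored record stays bundled; only the response is trimmed, which is
where the bandwidth savings would come from.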

regards,

Michael

> sorry if I wasn't clearer. I had not considered the "about"
> section of the record, I had been very focused on the metadata section
> and trying to come up with a solution that worked in there. The
> protocol documentation for the "about" section says that it is user
> defined and I wasn't aware of its use for the dc language tag. That
> probably will work, but I am not sure how I would handle the normative
> version for distribution you refer to.
> 
> One of the goals of the repository is to enable discovery in any of
> the three languages; if I only disseminate one of the versions then I
> may as well not repose (is that the correct term?) the other two
> versions as far as searching in those languages is concerned.
> Secondly, the criteria for determining which version to disseminate
> might also be a problem. Your idea of using the version that
> corresponds to the language of the original document seems natural,
> but another criterion might just as easily be which language would
> provide the most coverage for the largest base of potential searchers.
> English being the predominant language of scientific discourse would
> probably mean disseminating the English version.
> 
> It also occurred to me that, assuming I mark up the records in all
> three languages, I am essentially tripling the size of my metadata
> record for any article no matter what scheme I use. Other than the
> language of the metadata, the information in two of the records is
> essentially redundant. In a world of perfect machine translation the
> whole problem should probably be handled at the service provider
> level. Records would be disseminated in their original language and
> service providers would handle the translation of queries from any
> language into the language of the metadata record, as well as
> translating the returned records into the language of the searcher. I
> don't think I will be able to wait for that solution though.
> 
> 
> Finally, I had seen the "xml:lang" syntax in some of the Dublin Core
> RDF stuff I read, but I couldn't figure out how to use it in the
> context of OAI. I will take a look at your stuff tomorrow as soon as I
> get into work.
> 
> Thanks,
> 
> Jonathan Tregear
> Analyst, Health Sciences Center
> University of New Mexico
> jtregear@salud.unm.edu

---
Michael L. Nelson			
207 Manning Hall, School of Information and Library Science
University of North Carolina 		mln@ils.unc.edu
Chapel Hill, NC 27599			http://ils.unc.edu/~mln/
+1 919 966 5042				+1 919 962 8071 (f)