[OAI-implementers] UTF-16 Metadata

Thomas G. Habing thabing@uiuc.edu
Fri, 20 Apr 2001 14:02:29 -0500


How does UTF-8 preclude internationalization?  UTF-8 should allow the full
repertoire of Unicode characters just like UTF-16 does; only with
different bit patterns for each character.
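
To illustrate the point, here is a minimal Python sketch (not part of the
OAI protocol; the sample string and the helper name round_trips are my own
assumptions) that encodes the same characters in both encodings and checks
that each one round-trips without loss:

```python
# Arbitrary sample: ASCII, Latin-1, Greek, CJK, and one character
# outside the Basic Multilingual Plane (U+1D11E).
text = "A é Ω 日 𝄞"

def round_trips(s, encoding):
    """Return True if s survives encode -> decode unchanged."""
    return s.encode(encoding).decode(encoding) == s

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-be")  # big-endian, no byte order mark

# Show the code point and the byte pattern each encoding uses for it.
for ch in text.replace(" ", ""):
    print(f"U+{ord(ch):04X}",
          ch.encode("utf-8").hex(),
          ch.encode("utf-16-be").hex())
```

Both encodings cover every character in the sample; only the bytes differ.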

Or were you referring to other encodings, such as the iso-8859-x family?  I
would think that allowing arbitrary encodings (other than UTF-8 or UTF-16)
would make interoperability very difficult.
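
As a rough sketch of why I would expect trouble (again Python, and the
helper name can_encode is hypothetical), each iso-8859-x encoding covers
only a small repertoire, and the same byte means different things in
different members of the family:

```python
def can_encode(s, encoding):
    """Return True if every character in s exists in the given encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# The same single byte decoded under two members of the iso-8859-x family.
raw = b"\xe9"
as_latin1 = raw.decode("iso-8859-1")  # Western European
as_greek = raw.decode("iso-8859-7")   # Greek

print(can_encode("Ω", "iso-8859-1"), can_encode("Ω", "iso-8859-7"))
print(as_latin1, as_greek)
```

A harvester would have to carry a decoder for every encoding any repository
chose, and get the declared encoding right every time.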

Henry Stern wrote:
> I've run into a little stopping place with the OAI protocol.  Only the UTF-8
> character encoding is permitted which eliminates the possibility of
> internationalization.
> I propose that the protocol definition be slightly altered to allow for
> different encodings to be used.  Since the processing instruction in the
> header of the XML document specifies which character encoding is being used,
> this shouldn't be any trouble.
> Aside from that, it's completely backwards compatible with all of the
> existing repositories and it won't be a big deal for your average harvester.
> Let me know what you think.
> Henry Stern
> ---
> Flon's Law:
>         There is not now, and never will be, a language in
>         which it is the least bit difficult to write bad programs.
> _______________________________________________
> OAI-implementers mailing list
> OAI-implementers@oaisrv.nsdl.cornell.edu
> http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers

Thomas G. Habing
Research Programmer, Digital Library Initiative
University of Illinois at Urbana-Champaign
052 Grainger Engineering Library, MC-274
thabing@uiuc.edu, (217) 244-7809