[OAI-implementers] UTF-16 Metadata

Thomas G. Habing thabing@uiuc.edu
Fri, 20 Apr 2001 14:02:29 -0500


How does UTF-8 preclude internationalization?  UTF-8 can represent the full
repertoire of Unicode characters, just as UTF-16 does; only the bit patterns
used for each character differ.
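
As a rough illustration of that point, here is a minimal Python sketch. The
sample string and the helper name `roundtrips` are made up for this example;
nothing here comes from the OAI specification. It encodes the same text both
ways and checks that each encoding gives the original characters back.

```python
# Hypothetical sample text mixing Latin, Greek, Japanese, and a character
# outside the Basic Multilingual Plane (U+1D11E, MUSICAL SYMBOL G CLEF).
sample = "Z\u00fcrich \u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac \u65e5\u672c\u8a9e \U0001D11E"


def roundtrips(text: str, encoding: str) -> bool:
    """Return True if text survives an encode/decode cycle unchanged."""
    return text.encode(encoding).decode(encoding) == text


utf8_bytes = sample.encode("utf-8")
utf16_bytes = sample.encode("utf-16-be")  # big-endian, no byte order mark

# The byte sequences differ, but both decode to the same characters.
print(utf8_bytes != utf16_bytes)
print(roundtrips(sample, "utf-8"), roundtrips(sample, "utf-16-be"))
```

Both encodings cover the whole Unicode range, so the choice between them
affects the size and layout of the bytes, not which characters can be sent.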

Or were you referring to other encodings, such as the iso-8859-x family?  I
would think that allowing arbitrary encodings (other than UTF-8 or UTF-16)
would make interoperability very difficult.
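
To make the interoperability concern concrete, here is a small Python sketch
(the variable names are hypothetical, chosen only for this example). The same
single byte means a different character in each member of the iso-8859-x
family, so a harvester that guesses or ignores the declared encoding gets the
wrong text without any error being raised.

```python
# One byte as a repository might emit it in a legacy 8-bit encoding.
raw = bytes([0xE9])

# The same byte decoded under three members of the iso-8859-x family.
as_latin1 = raw.decode("iso-8859-1")    # Western European
as_cyrillic = raw.decode("iso-8859-5")  # Cyrillic
as_greek = raw.decode("iso-8859-7")     # Greek

# Each decoding succeeds, yet each yields a different character.
for label, text in [("iso-8859-1", as_latin1),
                    ("iso-8859-5", as_cyrillic),
                    ("iso-8859-7", as_greek)]:
    print(label, "U+%04X" % ord(text))
```

A harvester that accepts arbitrary encodings would also need a decoder for
every encoding any repository chooses to use.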

Henry Stern wrote:
> I've run into a little stopping place with the OAI protocol.  Only the UTF-8
> character encoding is permitted which eliminates the possibility of
> internationalization.
> I propose that the protocol definition be slightly altered to allow for
> different encodings to be used.  Since the processing instruction in the
> header of the XML document specifies which character encoding is being used,
> this shouldn't be any trouble.
> Aside from that, it's completely backwards compatible with all of the
> existing repositories and it won't be a big deal for your average harvester.
> Let me know what you think.
> Henry Stern
