[OAI-implementers] valid character encoding

Simeon Warner simeon@cs.cornell.edu
Wed, 13 Aug 2003 10:47:48 -0400 (EDT)

On Wed, 13 Aug 2003, Todd White wrote:
> is there a limited number of valid character encodings for a valid OAI
> repository?

Yes, just one: OAI-PMH responses must be encoded in UTF-8, see:
> the encoding i am using is "ISO-8859-1"  this is to support some special
> characters in our metadata that were not supported by UTF-8.

I believe all of ISO-8859-1 (Latin 1) is supported in Unicode with code
positions unchanged. The bytes will, of course, be different in a UTF-8
encoded stream.
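
As a rough illustration of the point above (a minimal sketch, not taken from
any OAI tool; the function name latin1_to_utf8 is made up for this example),
converting Latin 1 data to UTF-8 in Python is a decode followed by an encode.
Code positions stay the same, but bytes above 0x7F become two-byte sequences:

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (hypothetical helper)."""
    # Decoding maps each byte 0x00-0xFF to the code point of the same value.
    return data.decode("iso-8859-1").encode("utf-8")


# Every Latin 1 byte value equals its Unicode code position.
all_bytes = bytes(range(256))
code_points = [ord(ch) for ch in all_bytes.decode("iso-8859-1")]
positions_unchanged = code_points == list(range(256))

# "café" in Latin 1: the e-acute is the single byte 0xE9 (U+00E9).
sample = b"caf\xe9"
converted = latin1_to_utf8(sample)  # same character, now bytes 0xC3 0xA9
```

Pure ASCII input comes out byte-for-byte identical, so only records that
contain accented or other high-bit characters actually change.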

Note that Microsoft's CP1252 uses codes 0x80--0x9F for printable characters
which aren't in Latin 1, and these do require translation to different
Unicode code positions, see:
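
To make the CP1252 difference concrete, here is a small Python sketch (the
helper name cp1252_to_utf8 is invented for illustration, and it assumes the
data really is CP1252 rather than Latin 1). It uses Python's built-in
"cp1252" codec, which rejects the few byte values CP1252 leaves undefined:

```python
def cp1252_to_utf8(data: bytes) -> bytes:
    """Re-encode CP1252 bytes as UTF-8 (hypothetical helper)."""
    # Raises UnicodeDecodeError on bytes that CP1252 does not define.
    return data.decode("cp1252").encode("utf-8")


def differs_from_latin1(byte_value: int) -> bool:
    """True if CP1252 and Latin 1 disagree about this byte."""
    raw = bytes([byte_value])
    try:
        return raw.decode("cp1252") != raw.decode("iso-8859-1")
    except UnicodeDecodeError:
        # Undefined in CP1252, so it cannot match Latin 1 either.
        return True


# Collect every byte value where the two encodings disagree.
differing = [b for b in range(256) if differs_from_latin1(b)]

# Byte 0x80 is the euro sign (U+20AC) in CP1252 but a control code in Latin 1.
euro_code_point = ord(b"\x80".decode("cp1252"))

# Curly quotes 0x93/0x94 map to U+201C/U+201D.
quoted = cp1252_to_utf8(b"\x93hi\x94")
```

If text pasted from Windows applications is labelled ISO-8859-1, these are
the bytes that end up as control characters after conversion, so it is worth
checking which of the two encodings the source data actually uses.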

> when i tested our newly developed OAI respository software using the
> web-based Open Archives Initiative - Repository Explorer
> (http://oai.dlib.vt.edu/cgi-bin/Explorer/oai2.0/testoai) it told me...
>   XML Schema Validation Error !
>   Illegal character encoding in XML
> here's the URL to our repository:
>   http://michiganteacher.net/oai