[OAI-implementers] Special characters, UNICODE, and OAI tools

Hussein Suleman hussein@vt.edu
Mon, 12 Feb 2001 20:04:59 -0500


Caroline Arms wrote:
> Can you confirm that you are doing nothing to the UNICODE entities in your
> Raw XML view?  That's what it looks like if I look at the page source.

yes. the only substitution i was making for a raw view was escaping of
the angle brackets ... but i just changed that to escape ampersands as
well ... so now your output should be "rawer" ...
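
a minimal sketch of that kind of raw-view escaping, for illustration only. the function name escape_for_raw_view and the sample record are hypothetical, not taken from the actual tool. the one detail that matters is the order: ampersands are escaped first, so the "&" in a freshly written "&lt;" is not escaped a second time.

```python
def escape_for_raw_view(xml_text: str) -> str:
    """Escape XML so a browser shows the markup literally instead of parsing it."""
    # Ampersands first: otherwise the "&" introduced by "&lt;" / "&gt;"
    # would itself be turned into "&amp;" on a later pass.
    escaped = xml_text.replace("&", "&amp;")
    escaped = escaped.replace("<", "&lt;")
    escaped = escaped.replace(">", "&gt;")
    return escaped


# Hypothetical sample containing a numeric character reference.
sample = "<title>caf&#233;</title>"
print(escape_for_raw_view(sample))
```

with ampersands escaped, a character reference such as &#233; appears in the raw view exactly as the repository sent it, rather than being resolved by the browser.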

> What do you do in the parsed view?  I'm getting strange character
> combinations in both Netscape 4.7 and Internet Explorer 5.5.

that's the tricky thing ... all my character processing is done in 8-bit
so the XML parser mangles >8bit stuff before passing it on to other
tests ... (i use expat for XML parsing and xsv for xsd processing) ...
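
one plausible way such mangling arises, sketched under an assumption: the parser hands back UTF-8 bytes, and later code treats each byte as one 8-bit (Latin-1) character. whether this is exactly what happens inside the tool is not confirmed here, and the helper name view_as_8bit is hypothetical.

```python
def view_as_8bit(text: str) -> str:
    """Show how text looks when its UTF-8 bytes are read one byte per character."""
    # Assumption: the text arrives as UTF-8 bytes.
    utf8_bytes = text.encode("utf-8")
    # Treating every byte as a Latin-1 character splits each multi-byte
    # sequence into several unrelated characters.
    return utf8_bytes.decode("latin-1")


original = "caf\u00e9"            # "café", with U+00E9 as the last character
mangled = view_as_8bit(original)
print(len(original), len(mangled))  # the mangled string is one character longer
```

plain ASCII passes through unchanged under this assumption, which would explain why only records with characters outside that range display oddly.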

there are two things i could do here: either change everything to
16-bit, which could involve considerable effort ... or just ignore the
problem since it doesn't really signify an error: it's just that the
"best-effort" to display characters is failing ... you should never
actually get an error message unless you use entities in your parameters
(identifiers, sets, etc.)

in either event, i will keep this issue in mind and try to remedy it
when i crank out the next revision ...

ttfn -h.

hussein suleman -- hussein@vt.edu -- vt cs --