[OAI-implementers] pointies in abstracts

Simeon Warner simeon@lanl.gov
Wed, 4 Jul 2001 15:08:49 -0600 (MDT)


On Mon, 2 Jul 2001, Joe Futrelle wrote [excerpt]:
> The internal DTD subset strategy ought to work for any non-validating
> parser.  However the XML spec is unclear on whether non-validating
> parsers are expected to process externally-referenced DTD's.
>
> Just a quick example to illustrate the internal DTD subset strategy;
> suppose you need to use a copyright symbol, which in HTML is &copy;
> and in ISO-8859-1 is 169.  You could do it like this:
>
> <?xml version='1.0' encoding='ISO-8859-1'?>
> <!DOCTYPE myEntities [
>   <!ENTITY copy "&#169;">
> ]>
> <GetRecord ... etc ...

Note that the current OAI spec permits only encoding="UTF-8" (see section
3.1.2.1, Content-Type). The same internal DTD subset approach works
equally well with UTF-8 encoding, however.
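
As a rough illustration of the UTF-8 variant, here is a minimal Python
sketch using the standard library's non-validating parser. The element
names (GetRecord wrapping a bare "abstract" element) and the sample text
are assumptions for the sake of the example, not the actual OAI response
schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical response fragment: same internal DTD subset idea as above,
# but declared as UTF-8. The structure is illustrative only.
doc = (
    b"<?xml version='1.0' encoding='UTF-8'?>\n"
    b"<!DOCTYPE GetRecord [\n"
    b'  <!ENTITY copy "&#169;">\n'
    b"]>\n"
    b"<GetRecord><abstract>&copy; 2001</abstract></GetRecord>"
)

# The parser does not validate, but it does read the internal subset
# and expands &copy; using the declaration found there.
root = ET.fromstring(doc)
abstract_text = root.find("abstract").text
print(abstract_text)
```

The character reference &#169; names a Unicode code point, so the
declaration itself does not change when the document encoding does.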

On Mon, 2 Jul 2001, Hussein Suleman wrote [excerpt]:
> - XSV currently supports external entity references as follows:
> --- the external entity file must exist or the program crashes
> unceremoniously
> --- if the entity itself does not resolve its only a warning, not an
> error
> 
> bottom line: unless all users of your OAI interface have the full
> complement of popular entity files they may run into problems ...
> 
> suggestion: convert all named entities to unicode when generating OAI
> responses (thats what i do)

I strongly support this suggestion. We are, after all, trying to promote
interoperability, and using Unicode (UTF-8) seems to be a positive step
in that direction.
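
One way the conversion might look, as a minimal Python sketch and not
a description of what any particular repository does: it assumes the
stored abstracts use HTML-style named entities, and it relies on the
standard library's HTML 4 entity table. The function name and the sample
string are made up for illustration. The five entities predefined by XML
are left alone, since they are legal without any DTD, and unrecognized
names are passed through untouched so they can be dealt with separately.

```python
import re
from html.entities import name2codepoint

# Entities every XML parser understands without a DTD; leave these as-is.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches a named entity reference such as &copy; (not numeric ones).
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_entities_to_unicode(text):
    """Replace known HTML named entities with the Unicode character."""

    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            # Keep XML's own entities and anything we do not recognize.
            return match.group(0)
        return chr(name2codepoint[name])

    return _ENTITY.sub(repl, text)


# Hypothetical abstract text containing "pointies".
abstract = "&copy; 2001 A &amp; B, na&iuml;ve &unknown;"
converted = named_entities_to_unicode(abstract)

# Encode as UTF-8 when writing the response.
utf8_bytes = converted.encode("utf-8")
print(converted)
```

With this in place the response needs no DOCTYPE at all, so harvesters
do not need any entity files.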

Cheers,
Simeon.