[OAI-implementers] character vs entity references

Hussein Suleman hussein@cs.uct.ac.za
Tue, 04 Nov 2003 07:56:20 +0200


hi Todd

you are correct - a character reference is the numeric form (such as 
&#241;), while an entity reference is the named, textual form (such as 
&ntilde;).

and the validation problem you have experienced is precisely why OAI 
requires the use of numeric references - so that your XML is 
self-contained and does not import any external entity definition files.
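
a minimal sketch of why the named form fails, assuming Python's standard xml.etree.ElementTree parser stands in for a validator that has no DTD to consult. the helper name parses and the sample title are made up for illustration and are not part of any OAI tool. the first call is expected to be rejected as an undefined entity, the second to parse.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """return True if the text is well-formed XML on its own, with no DTD to consult"""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # an entity reference with no declaration in scope ends up here
        return False

# named entity reference: needs an external definition the document does not carry
print(parses("<title>Espa&ntilde;a</title>"))

# numeric character reference: resolvable by any XML parser without extra files
print(parses("<title>Espa&#241;a</title>"))
```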

if all you have are Latin-1 entities, it isn't too difficult to convert 
them to numeric equivalents in a pre-processing stage. some of the 
templates on the OAI website already have support for this (like the 
VTOAI Perl package)
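
a rough sketch of such a pre-processing step, written in Python rather than Perl and not taken from the VTOAI package. it assumes the standard html.entities.name2codepoint table (which covers the HTML 4 entity names, a superset of Latin-1); the function name to_numeric_refs and the sample string are hypothetical. the five entities that XML itself predefines are left alone, since they need no external definitions.

```python
import re
from html.entities import name2codepoint

# the five entities predefined by XML itself; these need no external DTD
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# matches a named entity reference such as &ntilde; (numeric ones start with '#')
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def to_numeric_refs(text):
    """replace named entities such as &ntilde; with numeric references such as &#241;"""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            # leave predefined and unknown names untouched
            return match.group(0)
        return "&#%d;" % name2codepoint[name]
    return _ENTITY_RE.sub(replace, text)

print(to_numeric_refs("Espa&ntilde;a &amp; M&eacute;xico"))
# prints: Espa&#241;a &amp; M&#233;xico
```

unknown names are passed through unchanged on purpose, so that anything the table does not cover still shows up as a visible error in the verifier instead of being silently dropped.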

ttfn,
----hussein


Todd White wrote:

> in the OAI-PMH 2.0 document:
> http://www.openarchives.org/OAI/openarchivesprotocol.html
> 
> ...under "3.2. XML Response Format," it reads:
> 
> "Character references, rather than entity references, must be used."
> 
> i'm assuming that this means, for example, that the n-tilde (ene) should
> be expressed with  &#241;  instead of  &ntilde;
> 
> is this correct?  is the first the character reference and the latter the
> entity reference?
> 
> i've been struggling as of late with the issue of character encodings (the
> verifier always chokes on records like the one we have with an n-tilde
> (ene) in the title).  i'm now assuming that i should just serve these
> records with the numeric entity code (as in &#241; for n-tilde).
> 
> can anyone confirm this?
> 
> 
> Todd M. White 
> Systems Research Programmer
> 734.647.8649 (direct) ~~~ http://www.merit.edu/~tmwhite/
> 
> Merit Network, Inc.
> 4251 Plymouth Road, Suite 2000, Ann Arbor, MI  48105-2785
> 734.764.9430 (general) ~~~ 734.647.3745 (fax)
> http://www.merit.edu/
> 
> Avoid people who say they know the answer.
> Keep the company of people who are trying to understand the question.
> 
> _______________________________________________
> OAI-implementers mailing list
> List information, archives, preferences and to unsubscribe:
> http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers
> 

-- 
=====================================================================
hussein suleman ~ hussein@cs.uct.ac.za ~ http://www.husseinsspace.com
=====================================================================