[OAI-implementers] superscripts, subscripts, Greek alphabet

Hussein Suleman hussein@cs.uct.ac.za
Sat, 10 May 2003 00:24:57 +0200


hi Jody

i wonder whether your data is being newly input/created or is coming 
from an established source. if the latter: when you refer to "subscripts 
and superscripts", i also think of unicode, but explicit subscripts 
remind me of how MARC handles these things.

if you are in fact using MARC, the late Robert France from the VT camp 
did a study of the problems inherent in converting MARC to Unicode ... 
while his discussion is dated in OAI terms it is still relevant from a 
MARC perspective and there are even some code fragments and conversion 
tables at:
   http://www.dlib.vt.edu/projects/OAI/marcxml/marcxml.html

if you're starting fresh - use unicode ... there is now enough support 
for it in terms of tools and documentation (as François has listed) and 
it is OAI-friendly. for the greatest portability, i usually use 7-bit 
ascii, with numeric entities for the rest of the codes (given that my 
data is largely 7-bit ascii this doesn't adversely affect performance). 
this way i get full unicode capabilities without breaking tools that 
don't support utf-8/16 or don't support it fully. i have used this 
approach for databases, metadata harvesting, service providers and user 
interfaces and it appears to work reasonably well.
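
as a rough illustration of the "7-bit ascii plus numeric entities" idea, 
here is a minimal python sketch. the function names and the sample string 
are made up for this example; it relies only on python's standard 
"xmlcharrefreplace" error handler and html.unescape. note that it does 
not escape "&" or "<", which still has to be done separately before the 
text goes into an XML record.

```python
import html


def to_ascii_entities(text: str) -> str:
    """Return text as 7-bit ASCII, replacing every non-ASCII character
    with a decimal numeric character reference such as &#945;."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


def from_ascii_entities(text: str) -> str:
    """Turn character references back into Unicode characters.
    Caveat: this also expands named entities such as &amp;."""
    return html.unescape(text)


# hypothetical sample: subscript two, Greek alpha, superscript two
sample = "H\u2082O and \u03b1-decay, x\u00b2"
encoded = to_ascii_entities(sample)
print(encoded)
print(from_ascii_entities(encoded) == sample)
```

the encoded string contains only 7-bit characters, so it should pass 
through tools that do not handle utf-8, while an XML parser will still 
see the original unicode characters.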

ttfn,
----hussein


deridder wrote:
> Umm, how are you folks dealing with these?  Anything
> standardized yet?  If so, please point me in the right
> direction.
>   thanks!
> 
>     --j.
> 
> 
> 
>   Jody DeRidder
>   IT Administrator II
>   Digital Library Center
>   648A John C. Hodges Library
>   University of Tennessee
>   Knoxville, TN 37996
> 
>   Phone: (865) 974-4796
>   Email: deridder@aztec.lib.utk.edu
> 


-- 
=====================================================================
hussein suleman ~ hussein@cs.uct.ac.za ~ http://www.husseinsspace.com
=====================================================================