Sean Mc Grath digitome at
Sat Jan 17 11:39:18 GMT 1998

I received this from David Durand, who cannot currently post to
XML-DEV. I have David's permission to forward it to the list.

[David Durand]
>In a nutshell: 16-bit character codes. Diacritics (accents, vowel signs in
>Devanagari, etc.) are represented (preferably) as combining characters, although
>some precombined characters are available for compatibility with old documents
>and software (a political concession). For compatibility with ISO's 32(31?) bit
>standard, some escape sequences can encode characters > 65535 (these are the
>"Surrogate" characters).
>You need the book for the details of handling bidirectional text rendering,
>word breaking, etc. etc. The scripts of the world cover a very wide space.
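
The two encoding points in David's summary can be sketched in a few lines of Python. This is only an illustration, not part of his message: it assumes Python's standard unicodedata module and the NFC/NFD normalization forms, and the helper name to_surrogates is hypothetical. The arithmetic is the usual UTF-16 surrogate-pair calculation.

```python
import unicodedata

# Precombined "e with acute" (U+00E9) versus "e" plus a combining acute (U+0301).
precombined = "\u00e9"
combining = "e\u0301"

# The two spellings are different code point sequences...
assert precombined != combining
# ...but normalization converts one form into the other.
assert unicodedata.normalize("NFC", combining) == precombined
assert unicodedata.normalize("NFD", precombined) == combining


def to_surrogates(code_point):
    """Split a code point above 0xFFFF into a (high, low) surrogate pair.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the surrogate-encoded range")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


# U+1D11E (MUSICAL SYMBOL G CLEF) does not fit in 16 bits.
high, low = to_surrogates(0x1D11E)
print(hex(high), hex(low))
```

Each half of the pair is itself a 16-bit value, which is how characters beyond 65535 fit into a stream of 16-bit codes.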
Sean Mc Grath

sean at
Digitome Electronic Publishing

