First draft of proposed XML TC for Unicode 3.0 (unofficial)

John Cowan cowan at
Wed Sep 8 17:19:40 BST 1999

James Clark scripsit:

> That's very strange.  SARA AM certainly needs to be allowed in names.

It's a funny story.  In earlier versions of Unicode, SARA AM was treated
as canonically equivalent to NIKHAHIT followed by SARA AA; that is,
Unicode-conformant processes were not supposed to distinguish between
them.  In the latest version, this equivalence has been downgraded to
a mere compatibility equivalence.  As a result, SARA AM has become
a "compatibility character" and as such is disallowed in names by the
Appendix B rules.


The same story applies to Lao: LAO VOWEL SIGN AM is now a compatibility
equivalent of LAO NIGGAHITA followed by LAO VOWEL SIGN AA.
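
For anyone who wants to see the distinction concretely, here is a minimal
sketch using Python's standard unicodedata module.  It is only an
illustration: the module reports the data for whatever Unicode version the
interpreter ships with, not Unicode 3.0 specifically, and the helper name
describe() is made up for this example.  A "<compat>" tag in the mapping
marks a compatibility (not canonical) decomposition, so canonical
normalization (NFD) should leave the character alone while compatibility
normalization (NFKD) splits it.

```python
import unicodedata

# The two characters under discussion.
CHARS = {
    "THAI CHARACTER SARA AM": "\u0e33",
    "LAO VOWEL SIGN AM": "\u0eb3",
}

def describe(ch):
    """Return (decomposition mapping, NFD form, NFKD form) for one character.

    describe() is a hypothetical helper for this illustration only.
    """
    return (
        unicodedata.decomposition(ch),      # raw mapping from the character database
        unicodedata.normalize("NFD", ch),   # canonical decomposition
        unicodedata.normalize("NFKD", ch),  # compatibility decomposition
    )

for name, ch in CHARS.items():
    mapping, nfd, nfkd = describe(ch)
    print(name)
    print("  mapping:", mapping)
    print("  unchanged by canonical decomposition:", nfd == ch)
    print("  compatibility decomposition:",
          [unicodedata.name(c) for c in nfkd])
```

Note that this says nothing about XML name legality by itself; it only shows
why both characters count as compatibility characters.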

In any event, whatever XML worked before has to work now, so I am merely
proposing a statement that THAI CHARACTER SARA AM and LAO VOWEL SIGN
AM *are* legal in XML names, despite their status as Unicode compatibility
characters.  In any case their legality in XML *text* is of course not affected.

If anyone thinks something is desperately broken here, please contact
unicode at right away.

John Cowan                                   cowan at
       I am a member of a civilization. --David Brin
