First draft of proposed XML TC for Unicode 3.0 (unofficial)

John Cowan cowan at locke.ccil.org
Wed Sep 8 17:19:40 BST 1999


James Clark scripsit:

> > 0E33;THAI CHARACTER SARA AM
> 
> That's very strange.  SARA AM certainly needs to be allowed in names.

It's a funny story.  In earlier versions of Unicode, SARA AM was treated
as canonically equivalent to NIKHAHIT followed by SARA AA; that is,
Unicode-conformant processes were not supposed to distinguish between
them.  In the latest version, this equivalence has been downgraded to
a mere compatibility equivalence.  As a result, SARA AM has become
a "compatibility character" and as such is disallowed by the Appendix B rules.

> > 0EB3;LAO VOWEL SIGN AM

The same story applies here: the VOWEL SIGN AM is now a compatibility
equivalent of NIGGAHITA followed by VOWEL SIGN AA.
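
For anyone who wants to check the current status of these two characters,
here is a minimal sketch using Python's standard unicodedata module.  It
assumes the interpreter ships a character database from Unicode 3.0 or
later (true of any current Python); the helper name describe() is my own
and not part of any specification.

```python
import unicodedata

def describe(ch):
    """Return (name, decomposition field, NFD form, NFKD form) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),       # raw decomposition field from the database
        unicodedata.normalize("NFD", ch),    # canonical decomposition only
        unicodedata.normalize("NFKD", ch),   # canonical plus compatibility decomposition
    )

# U+0E33 THAI CHARACTER SARA AM and U+0EB3 LAO VOWEL SIGN AM
for ch in ("\u0e33", "\u0eb3"):
    name, decomp, nfd, nfkd = describe(ch)
    print(f"U+{ord(ch):04X} {name}: {decomp!r}")
    print("  unchanged by NFD:", nfd == ch)
    print("  NFKD:", " ".join(f"U+{ord(c):04X}" for c in nfkd))
```

A decomposition field that begins with the tag <compat> marks a
compatibility mapping, which is why canonical normalization (NFD) leaves
the character alone while NFKD splits it into the two-character sequence.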

In any event, any XML that worked before has to keep working now, so I am
merely proposing a statement that THAI CHARACTER SARA AM and LAO VOWEL SIGN
AM *are* legal in XML names, despite their status as Unicode compatibility
characters.  Their legality in XML *text* is of course not affected either way.

If anyone thinks something is desperately broken here, please contact
unicode at unicode.org right away.

-- 
John Cowan                                   cowan at ccil.org
       I am a member of a civilization. --David Brin
