Too much latitude in Nmtoken characters?

Rick Jelliffe ricko at allette.com.au
Mon Jul 14 07:34:11 BST 1997


> This seems to allow Nmtokens that aren't visible to the human eye,
> for example, consisting of a single zero width non-joiner.

I think XML name tokens are better detected by exclusion than by inclusion:
this is a sensible approach when you have to deal with lots of potential
naming characters.  In other words, you detect the end of the name by the
presence of a sepchar or a delimiter, rather than by testing whether each
character is a name character.  At the reading end, such simple
token detection is all that is needed if your document is well-formed.
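
As a rough illustration of detection by exclusion, here is a minimal
Python sketch.  The function name scan_name and the SEPCHARS and DELIMS
sets are my own assumptions for the example; they are not taken from the
XML draft or from any SGML declaration.

```python
# Hypothetical separator and delimiter sets, chosen only for illustration.
SEPCHARS = frozenset(" \t\r\n")
DELIMS = frozenset("<>/=\"'&;")
STOP = SEPCHARS | DELIMS


def scan_name(text, start=0):
    """Return (token, end), where the token ends at the first sepchar or delimiter.

    No character is tested for being a name character: anything that is
    not in STOP is accepted as part of the token.
    """
    end = start
    while end < len(text) and text[end] not in STOP:
        end += 1
    return text[start:end], end


if __name__ == "__main__":
    # A non-English name is scanned without any table of name characters.
    print(scan_name("t\u00edtulo lang='pt'>"))
```

The scanner needs no knowledge of which scripts are in use, so it will
happily accept a token containing ZWNJ; whether such a token is a
sensible name is a separate check.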

To stop silly tags, the SGML declaration should make the ZWNJ character
(which I think has to do with the cursive joining of Arabic scripts,
and is as much required as accent characters) a NAMECHAR, not a NAMESTRT.
So, in context, ZWNJ and RTL & LTR have visible effects; they are not
usually undetectable.  But it is better to allow silly tags than to
disallow native-language markup: only about 1/4 of the world can make
sense of English/Latin tags.
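
The NAMECHAR-not-NAMESTRT idea can be sketched as a check on the first
character only.  This is a minimal Python sketch; the set NAME_ONLY and
the function starts_visibly are hypothetical names, and I am assuming
that "RTL & LTR" means the Unicode marks U+200F and U+200E.

```python
# Characters assumed (for this sketch only) to be allowed inside a name
# but not at its start: ZWNJ, left-to-right mark, right-to-left mark.
NAME_ONLY = frozenset("\u200c\u200e\u200f")


def starts_visibly(token):
    """Return True if the token is non-empty and does not begin with a
    character from NAME_ONLY."""
    return bool(token) and token[0] not in NAME_ONLY
```

Under these assumptions a name made of a lone ZWNJ is rejected, while a
ZWNJ between two letters is still allowed.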

Apparently the WG is waiting till August to finalise the naming
discipline.


Rick Jelliffe

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)



