Too much latitude in Nmtoken characters?

Mark Brissenden markb at glosa.com
Fri Jul 11 20:37:44 BST 1997


Eric Baatz - Sun Microsystems Labs BOS wrote:
> 
> I don't understand why such a wide choice of characters is allowed in
> the June XML Syntax specification.
> 
>   [7] Nmtoken ::= (NameChar)+
> 
> Because
> 
>   [4] NameChar ::= Letter | Digit | MiscName
>   [3] MiscName ::= '.' | '-' | '_' | ':' | CombiningChar | Ignorable | Extender
>   [87] Ignorable ::= a whole bunch of characters like "zero width non-joiner",
>                      "right-to-left mark", and "zero width no-break space"
> 
> This seems to allow Nmtokens that aren't visible to the human eye,
> for example, consisting of a single zero width non-joiner.
> 

	Plus, what if an Ignorable character is embedded in a token?  You
could have two different tokens that appear identical to the human eye,
one of which could split into two parts in the right context.
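
	To make the look-alike problem concrete, here is a rough Python
sketch.  The token "color" and the helper strip_format_chars are made up
for illustration; the only fact relied on is that U+200C (zero width
non-joiner) is in Unicode general category Cf.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER, one of the Ignorable characters

plain = "color"
hidden = "col" + ZWNJ + "or"  # usually renders the same as plain


def strip_format_chars(token):
    """Drop Unicode format characters (category Cf) from a token.

    Hypothetical helper: it only shows what a tool would have to do
    before comparing tokens the way a human reader does.
    """
    return "".join(c for c in token if unicodedata.category(c) != "Cf")


# The two tokens differ as character sequences ...
print(plain == hidden)
# ... but compare equal once the invisible character is removed.
print(strip_format_chars(hidden) == plain)
# A token made of nothing but the invisible character is still non-empty.
print(len(ZWNJ), repr(strip_format_chars(ZWNJ)))
```

	Whether a parser should normalize tokens like this is exactly the
open question; the sketch only shows that a naive comparison will not.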


> My limited understanding of SGML suggests that a Nmtoken is more like
> a Name rather than a superset of it.  For that matter, is "9" a sensible
> Nmtoken?
> 

	I think "9" is a legitimate Nmtoken (in the SGML Reference Concrete
Syntax, anyway), because Nmtokens aren't limited in their first
character the way Names are - the first character can be anything
that can appear in the rest of the token.
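
	As a rough illustration of that difference, here is a Python
sketch restricted to ASCII.  The character class follows the NameChar
production quoted above, minus CombiningChar, Ignorable and Extender.
The rule that a Name must start with a letter is a simplifying
assumption (some syntaxes allow other start characters), and the names
is_nmtoken and is_name are invented for this example.

```python
import re

# ASCII-only approximation of NameChar: Letter | Digit | '.' | '-' | '_' | ':'
NAME_CHAR = r"[A-Za-z0-9._:-]"

# Nmtoken ::= (NameChar)+   -- any name character may come first
NMTOKEN_RE = re.compile(rf"{NAME_CHAR}+")

# Assumed rule for Name: a letter first, then any name characters
NAME_RE = re.compile(rf"[A-Za-z]{NAME_CHAR}*")


def is_nmtoken(s):
    """True if s consists of one or more name characters."""
    return NMTOKEN_RE.fullmatch(s) is not None


def is_name(s):
    """True if s is a name token that also starts with a letter."""
    return NAME_RE.fullmatch(s) is not None


for token in ["9", "a9", "-x", "a b"]:
    print(repr(token), is_nmtoken(token), is_name(token))
```

	Under these assumptions "9" and "-x" pass as Nmtokens but not as
Names, "a9" passes as both, and "a b" passes as neither.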

-- 

================================
Mark Brissenden
Glosa International
http://www.csn.net/~brissen
================================



xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)



