Extender characters, Production 89 of XML 1.0
John Cowan
cowan at locke.ccil.org
Mon Jan 11 19:08:13 GMT 1999
Elliotte Rusty Harold wrote:
> In XML ["extender"]
> characters can be used anywhere a base character or ideographic
> character can be used.
This is not quite true: an extender may appear inside a name but may not
begin one, because extenders are not name-start characters in either XML
or Unicode.
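
As a rough illustration of that restriction, here is a minimal Python sketch. The code points are transcribed from production [89] (Extender) of XML 1.0 as I read it, so check them against the Recommendation before relying on the table. The names EXTENDER_RANGES, is_extender and starts_with_extender are hypothetical helpers, not part of any XML library, and the sketch deliberately does not implement the full Name production.

```python
# Code points transcribed from production [89] (Extender) of XML 1.0.
# Assumption: verify against the Recommendation before relying on this table.
EXTENDER_RANGES = [
    (0x00B7, 0x00B7),  # MIDDLE DOT
    (0x02D0, 0x02D1),  # MODIFIER LETTER (HALF) TRIANGULAR COLON
    (0x0387, 0x0387),  # GREEK ANO TELEIA
    (0x0640, 0x0640),  # ARABIC TATWEEL
    (0x0E46, 0x0E46),  # THAI CHARACTER MAIYAMOK
    (0x0EC6, 0x0EC6),  # LAO KO LA
    (0x3005, 0x3005),  # IDEOGRAPHIC ITERATION MARK
    (0x3031, 0x3035),  # vertical kana repeat marks
    (0x309D, 0x309E),  # HIRAGANA (VOICED) ITERATION MARK
    (0x30FC, 0x30FE),  # KATAKANA-HIRAGANA PROLONGED SOUND MARK, iteration marks
]


def is_extender(ch: str) -> bool:
    """Return True if the single character ch is in the Extender table."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in EXTENDER_RANGES)


def starts_with_extender(name: str) -> bool:
    """Return True if name begins with an extender.

    Such a string cannot be an XML name, whatever follows the first character.
    """
    return bool(name) and is_extender(name[0])
```

With these helpers, starts_with_extender("\u00B7abc") is true (so the string cannot be a name), while "col\u00B7lega" passes this particular check because its extender sits in the middle.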
> However I have been unable to find in the Unicode book or Web site any
> definition of what makes a character an extender. Can anyone clue me in on
> why some Unicode characters have the extender property while others don't?
> What's the logic behind this grouping of characters across languages?
Roughly (and unofficially) speaking, an extender is something that isn't
a letter or combining mark but often appears embedded in words.
For example, one may use L plus MIDDLE DOT (U+00B7) as a compatibility
equivalent of L WITH MIDDLE DOT (U+013F) in writing Catalan, and we do not
want a Catalan name to break into two names at the MIDDLE DOT.
(The dot is used to distinguish two successive Ls, written with
a dot, from the unitary Catalan letter "ll", written without a dot.)
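
The Catalan case can be checked with Python's standard unicodedata and re modules. This is only a sketch: the word "col·lega" is my own sample, and the two regular expressions are simplified stand-ins for a tokenizer, not the XML Name production.

```python
import re
import unicodedata

L_WITH_MIDDLE_DOT = "\u013F"  # LATIN CAPITAL LETTER L WITH MIDDLE DOT
MIDDLE_DOT = "\u00B7"         # MIDDLE DOT, an extender in XML 1.0

# Compatibility decomposition (NFKD) turns the precomposed letter
# into plain L followed by MIDDLE DOT.
decomposed = unicodedata.normalize("NFKD", L_WITH_MIDDLE_DOT)

# Sample Catalan word (an assumption for illustration): "col·lega".
name = "col" + MIDDLE_DOT + "lega"

# A tokenizer that only accepts word characters splits at the dot...
split_tokens = re.findall(r"\w+", name)

# ...while one that also accepts MIDDLE DOT keeps the word whole.
whole_tokens = re.findall(r"[\w\u00B7]+", name)
```

Here decomposed equals "L" followed by U+00B7, split_tokens comes out as two pieces ("col" and "lega"), and whole_tokens contains the single unbroken word, which is the behaviour the extender class is meant to give names.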
Extenders are enumerated (but not explained) in Section 5.14 of
the Unicode Standard.
--
John Cowan http://www.ccil.org/~cowan cowan at ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)