UTF-8 vs UTF-16...? (Was: Feeling good about SML)

Tim Bray tbray at textuality.com
Wed Nov 17 17:28:38 GMT 1999


At 06:04 AM 11/17/99 -0500, James Tauber wrote:
>UTF-16, however, only gives access to the equivalent of Unicode with the
>surrogate extension mechanism, ie the first 17 planes of the UCS.

Right, a mere million extra characters in excess of what we're using now.
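
For anyone who wants to check the "mere million" figure, here is a small sketch of the surrogate arithmetic. The constant names are made up for illustration; the ranges are the standard surrogate blocks (high D800-DBFF, low DC00-DFFF).

```python
# Each supplementary character is one high surrogate followed by one low surrogate.
HIGH_SURROGATES = 0xDBFF - 0xD800 + 1    # 1024 possible high surrogates
LOW_SURROGATES = 0xDFFF - 0xDC00 + 1     # 1024 possible low surrogates
SUPPLEMENTARY = HIGH_SURROGATES * LOW_SURROGATES   # 1,048,576 code points

# Those code points fill planes 1 through 16; plane 0 (the BMP) makes 17 in all.
PLANE_SIZE = 0x10000
SUPPLEMENTARY_PLANES = SUPPLEMENTARY // PLANE_SIZE    # 16
HIGHEST_CODE_POINT = PLANE_SIZE + SUPPLEMENTARY - 1   # 0x10FFFF
```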

This feels like a pretty low-risk option to me.  

In terms of actual usability, there's effectively no difference between
UTF-16 and UTF-8.  UTF-16 seems to be an easier sell in Japan for reasons
that I've not fully understood.

What's really happening, near as I can tell, is that C programmers use
UTF-8 and Java programmers use UTF-16, regardless of where they're from.

Fortunately, converting between UTF-8 and UTF-16 is easy and requires no tables. -T.
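
To illustrate the "no tables" point, here is a minimal sketch of the UTF-16 to UTF-8 direction using only comparisons, shifts and masks. The function name `utf16_to_utf8` is hypothetical, and the sketch assumes the input is a list of 16-bit code units held as plain integers; it is not taken from any particular library.

```python
def utf16_to_utf8(units):
    """Convert a sequence of UTF-16 code units (ints 0..0xFFFF) to UTF-8 bytes."""
    out = bytearray()
    i = 0
    while i < len(units):
        cp = units[i]
        i += 1
        # Combine a high/low surrogate pair into a single code point.
        if 0xD800 <= cp <= 0xDBFF:
            if i >= len(units) or not (0xDC00 <= units[i] <= 0xDFFF):
                raise ValueError("unpaired high surrogate")
            cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        elif 0xDC00 <= cp <= 0xDFFF:
            raise ValueError("unpaired low surrogate")
        # Emit one to four bytes depending on the size of the code point.
        if cp < 0x80:
            out.append(cp)
        elif cp < 0x800:
            out += bytes((0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)))
        elif cp < 0x10000:
            out += bytes((0xE0 | (cp >> 12),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)))
        else:
            out += bytes((0xF0 | (cp >> 18),
                          0x80 | ((cp >> 12) & 0x3F),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)))
    return bytes(out)
```

The reverse direction is the same arithmetic run backwards: strip the marker bits from each byte, reassemble the code point, and split anything above 0xFFFF into a surrogate pair.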





