Character Encoding and the XML PR (was Re: PR.xml)

David Megginson ak117 at
Sat Jan 17 12:08:19 GMT 1998

James Clark writes:

 > Are you saying that Java's 16-bit characters prevent complete support
 > for some of those encodings in an XML parser?  If so, I don't see why,
 > since XML doesn't allow characters >= 0x110000, all legal XML characters
 > are representable in UTF-16 and hence in Java.

Quite right, I wasn't connecting the two -- Java supports UCS-4 only
to the extent allowed by surrogates in UTF-16, but that's the limit in
XML as well, so there should be no problem (at least, not until
Unicode starts assigning codes >= 0x110000, in which case the problem
will be both Java's and XML's).
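
A rough sketch of the arithmetic behind that limit, in case it helps: the
class and method names below (SurrogateDemo, toUtf16) are made up for
illustration and are not part of any Java or XML API. It shows how a code
point above 0xFFFF is split into two 16-bit Java chars, and why 0x10FFFF is
the last value a surrogate pair can reach.

```java
public class SurrogateDemo {
    // Largest code point reachable through a UTF-16 surrogate pair.
    static final int MAX_CODE_POINT = 0x10FFFF;

    /** Encode one code point as one or two 16-bit UTF-16 code units. */
    static char[] toUtf16(int codePoint) {
        // Reject values outside the range, and the surrogate block itself,
        // which does not represent characters.
        if (codePoint < 0 || codePoint > MAX_CODE_POINT
                || (codePoint >= 0xD800 && codePoint <= 0xDFFF)) {
            throw new IllegalArgumentException("not encodable in UTF-16");
        }
        if (codePoint < 0x10000) {
            // Fits in a single 16-bit unit.
            return new char[] { (char) codePoint };
        }
        // Subtracting 0x10000 leaves a 20-bit offset: 10 bits go into the
        // high surrogate, 10 bits into the low surrogate.
        int offset = codePoint - 0x10000;
        char high = (char) (0xD800 + (offset >> 10));
        char low = (char) (0xDC00 + (offset & 0x3FF));
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        char[] units = toUtf16(MAX_CODE_POINT);
        // Prints: U+10FFFF -> DBFF DFFF
        System.out.printf("U+%X -> %04X %04X%n",
                MAX_CODE_POINT, (int) units[0], (int) units[1]);
    }
}
```

Since both surrogates carry 10 bits, the largest offset is 0xFFFFF, which
gives 0x10000 + 0xFFFFF = 0x10FFFF; anything >= 0x110000 has no UTF-16 form.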

All the best,


David Megginson                 ak117 at
Microstar Software Ltd.         dmeggins at
