Character encodings

Chris von See cvonsee at
Mon Dec 7 20:51:52 GMT 1998


I am relatively new to XML and am trying to develop a program that can
generate XML in various encodings.  In section 4.3.3, the XML spec implies
that support of ISO 10646 UCS-2 encoding (i.e. Unicode) is valid, but in
the section on autodetection of encodings (Appendix F) there's no mention
of how to detect UCS-2 encoding.  I would *assume* that UCS-2 would start
with \x00 \x3c \x00 \x3f ("<?") - is that right?  If so, then is the spec
wrong in not including this in Appendix F as valid?  Is it reasonable to
expect that many people will use UCS-2 because of its similarity to Unicode?
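
A minimal sketch of the byte pattern assumed above, in Python.  It only
illustrates the question and is not taken from the XML spec: the function
name guess_encoding_family and its return strings are hypothetical, and
the big-endian bytes come from Python's "utf-16-be" codec, which for
characters such as "<" and "?" produces the same two-byte units as UCS-2.

```python
# Hypothetical helper: guess the encoding family from the first bytes of
# an XML document. The name and return labels are illustrative only.
def guess_encoding_family(prefix: bytes) -> str:
    # Byte order mark U+FEFF, big-endian or little-endian.
    if prefix.startswith(b"\xfe\xff"):
        return "16-bit big-endian, with BOM"
    if prefix.startswith(b"\xff\xfe"):
        return "16-bit little-endian, with BOM"
    # No BOM: look for "<?" encoded as two 16-bit units.
    if prefix.startswith(b"\x00\x3c\x00\x3f"):
        return "16-bit big-endian, no BOM"
    if prefix.startswith(b"\x3c\x00\x3f\x00"):
        return "16-bit little-endian, no BOM"
    # "<?" as single bytes (ASCII-compatible encodings such as UTF-8).
    if prefix.startswith(b"\x3c\x3f"):
        return "8-bit ASCII-compatible"
    return "unknown"


# "<?" written as big-endian 16-bit units, without a byte order mark.
big_endian_start = "<?".encode("utf-16-be")
print(big_endian_start.hex(" "))
print(guess_encoding_family(big_endian_start + "xml".encode("utf-16-be")))
```

The sketch checks the 16-bit patterns before the 8-bit one, so a document
starting with \x3c \x00 is not mistaken for a single-byte encoding.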

Chris von See
TechAdapt, Inc.

xml-dev: A list for W3C XML Developers.