ASCII control characters in XML

Steve Harris sharris at
Tue Apr 28 18:21:02 BST 1998

Is it possible to transport UTF-8-encoded text that includes some
characters in the range x0000-x001F (ASCII control characters)?
These codes are valid within UTF-8 (per RFC 2044), but the XML
specification clearly says that, apart from tab (x0009), line feed
(x000A), and carriage return (x000D), these codes do not constitute
'valid characters'. My application, which wraps Clark's "expat", dies upon
encountering codes in this range, citing well-formedness violations. I'm
looking for the proper method for transporting text that occasionally
includes these codes.

I've been RTFM'ing this for a while now, and I've found plenty of
archived discussion regarding raw binary data as PCDATA content, but
this seems closer to a common text-processing problem. Any advice or
further interpretation would be greatly appreciated.
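
To make the problem concrete, here is a minimal sketch in Python (not the C application described above) that uses the standard library's expat binding. It assumes the XML 1.0 Char production as the definition of a 'valid character'. The helper names is_xml_char, find_invalid_chars, and expat_accepts are hypothetical and chosen only for illustration.

```python
import xml.parsers.expat
from typing import List, Tuple


def is_xml_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_invalid_chars(text: str) -> List[Tuple[int, int]]:
    """Return (index, code point) for each character XML 1.0 disallows."""
    return [(i, ord(ch)) for i, ch in enumerate(text)
            if not is_xml_char(ord(ch))]


def expat_accepts(document: bytes) -> bool:
    """Parse a complete document with expat; False on any parse error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)  # True marks the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True


if __name__ == "__main__":
    # Illustrative sample text: x0001 is disallowed, tab is allowed.
    sample = "ok\x01\tend"
    print(find_invalid_chars(sample))
    print(expat_accepts(b"<a>\x01</a>"))
    print(expat_accepts(b"<a>\t</a>"))
```

Running a check like find_invalid_chars over the text before wrapping it in markup at least pinpoints which characters the parser will reject.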

Steven E. Harris
Software Engineer
1601 Fifth Avenue, Suite 1900
Seattle, Washington 98101
(206) 292-1001 x436

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at