ASCII control characters in XML

James K. Tauber jtauber at
Tue Apr 28 19:20:41 BST 1998

At 09:21 AM 4/28/98 -0700, Steve Harris wrote:
>Is it possible to transport UTF-8-encoded text that includes some
>characters in the byte range x0000-x001F (ASCII control characters)?
>These codes are valid within UTF-8 (via RFC2044), but the XML
>specification clearly says that these codes do not constitute 'valid

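You can't transport them so that an XML processor recognizes them as
characters in that range within a parsed entity. Apart from tab, line feed
and carriage return (x0009, x000A, x000D), they are not legal XML
characters, and a character reference such as &#7; is rejected as well.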
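
A quick way to see this for yourself is sketched below. It assumes Python
and its standard-library, expat-based xml.etree.ElementTree, which is only
my choice for illustration; is_well_formed is a made-up helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(doc: str) -> bool:
    """Return True if the parser accepts doc, False if it rejects it."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# A literal BEL (x0007) in character data is rejected ...
literal_bel = is_well_formed("<doc>ring\x07bell</doc>")

# ... and so is a numeric character reference to the same character.
reference_bel = is_well_formed("<doc>ring&#7;bell</doc>")

# Tab is one of the three characters below x0020 that XML 1.0 allows.
tab_ok = is_well_formed("<doc>ring\tbell</doc>")
```

Other processors may report the error differently, but a conforming one
should refuse the first two documents and accept the third.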

You could use processing instructions, something like <?char x0007?>, and
rely on the application to recover the characters.
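
As a rough sketch of what "rely on the application" could look like, here
is a SAX handler in Python (standard-library xml.sax). The PI target
"char" and the xNNNN payload are just the convention suggested above, and
CharPIHandler and recover_from_pis are hypothetical names.

```python
import re
import xml.sax


class CharPIHandler(xml.sax.ContentHandler):
    """Rebuild text, turning <?char xNNNN?> back into the character it names."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def characters(self, content):
        # Ordinary character data is passed through unchanged.
        self.parts.append(content)

    def processingInstruction(self, target, data):
        # "char" and the xNNNN payload are this application's own convention;
        # the XML processor itself attaches no meaning to them.
        match = re.fullmatch(r"x([0-9A-Fa-f]{4})", data.strip())
        if target == "char" and match:
            self.parts.append(chr(int(match.group(1), 16)))


def recover_from_pis(document: str) -> str:
    """Parse document and return its text with the encoded characters restored."""
    handler = CharPIHandler()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return "".join(handler.parts)


recovered = recover_from_pis("<doc>ring<?char x0007?>bell</doc>")
```

The control character only exists after the application has rebuilt the
text; on the wire the document contains nothing but legal XML characters.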

Alternatively, you could declare appropriate unparsed entities and refer to
them through an attribute, e.g. <char val="x0007"/>, having previously
declared x0007 as an unparsed entity and val as an attribute of type ENTITY.
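
A minimal sketch of the unparsed-entity route, again assuming Python's
xml.sax. The notation name "ctrl", the system identifiers, the doc element
and the rule that an entity named xNNNN stands for code point NNNN are all
assumptions made up for the example, not anything the XML spec defines.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, DTDHandler

# Internal subset declaring the notation, the unparsed entity and the
# ENTITY-typed attribute; all names here are illustrative only.
DOCUMENT = b"""<!DOCTYPE doc [
<!NOTATION ctrl SYSTEM "ctrl">
<!ENTITY x0007 SYSTEM "x0007" NDATA ctrl>
<!ELEMENT doc (#PCDATA | char)*>
<!ELEMENT char EMPTY>
<!ATTLIST char val ENTITY #REQUIRED>
]>
<doc>ring<char val="x0007"/>bell</doc>"""


class CharEntityHandler(ContentHandler, DTDHandler):
    """Rebuild text, mapping <char val="xNNNN"/> to the character it names."""

    def __init__(self):
        super().__init__()
        self.unparsed = set()
        self.parts = []

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        # Remember which entity names were declared as unparsed (NDATA).
        self.unparsed.add(name)

    def characters(self, content):
        self.parts.append(content)

    def startElement(self, name, attrs):
        if name == "char":
            entity = attrs.get("val")
            # Application convention (an assumption): xNNNN means code point NNNN.
            if entity in self.unparsed:
                self.parts.append(chr(int(entity[1:], 16)))


def recover_from_entities(document: bytes) -> str:
    """Parse document and return its text with the encoded characters restored."""
    handler = CharEntityHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.setDTDHandler(handler)
    parser.parse(io.BytesIO(document))
    return "".join(handler.parts)


recovered_from_entity = recover_from_entities(DOCUMENT)
```

As with the processing-instruction approach, the processor only reports
the declaration and the attribute value; turning them back into a control
character is entirely the application's job.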


James Tauber / jtauber at
Perth, Western Australia
XML Pages:
