ASCII control characters in XML

James K. Tauber jtauber at jtauber.com
Tue Apr 28 19:20:41 BST 1998


At 09:21 AM 4/28/98 -0700, Steve Harris wrote:
>Is it possible to transport UTF-8-encoded text that includes some
>characters in the byte range x0000-x001F (ASCII control characters)?
>These codes are valid within UTF-8 (via RFC2044), but the XML
>specification clearly says that these codes do not constitute 'valid
>characters'.

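Apart from tab (x0009), line feed (x000A) and carriage return (x000D), which
the XML specification does allow, you can't transport them in a way that lets
an XML processor recognize them as characters in that range within a parsed
entity. A character reference such as &#7; doesn't help either, because it
must also refer to a legal character.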

You could use processing instructions, something like <?char x0007?>, and
rely on the application to recover the characters.
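
As a rough sketch of what "rely on the application" might look like, here is
a small Python SAX handler that rebuilds the text, turning each <?char xHHHH?>
processing instruction back into the corresponding character. The PI target
"char" and the "xHHHH" hex convention are taken from the example above; the
class and function names (CharPIHandler, decode_pi) are made up for
illustration, and any real application would need to agree on its own
convention.

```python
import xml.sax


class CharPIHandler(xml.sax.ContentHandler):
    """Collects character data and expands <?char xHHHH?> instructions."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def characters(self, content):
        self.parts.append(content)

    def processingInstruction(self, target, data):
        # Assumed convention: target "char", data "x" followed by hex digits.
        if target == "char":
            code = data.strip()
            if code.startswith("x"):
                self.parts.append(chr(int(code[1:], 16)))


def decode_pi(xml_text):
    """Return the document text with control characters restored."""
    handler = CharPIHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return "".join(handler.parts)


if __name__ == "__main__":
    print(repr(decode_pi("<msg>ring<?char x0007?>bell</msg>")))
    # prints 'ring\x07bell'
```

The XML processor itself never sees a control character; it only hands the
PI to the application, which decides what it means.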

Alternatively, you could declare appropriate unparsed entities and refer to
them from an attribute, as in <char val="x0007"/> (having previously declared
x0007 as an unparsed entity and val as an attribute of type ENTITY).
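
A minimal sketch of the unparsed-entity approach, again in Python with SAX.
The element name "char", the attribute "val" and the entity name "x0007" come
from the example above; the notation name "ctrl", the urn:example identifiers
and the names EntityCharHandler and decode_entities are hypothetical. The
handler records the unparsed entities declared in the DTD and, for each
<char>, maps a declared entity name back to a character by reading the hex
digits in the name, which is just one possible convention.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, DTDHandler

# Hypothetical document: the notation name and system identifiers are invented.
DOC = b"""<!DOCTYPE msg [
  <!NOTATION ctrl SYSTEM "urn:example:control-character">
  <!ENTITY x0007 SYSTEM "urn:example:x0007" NDATA ctrl>
  <!ELEMENT msg (#PCDATA | char)*>
  <!ELEMENT char EMPTY>
  <!ATTLIST char val ENTITY #REQUIRED>
]>
<msg>ring<char val="x0007"/>bell</msg>"""


class EntityCharHandler(ContentHandler, DTDHandler):
    """Expands <char val="xHHHH"/> when xHHHH is a declared unparsed entity."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.unparsed = set()
        self.parts = []

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        # Called by the parser for each <!ENTITY ... NDATA ...> declaration.
        self.unparsed.add(name)

    def characters(self, content):
        self.parts.append(content)

    def startElement(self, name, attrs):
        if name == "char":
            val = attrs.get("val")
            # Assumed convention: entity name is "x" plus hex digits.
            if val in self.unparsed:
                self.parts.append(chr(int(val[1:], 16)))


def decode_entities(xml_bytes):
    """Return the document text with control characters restored."""
    handler = EntityCharHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.setDTDHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.parts)


if __name__ == "__main__":
    print(repr(decode_entities(DOC)))
    # prints 'ring\x07bell'
```

As with the processing-instruction version, the mapping from entity name to
character lives entirely in the application; the XML processor only reports
that the entity exists and that val names it.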

James

--
James Tauber / jtauber at jtauber.com
Perth, Western Australia
XML Pages: http://www.jtauber.com/xml/





