Escape mechanism using release character

Rick Jelliffe ricko at
Sat Jul 17 09:12:31 BST 1999

From: Richard Tobin <richard at>

>> Why is it that the well known escape mechanism of using a
>> release character (like '\') for escaping special characters
>> (eg. '<','&') not used in XML?
>Because XML is a subset of SGML which does not use such a mechanism.
>If XML had been a new system designed from scratch, it might well have
>been much simpler in many respects.  On the other hand, it would
>probably not have succeeded.

Actually, SGML does have such a mechanism: the Markup Suppress
Character. This could have been defined as "\" for XML.  I think I
remember Charles Goldfarb even raised this issue for XML during its

The reasons against it include these:
    1) it creates three kinds of delimiting: by CDATA sections, by
references, and by markup suppression. XML tried to remove duplication
unless there was a good reason;

    2) programmers have a lot of difficulty coping with delimiters
the appalling support for correct delimiters in first generation XML

    3) HTML and almost all SGML documents do not use this mechanism, so
you would be building in incompatibility;

    4) it creates another character with a special meaning that must be
as well as & and <, parsers must look for \ and people must delimit it
in text.

    5) the character "\" is problematic for Japanese in that the ASCII
code point for that character is used for the Yen character in
ShiftJIS: if we used that character, then it would rule out the class of
dumb applications that just understand the ASCII codepoints for
delimiter recognition and pass every other byte through.

    6) the character "\" is problematic in Taiwanese encodings, in that
its code value is used as a byte within multi-byte Big5 characters: if
we used that character, it would rule out the class of dumb applications
that just understand the ASCII code values of delimiters and pass
everything else through (there is already a potential for this problem
with [ and ] as used in CDATA sections, but "\" would be far worse).

    7) \ is often used in programming languages as an escape. As you
might know from shell languages, double delimiting is really tricky,
and if you need to triple delimit (e.g. use "\\\\" to represent "\\" to
represent \ in output) it gets complicated.
So it is common practise for markup languages to use different delimiter
than the delimiter delimiters of the embedded language; similarly it is
for XML processing languages to use different delimiter delimiters: e.g.
uses "%" not "\" or entities.

    8) Also, I think there is a good reason in that \ might encourage
the view that XML documents are delimited merely to fit into a pipeline
of processes:
adopted this approach for handling XML documents with CSS stylesheets in
which is why &amp;? gets treated like a processing instruction. But this
is wrong behaviour; in XML data is not tailored to a process, you
declare what you want.
So if I say &amp;?  I do not want a processing instruction start at my
XSL gets this very right in its approach.
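
As an illustration of points 5 and 6, here is a minimal Python sketch.
It assumes a hypothetical markup in which the byte 0x5C ("\") is a
release character, and a made-up "dumb" filter, strip_release_bytes,
that knows only ASCII code values. Neither is part of XML or of any real
parser; the sketch only shows why byte-level tools would stop being safe
on Big5 text.

```python
# Hypothetical byte-level filter: it understands only ASCII code values.
# It treats 0x5C ("\") as a release character, drops it, and passes the
# following byte (and every other byte) through unchanged.
def strip_release_bytes(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0x5C and i + 1 < len(data):
            i += 1  # skip the release byte, keep the byte it releases
        out.append(data[i])
        i += 1
    return bytes(out)


# On plain ASCII the filter does what it was written for.
ascii_result = strip_release_bytes(b"a\\<b")

# In Big5 the first character below is encoded with 0x5C as its second
# byte, so the same filter removes half of a character.
raw = "功能".encode("big5")
damaged = strip_release_bytes(raw)

print(ascii_result)
print(raw.hex(), damaged.hex())
```

A filter that decoded the bytes to characters before scanning would not
have this problem, but it would no longer be one of the dumb
applications that the points above are concerned with.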

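The layering described in point 7 can also be sketched in Python. The
function unescape_markup below is hypothetical: it stands in for a
markup processor that uses "\" as its release character, which XML does
not. The host language (Python here) already uses "\" as its own escape,
so every literal backslash has to be doubled twice.

```python
# Hypothetical: remove one level of "\" release characters, the way a
# backslash-escaping markup language would.
def unescape_markup(text: str) -> str:
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            # The character after the release character is literal.
            ch = next(chars, "")
        out.append(ch)
    return "".join(out)


# Layer 1: the Python literal "\\\\" denotes two backslash characters.
markup = "C:\\\\temp"
# Layer 2: the markup processor reduces those two to a single backslash.
output = unescape_markup(markup)

print(markup)
print(output)
```

Choosing a release character that the host language does not already
use, as point 7 describes, removes one of the two layers.
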
Rick Jelliffe

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at
Archived as: and on CD-ROM/ISBN 981-02-3594-1