CDATA by any other name... (was The raw and the cooked)

david at david at
Tue Nov 3 12:01:31 GMT 1998

Rick Jelliffe writes:

 > A CDATA marked section is not only a way to prevent delimiter
 > recognition.  It is also a way to declare that the characters in
 > that section are limited to ones available in the direct document
 > encoding of the originating system.  (SGML has a CDATA keyword you
 > can use instead of content models: XML was felt not to need it
 > because you could use <![CDATA[, however that perhaps shows the
 > mind of the XML WG at that time, in that they were down-playing the
 > need for schemas.) It declares "this section does not use character
 > references or entities or subelements". So, conceptually, it could
 > sometimes be markup, not merely delimiter recognition.

While I agree that there are always interesting new uses for markup
constructions, I think that we're straining here.  My basic rule in
system design is to keep things as simple and obvious as possible; if
I wanted to signal to my application that an element contained only a
certain type of information (such as a limited character repertoire), I
would use an attribute that made that point clear, either a NOTATION
attribute or a simple CDATA attribute named something like

That said, I don't see the usefulness of limiting content to a
specific character repertoire arbitrarily; I *do* see the usefulness in
combination with an "xml:lang" or "mime-type" attribute, though.  An
intelligent editor could already act on xml:lang to limit character
selection, if such a thing were desirable.
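
To make the attribute approach concrete, here is a minimal sketch in
Python using the standard xml.etree.ElementTree parser.  The attribute
name "repertoire", its value "ascii", and the sample document are
assumptions invented for this illustration; nothing in XML defines
them.  The sketch shows two things: the parser hands the application
the same text whether or not a CDATA section was used, and an
application can act on an explicit attribute (alongside xml:lang) to
check the characters in an element.

```python
import xml.etree.ElementTree as ET

# ElementTree reports xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document: the "repertoire" attribute is an assumption
# made for this sketch, not something defined by XML itself.
DOC = """<doc>
  <code repertoire="ascii"><![CDATA[if (a < b && c) { run(); }]]></code>
  <code repertoire="ascii">if (a &lt; b &amp;&amp; c) { run(); }</code>
  <note xml:lang="en" repertoire="ascii">caf\u00e9</note>
</doc>"""


def in_repertoire(text, repertoire):
    """Return True if every character of text is in the named repertoire."""
    if repertoire == "ascii":
        return all(ord(ch) < 128 for ch in text)
    # Unknown repertoire names impose no restriction in this sketch.
    return True


def check(root):
    """Collect (tag, xml:lang, text, ok) for each element declaring a repertoire."""
    results = []
    for el in root.iter():
        repertoire = el.get("repertoire")
        if repertoire is None:
            continue
        text = el.text or ""
        results.append((el.tag, el.get(XML_LANG), text,
                        in_repertoire(text, repertoire)))
    return results


root = ET.fromstring(DOC)
results = check(root)
for tag, lang, text, ok in results:
    print(tag, lang, repr(text), "ok" if ok else "outside declared repertoire")
```

The two <code> elements arrive with identical text, so the
application cannot tell that one of them used a CDATA section; the
constraint has to travel in the attribute.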

All the best,


David Megginson                 david at

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at
