CDATA by any other name... (was The raw and the cooked)

david at megginson.com
Fri Oct 30 16:33:03 GMT 1998


Richard L. Goerwitz III writes:

 > Just for fun, try running this through our validator:
 > 
 >    http://www.stg.brown.edu/service/xmlvalid/
 > 
 > I'm sure you can spot the other (accidental) error, in addition
 > to the one we've been discussing.

Yes, my gray-coloured organic validator caught that one when I saw it
quoted in Henry's post.

I agree with Richard that comments and PIs are also allowed in element
content (and I suspect that he's right about entity references as
well), but I disagree that CDATA sections are atomic: remember that
the XML 1.0 REC contains only syntactic productions, not a data model.
To help the discussion, here is the relevant clause from the XML 1.0
REC:

====================8<====================8<====================

2.7 CDATA Sections

CDATA sections may occur anywhere character data may occur; they are
used to escape blocks of text containing characters which would
otherwise be recognized as markup. CDATA sections begin with the
string "<![CDATA[" and end with the string "]]>":

 CDATA Sections
  [18] CDSect ::= CDStart CData CDEnd
  [19] CDStart ::= '<![CDATA['
  [20] CData ::= (Char* - (Char* ']]>' Char*)) 
  [21] CDEnd ::= ']]>'

Within a CDATA section, only the CDEnd string is recognized as markup,
so that left angle brackets and ampersands may occur in their literal
form; they need not (and cannot) be escaped using "&lt;" and
"&amp;". CDATA sections cannot nest.

An example of a CDATA section, in which "<greeting>" and "</greeting>"
are recognized as character data, not markup:

 <![CDATA[<greeting>Hello, world!</greeting>]]>

====================8<====================8<====================

This clause doesn't really give us enough to settle the question, but
one could choose to attach some weight to the statement in the opening
paragraph that CDATA sections are used to "escape blocks of text", and
to the statement that in the final example the contents of the section
"are recognized as character data".
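
To make the "escaping, not atomic" reading concrete, here is a minimal
sketch using Python's standard xml.etree.ElementTree module. The three
sample documents and the helper name text_of are my own assumptions for
illustration; the REC does not require a parser to report things this
way, so this shows only how one parser chooses to treat a CDATA section
as ordinary character data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents (not taken from the thread).
CDATA_DOC = "<doc><![CDATA[<greeting>Hello, world!</greeting>]]></doc>"
ESCAPED_DOC = "<doc>&lt;greeting&gt;Hello, world!&lt;/greeting&gt;</doc>"
MIXED_DOC = "<doc>Hello, <![CDATA[<world>]]>!</doc>"


def text_of(document):
    """Parse a document and return the character data of its root element."""
    return ET.fromstring(document).text


cdata_text = text_of(CDATA_DOC)
escaped_text = text_of(ESCAPED_DOC)
mixed_text = text_of(MIXED_DOC)

# The CDATA form and the entity-escaped form yield the same character data.
print(cdata_text == escaped_text)

# A CDATA section in the middle of text is merged with its neighbours;
# this API does not report it as a separate unit.
print(mixed_text)
```

With this particular parser the CDATA boundaries disappear entirely,
which is consistent with reading 2.7 as describing an escaping
mechanism. An API that chooses to preserve the boundaries is making a
data-model decision that the syntactic productions do not make for it.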


All the best,


David

-- 
David Megginson                 david at megginson.com
           http://www.megginson.com/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)