CDATA by any other name... (was The raw and the cooked)

Richard L. Goerwitz III richard at goon.stg.brown.edu
Fri Oct 30 16:13:21 GMT 1998


> So, Henry's asking whether this is valid:
> 
>   <!DOCTYPE a [
>     <!ELEMENT a (b, c)>
>     <!ELEMENT b EMPTY>
>     <!ELEMENT c EMPTY>
>   ]>
>   <a><![CDATA[  ]><b/><c/></a>

Just for fun, try running this through our validator:

   http://www.stg.brown.edu/service/xmlvalid/

I'm sure you can spot the other (accidental) error, in
addition to the one we've been discussing.

As for the marked section:

By our interpretation, CDSects are atomic, and make up
document content.  They aren't the same thing as CharData:

   content ::= (element | CharData | Reference | CDSect | PI
               | Comment)*
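
One way to see the distinction outside the grammar is to watch what
a parser reports. The following is a minimal sketch using Python's
standard expat binding, not our validator; the function name
content_events and the event labels are made up for illustration,
and the sample uses a properly terminated CDATA section. Expat's
event model is only an analogy to the production above, but it does
announce the section's boundaries separately from ordinary
character data.

```python
import xml.parsers.expat


def content_events(xml_text):
    """Return the events expat reports for xml_text, in document order."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    # Deliver adjacent character data in one piece where possible.
    parser.buffer_text = True
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    # CDATA section boundaries get their own callbacks.
    parser.StartCdataSectionHandler = lambda: events.append(("cdsect-open",))
    parser.EndCdataSectionHandler = lambda: events.append(("cdsect-close",))
    parser.CharacterDataHandler = lambda data: events.append(("chars", data))
    parser.Parse(xml_text, True)
    return events


events = content_events("<a><![CDATA[  ]]><b/><c/></a>")
for event in events:
    print(event)
```

The section shows up as its own bracketed item ahead of the b
element, rather than as anonymous whitespace.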

By our interpretation, comments, processing instructions,
and whitespace are allowed in places where content is not,
e.g., after the document type declaration but before the
start-tag of the document element, or after its end-tag:

   Misc ::= Comment | PI | S

By our interpretation also, comments, processing instructions,
and whitespace may separate elements in cases like the above.
Here's a brief rewrite that illustrates:

   <a><!-- comment --><b/><?pistuff processing instruction ?><c/> </a>
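
The rule can be sketched as a small check over the children of one
element. This is a minimal illustration of the reading described
here, not the validator's actual code: the function name
matches_sequence, the (kind, value) pairs, and the restriction to a
plain sequence model such as (b, c) are all assumptions made for
the example.

```python
# Item kinds loosely follow the names in the XML 1.0 `content` production.
SKIPPABLE_KINDS = {"Comment", "PI"}


def matches_sequence(children, expected):
    """Check one element's children against a sequence model like (b, c).

    `children` is a list of (kind, value) pairs. Comments, PIs, and
    whitespace-only CharData may separate the child elements; a CDSect
    may not, even when it holds nothing but spaces.
    """
    names = []
    for kind, value in children:
        if kind in SKIPPABLE_KINDS:
            continue
        if kind == "CharData" and value.strip(" \t\r\n") == "":
            continue
        if kind == "element":
            names.append(value)
            continue
        # CDSect, non-blank CharData, or anything else is not allowed here.
        return False
    return names == list(expected)


# The rewrite with a comment, a PI, and trailing whitespace.
rewrite = [
    ("Comment", " comment "),
    ("element", "b"),
    ("PI", "pistuff processing instruction "),
    ("element", "c"),
    ("CharData", " "),
]
# The quoted example, with a whitespace-only CDATA section before <b/>.
quoted = [("CDSect", "  "), ("element", "b"), ("element", "c")]

print(matches_sequence(rewrite, ["b", "c"]))  # True
print(matches_sequence(quoted, ["b", "c"]))   # False
```

Treating the CDSect as an item of its own kind, rather than folding
it into CharData, is what makes the second case fail.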

Entity references (which act a bit like preprocessor macros)
may also take the place of comments, PIs, or whitespace, if
they expand to whitespace or an empty string.
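
As a rough illustration, here is a sketch using Python's standard
ElementTree module, which expands internal entities but does not
validate. The entity names ws and nothing are invented for the
example. After expansion, only whitespace (or nothing at all) is
left between and after the two child elements.

```python
import xml.etree.ElementTree as ET

# "ws" and "nothing" are hypothetical entities declared for this example.
doc = """<!DOCTYPE a [
  <!ELEMENT a (b, c)>
  <!ELEMENT b EMPTY>
  <!ELEMENT c EMPTY>
  <!ENTITY ws " ">
  <!ENTITY nothing "">
]>
<a><b/>&ws;<c/>&nothing;</a>"""

root = ET.fromstring(doc)

# Child elements in document order.
child_names = [child.tag for child in root]
# Text following each child, with the entity references already expanded.
text_after_b = root[0].tail or ""
text_after_c = root[1].tail or ""

print(child_names)
print(repr(text_after_b), repr(text_after_c))
```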

That at least is how we thought about things, and why we
implemented our validator the way we did.

Richard Goerwitz
Scholarly Technology Group

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



