CDATA by any other name... (was The raw and the cooked)

david at megginson.com
Fri Oct 30 16:24:22 GMT 1998


Henry S. Thompson writes:

 > <david at megginson.com> writes:
 > 
 > > So, Henry's asking whether this is valid:
 > > 
 > >   <!DOCTYPE a [
 > >     <!ELEMENT a (b, c)>
 > >     <!ELEMENT b EMPTY>
 > >     <!ELEMENT c EMPTY>
 > >   ]>
 > >   <a><![CDATA[  ]><b/><c/></a>

And I'll answer my original posting and say that it's not valid
because it's not well-formed (the CDATA section is closed with "]>"
rather than "]]>") -- let's try

  <!DOCTYPE a [
    <!ELEMENT a (b, c)>
    <!ELEMENT b EMPTY>
    <!ELEMENT c EMPTY>
  ]>
  <a><![CDATA[  ]]><b/><c/></a>

instead, and continue the discussion from there.
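
For anyone who wants to check the well-formedness point mechanically,
here is a minimal sketch using Python's standard xml.dom.minidom
module (built on expat).  The names DTD, BROKEN, FIXED and
is_well_formed are mine, purely for illustration.  Note that expat is
a non-validating parser, so this says nothing about whether the
corrected document is *valid*.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError

# The internal DTD subset shared by both versions of the example.
DTD = """<!DOCTYPE a [
  <!ELEMENT a (b, c)>
  <!ELEMENT b EMPTY>
  <!ELEMENT c EMPTY>
]>
"""

# First version: the CDATA section is closed with "]>" and so never ends.
BROKEN = DTD + "<a><![CDATA[  ]><b/><c/></a>"

# Corrected version: the CDATA section is closed with "]]>".
FIXED = DTD + "<a><![CDATA[  ]]><b/><c/></a>"


def is_well_formed(text):
    """Return True if the (non-validating) parser accepts the text."""
    try:
        xml.dom.minidom.parseString(text).unlink()
        return True
    except ExpatError:
        return False
```

Calling is_well_formed(BROKEN) and is_well_formed(FIXED) shows which
of the two versions the parser accepts; whether the second is valid
against the DTD is the question still on the table.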

 > What he said.  The DOM made a serious mistake here in my opinion:
 > it's stranded in no-person's-land between raw and cooked, without
 > being either.  It's not cooked, because it gives you
 > EntityReference and CDATA nodes.  It's not raw, because it DOESN'T
 > give you character entity references.

The DOM level-one core serves two constituencies -- authoring tools
that need to do horizontal transformations (XML=>XML, where the result
replaces the original) and processing/rendering tools that need to do
downstream processing (XML=>XML or XML=>X, where the original remains
unaltered).  Horizontal transformations will usually be somewhat
lossy, and the DOM WG has clearly decided that only a few lexical
features were important enough to give a good cost/benefit return on
the effort required to specify and implement them.

However, the point is that a specific DOM tree doesn't *have* to
include nodes for comments, CDATA sections, and entity references --
they are there only to support very specialised applications and
should be stripped out for ordinary XML processing.
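
As a rough illustration of that kind of stripping, here is a sketch
against Python's xml.dom.minidom, which is only one DOM implementation
among many.  The function name cook is hypothetical, not anything from
the DOM spec.  It drops comment nodes and replaces CDATA section nodes
with ordinary text nodes; as far as I know minidom already expands
entity references while parsing, so there are no EntityReference nodes
for it to deal with.

```python
from xml.dom import Node, minidom


def cook(node):
    """Strip lexical detail from a DOM subtree, in place.

    Comments are removed, CDATA sections become plain text nodes,
    and adjacent text nodes are then merged.
    """
    # A Document node has no ownerDocument, so fall back to the node itself.
    doc = node.ownerDocument or node
    for child in list(node.childNodes):
        if child.nodeType == Node.COMMENT_NODE:
            node.removeChild(child)
        elif child.nodeType == Node.CDATA_SECTION_NODE:
            node.replaceChild(doc.createTextNode(child.data), child)
        else:
            cook(child)
    # Merge the text nodes left next to each other by the edits above.
    node.normalize()
    return node


doc = minidom.parseString("<a>x<!-- note --><![CDATA[<y>]]>z</a>")
cook(doc)
root = doc.documentElement
```

After the call, root has a single text child, so downstream code sees
only elements and text and never needs to know how the characters
were originally marked up.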


All the best,


David

-- 
David Megginson                 david at megginson.com
           http://www.megginson.com/
