CDATA by any other name... (was The raw and the cooked)
Tim Bray
tbray at textuality.com
Sun Nov 1 03:21:14 GMT 1998
At 10:36 AM 10/30/98 -0500, david at megginson.com wrote:
>So, Henry's asking whether this is valid:
>
> <!DOCTYPE a [
> <!ELEMENT a (b, c)>
> <!ELEMENT b EMPTY>
> <!ELEMENT c EMPTY>
> ]>
> <a><![CDATA[ ]]><b/><c/></a>
>
>I'd like to hear Tim Bray's opinion, unless I've missed it already in
>this thread (are you reading this, Tim, or alternatively, do you have
>an e-mail filter that looks for your name?).
Yes and no, respectively. I've been lurking, hoping that someone
would post something definitive.
The more I think about it, the more I think it's valid, because white
space between child elements is OK, and the fact that the white space
is in a CDATA section doesn't mean it's not white space. Chris
Lovett argued that it would be OK if the white space were in
an entity reference, which I think is a strongly linked problem (although
I couldn't follow Chris' reasoning about why MSXML thinks this
CDATA section is invalid). Larval agrees with me, by the way, because
the CDATA recognizer does its work first and the validator only ever sees
white space.
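
To make the "recognizer runs first" point concrete, here is a minimal sketch using Python's standard expat binding rather than Larval (a swapped-in, non-validating parser; the helper name collect_events is made up for illustration). It only shows what a downstream consumer such as a validator would be handed for David's document, not how any particular validator decides.

```python
import xml.parsers.expat

# David's example document, reproduced from the quoted message.
DOC = """<!DOCTYPE a [
<!ELEMENT a (b, c)>
<!ELEMENT b EMPTY>
<!ELEMENT c EMPTY>
]>
<a><![CDATA[ ]]><b/><c/></a>"""


def collect_events(document):
    """Hypothetical helper: record the events a consumer sees from expat."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    # No CDATA-section handlers are installed, so the section's content
    # arrives as ordinary character data with the delimiters already stripped.
    parser.CharacterDataHandler = lambda data: events.append(("chars", data))
    parser.Parse(document, True)
    return events


events = collect_events(DOC)
chars = "".join(data for kind, data in events if kind == "chars")
print(events)
print(repr(chars), chars.isspace())
```

In this setup the only character data inside <a> is the white space that was wrapped in the CDATA section; the "<![CDATA[" and "]]>" delimiters never reach the consumer.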
However, the rule that applies is section 3, validity constraint
"Element Valid", list item 2, which I quote:
2. The declaration matches children and the sequence of child elements
belongs to the language generated by the regular expression in the
content model, with optional white space (characters matching the
nonterminal S) between each pair of child elements.
Of course, the interpolation "(characters matching the nonterminal S)"
could lead a pedant to claim that "<![CDATA[" doesn't match S, so there
you go. So if only on these grounds, David's claim that the spec should
be clarified in this respect has some merit.
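
The two readings can be sketched as follows. The S production is taken from the XML 1.0 grammar; the names matches_S, raw_source and parsed_chars are assumptions for illustration, and which of the two strings the constraint is meant to be tested against is exactly the open question.

```python
import re

# S ::= (#x20 | #x9 | #xD | #xA)+   (white space production in XML 1.0)
S = re.compile(r"[\x20\x09\x0D\x0A]+")


def matches_S(text):
    """Hypothetical helper: True if the whole string matches nonterminal S."""
    return S.fullmatch(text) is not None


# Pedantic reading: test the literal markup that appears in the document.
raw_source = "<![CDATA[ ]]>"

# Lenient reading: test the characters left after CDATA recognition.
parsed_chars = " "

print(matches_S(raw_source))    # the markup itself does not match S
print(matches_S(parsed_chars))  # the character content does match S
```

Under the first reading the document would be invalid, under the second it would be valid.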
I must say I'm somewhat pleased by the fact that there are voices
in xml-dev simultaneously saying "The spec needs to be rewritten in
clear, limpid, prose" singing counterpoint with others saying "The
spec needs to be rewritten in formal logic", on the same grounds that
if both the IRA and the Unionist paramilitaries are denouncing you,
you're probably doing the right thing. Actually, both factions have a
point. The writing in the spec could be better, and so could the
formalism. It was the best Michael and I, with the committee's support,
could do in the time available, and it's been enough to limp along
with so far.
I do *not* agree that XML won't come into its own until we bypass
all the syntax and think only in terms of abstract data structures.
Having watched this profession for 20 years, I have come to
believe that a truly interoperable API is very nearly an oxymoron;
but syntax is something we know how to interoperate with. Also I
just don't believe that there is One True data model for XML.
-Tim
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)