XML 1.0 spec appears to violate itself

David Brownell david-b at pacbell.net
Wed Aug 25 03:39:06 BST 1999


Paul Grosso wrote:
> 
> It is true that the line Takuki Kamiya found is in error.
> That much is clear.

Agreement!

 
> At 12:01 1999 08 24 -0700, David Brownell wrote:
> >It's clear that it _is_ permitted to redeclare entities (including the
> >predeclared ones) in the internal subset.  See section 4.6 where it
> >talks about various "may" (may redeclare) and "must" (must still be WF,
> >and not change the standard effect) cases, none of which could work
> >if such redeclaration were disallowed.
> 
> But if anything else is clear, it's that this issue that David claims
> is clear isn't clear.

I'll disagree ... Paul, there are two issues you're conflating, and
we clearly _do_ agree about the one I was talking about:

  (1)	May the built-in entities be declared?

  (2)	If so, what constraints exist on their values?

My statement was about the former -- and I pointedly ignored the latter
issue.  Re the former, the XML spec says "For interoperability, valid
XML documents should _declare_ these entities".  (Emphasis added.)

Your statement is about the latter issue.  You're saying that folk using
the "<" syntax (or "&#x3c;" etc) are under the influence of
the dark side of XML, and good Jedi XML coders use "&#38;#60;" instead.
For my thoughts on that issue, read on.


> I was the one who argued this before, and I don't want to reopen the
> discussion,

(But you most certainly did :-)


>	 but I continue to read the following text (quoted from 4.6
> aka http://www.w3.org/TR/REC-xml#sec-predefined-ent):
> 
>   If the [predefined] entities in question are declared,
>   they must be declared as internal entities whose replacement
>   text is the single character being escaped or a character reference
>   to that character, as shown below.
> 
> [which is then followed by a list of declarations for the 5 predefined
> entities] to say that it is not permissible to declare those 5 entity
> references any differently than as shown in that table.

There's a world of difference between that text, where "as shown below"
introduces an example where entity replacement values conform to that
constraint on replacement texts, and the text you appear to be reading:

	If the entities in question are declared, they must
	be declared as internal entities whose _literal entity
	value is_ as shown below.

There is no "table"; the declarations use the same "<eg>...</eg>" element
which is used to introduce examples into the text.  In the DTD it's even
documented as being used for "illustrations".  (If it were a table, I'd be
more likely to grant that your interpretation is arguable.)

So:  Jedi XML coders need not worry about which of the many literal
entity values they use, so long as their replacement texts conform
to the requirement in the specification.
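
A minimal sketch of that distinction, in Python: the literal entity value
is what sits between the quotes in the declaration, and the replacement
text is what is left once the character references in it have been
expanded a single time.  The names replacement_text, conforms and
PREDEFINED are made up for this illustration, and conforms encodes only
the reading of 4.6 argued above, not any official conformance test.

```python
import re

# Matches decimal (&#60;) and hexadecimal (&#x3c;) character references.
_CHAR_REF = re.compile(r"&#(?:x([0-9a-fA-F]+)|([0-9]+));")

# The five predefined entities and the characters they escape.
PREDEFINED = {"lt": "<", "gt": ">", "amp": "&", "apos": "'", "quot": '"'}


def replacement_text(literal_value: str) -> str:
    """Expand character references once, left to right, without rescanning."""
    def expand(match):
        hex_digits, dec_digits = match.groups()
        return chr(int(hex_digits, 16) if hex_digits else int(dec_digits))
    return _CHAR_REF.sub(expand, literal_value)


def conforms(name: str, literal_value: str) -> bool:
    """Hypothetical check: is the replacement text the escaped character
    itself, or a character reference to that character?"""
    text = replacement_text(literal_value)
    char = PREDEFINED[name]
    if text == char:
        return True
    return _CHAR_REF.fullmatch(text) is not None and replacement_text(text) == char


if __name__ == "__main__":
    for value in ("&#38;#60;", "&#60;", "&#x3c;", "less than"):
        print(repr(value), repr(replacement_text(value)), conforms("lt", value))
```

Under that reading, "&#38;#60;", "&#60;" and "&#x3c;" are all acceptable
literal entity values for lt, while an unrelated value such as
"less than" is not.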

(However, good Jedi will certainly avoid a fight on this issue, since
the XML federation doesn't need such declarations.  It's the forces of
the SGML empire which need them.)


> Perhaps it is possible to read the "as" in "as shown below" to mean
> "in a vaguely similar manner that does not change the standard effect",
> but that is a somewhat non-standard meaning in my experience.  Nowhere
> in 4.6 does the word "redeclare" appear (as in "may redeclare"). 

Odd -- nowhere in my post does "in a vaguely similar manner" appear.   

The spec DOES use the word "declared" in 4.6, and in 4.2 says that
entities may be "declared" multiple times.  It didn't seem like an
unnatural leap to use the word "redeclared", but if it helps you,
please feel free to strike the "re".
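
For what it's worth, section 4.2 also says that when an entity is declared
more than once, the first declaration encountered is binding.  A quick
check with Python's standard xml.etree.ElementTree (which sits on top of
expat) might look like the sketch below.  The entity name e and the
element name doc are invented for the example, and it shows the behaviour
of one parser only, not a claim about every processor.

```python
import xml.etree.ElementTree as ET

# Internal subset that declares the same general entity twice.
DOC = """<!DOCTYPE doc [
  <!ENTITY e "first">
  <!ENTITY e "second">
]>
<doc>&e;</doc>"""


def expanded_text(xml_source: str) -> str:
    """Parse the document and return the root element's text content."""
    return ET.fromstring(xml_source).text


if __name__ == "__main__":
    print(expanded_text(DOC))
```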


>	 In
> fact, after a couple uses of "may" in the first paragraph unrelated to
> the predefined entities themselves, the word "may" does not appear in
> section 4.6.  It is at least reasonable to believe (as I do) that the
> "must be declared" in section 4.6 which I include in my quote above
> modifies, among other words, the "as shown below" phrase implying that
> it is an error to have a declaration for any of these 5 entities that
> differs from that shown in the table.

Keep in mind that the "must" is defined in section 1 to indicate "error"
which is then elaborated as "results are undefined".  Even if your
interpretation were generally agreed to be correct, then conformant XML
processors would still be completely free to ignore such "error" cases.

- Dave
 
p.s. This really made me wonder what the answer was to the question
	of how many angels can dance on the head of a pin.  Altavista
	gave me an answer ... 10,381,120 answers, to be precise!

	However, I'm still not happy with that result, and I'm now
	engaged on what may be a fruitless Internet search for the
	answer to that question.  OK, so maybe I'll get back to that
	USB driver soon ... but surely the Internet should be able
	to answer such burning questions for us?  :-)

	A page with some more interesting questions is:

		http://www.nodus-1.com/13th_century.htm

	They include test results for the Buttered Cat Array.  


> 
> paul
>
