Validating Entities (was Re: XML Torture Test: Parsers Fail)

Richard Goerwitz richard at
Thu Apr 8 01:52:22 BST 1999

Tim Bray wrote:

> I'm with David Megginson here - you really have to stand on one
> leg and not think of the word "rhinoceros" to see the XML spec as
> mandating the checking of unreferenced entities.

You're not going to see me arguing that you intended it to be read
the rhino way.  Nor are you going to see me claim that this was any-
body else's actual intent.

My contention is that if the spec actually says that a validating
parser may check entities only when they are used, it does so in a way
that requires exegesis.  Someone coming at the spec fresh, with little
background in SGML and without any foreknowledge of where you are
headed, could easily miss it.

It may sound ludicrous, but I remember it taking me several readings
of the spec, a look at some rather complex XML documents, and some
preliminary implementation work, to realize that, for validating
parsers, unreferenced entities must be left unchecked.
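
A minimal sketch of the "checked only when referenced" behavior, assuming
Python's standard xml.parsers.expat module.  Note that expat is a
non-validating parser, so this illustrates only the well-formedness side of
the question, not what a validating parser does.  The entity name "bad", the
two sample documents, and the helper is_well_formed are hypothetical names
chosen for illustration.

```python
import xml.parsers.expat

# The entity's replacement text opens an element and never closes it.
# The declaration itself is legal; the text is only a problem once it
# is actually expanded inside the document content.
DTD = '<!DOCTYPE doc [<!ENTITY bad "<open>no close">]>'

UNREFERENCED = DTD + '<doc>text</doc>'   # declares "bad", never uses it
REFERENCED = DTD + '<doc>&bad;</doc>'    # expands "bad" in content


def is_well_formed(text):
    """Return True if expat parses the whole document without error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)  # True marks this as the final chunk
        return True
    except xml.parsers.expat.ExpatError:
        return False


if __name__ == "__main__":
    print("unreferenced:", is_well_formed(UNREFERENCED))
    print("referenced:  ", is_well_formed(REFERENCED))
```

With expat, the first document should parse cleanly and the second should be
rejected, since the replacement text is examined only at the point of
reference.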

If this is what's happened in IE, then yes, it's somewhat mysterious
how they could have gotten as far as they did without realizing there
were problems.  But although DM has given me reason to think the spec
is clearer than I have maintained, the fact is that I understand how
they might have initially gone astray.

> > (Incidentally, does it bother anyone else that you can have valid
> > documents that aren't well-formed?

It's probably not worth explaining what I meant to say here.


Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard at

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at
Archived as: and on CD-ROM/ISBN 981-02-3594-1
