Is validity an option?

Paul Prescod paul at prescod.net
Thu Apr 1 17:24:12 BST 1999


In the XSL-List, Ken Sall quoted Tim Bray:
> 
> http://www.xml.com/axml/testaxml.htm
> 
>         Tim Bray's annotations:
> 
> "Validity Is Not An Option
> 
> XML evangelists, such as myself, take great glee in pointing out that
> XML, unlike SGML, has no optional features; the result, we claim
> triumphantly, is that any XML processor in the world should be
> able to read any XML document in the world (well, modulo character
> encoding issues).
>
> "Aha!" claim some ungrateful doubting Thomases; "XML
> distinguishes well-formedness and validity, and that's an option!"
>
> Wrong. Anything that's well-formed is an XML document,
> and any XML processor has to be able to read any well-formed
> document. If a document wants to aspire to the higher karmic plane
> of validity, well good on it, but that's an extra, not an optional
> feature of XML."

First, validity is an optional feature of *parsers*. I believe this to be
self-evident. I have heard the "no optional features" statement
interpreted in three different ways -- obviously Tim chooses to interpret
it in a way that allows XML to have none.

Second, the XML specification is quite clear about the fact that different
XML processors can legally produce different parse trees for the same
data. Heck, they can produce a different parse tree depending on the day
of the month. The specification itself says:

"The behavior of a validating XML processor is highly predictable; it must
read every piece of a document and report all well-formedness and validity
violations. Less is required of a non-validating processor; it need not
read any part of the document other than the document entity."
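
To make the point concrete, here is a minimal sketch using the expat
binding in Python's standard library. The document, the DTD text, and the
helper names (root_attributes, load_external) are hypothetical
illustrations of my own, and the in-memory DTD stands in for a real fetch
of doc.dtd. It parses the same document twice, once ignoring the external
DTD subset and once reading it, and collects the attributes reported for
the root element.

```python
from xml.parsers import expat

# Hypothetical document and external DTD subset, kept in memory so the
# sketch needs no files or network access.
DOCUMENT = '<!DOCTYPE doc SYSTEM "doc.dtd">\n<doc/>'
EXTERNAL_DTD = '<!ATTLIST doc status CDATA "draft">'


def root_attributes(fetch_external):
    """Return the root element's attributes as reported by expat."""
    seen = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = (
        lambda name, attrs: seen.append((name, dict(attrs)))
    )

    if fetch_external:
        def load_external(context, base, system_id, public_id):
            # Stand-in for fetching system_id: parse the in-memory DTD
            # with a child parser that shares the parent's DTD state.
            child = parser.ExternalEntityParserCreate(context)
            child.Parse(EXTERNAL_DTD, True)
            return 1

        # Ask expat to read the external subset and route the request
        # to the handler above.
        parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
        parser.ExternalEntityRefHandler = load_external

    parser.Parse(DOCUMENT, True)
    return seen[0][1]


if __name__ == "__main__":
    print("external subset ignored:", root_attributes(False))
    print("external subset read:   ", root_attributes(True))
```

As far as I can tell, root_attributes(False) reports no attributes for
<doc/> while root_attributes(True) reports the defaulted status attribute,
and both runs are legal for a non-validating processor.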

To be perfectly honest, I am a lot more comfortable with the SGML world's
model: some documents are not processable by some parsers, but if a
parser says it can handle a document, then you always know what you are
getting out.

Perhaps it isn't too late -- maybe the information set group could fix
this flaw. After all, they are in the business of ensuring conformance of
processors, so the next step would be to rigorously specify conformance
classes: "validating", "external-entity-fetching non-validating", and
"non-external-entity-fetching non-validating."

Handling three classes (three optional features!) is a hassle, but in the
current situation the parser decides what to do about external entities
all by itself!
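
The three classes proposed above could be pinned down as a small table of
required behaviour. This is only a sketch of the idea: the names
ConformanceClass, Behaviour, and declared_class are hypothetical, and the
two yes/no properties are my reading of the class names, not anything the
specification defines.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class Behaviour:
    reads_external_entities: bool
    reports_validity_errors: bool


class ConformanceClass(Enum):
    # Each hypothetical class fixes both behaviours, so two processors
    # claiming the same class would have to agree on what they read.
    VALIDATING = Behaviour(True, True)
    FETCHING_NON_VALIDATING = Behaviour(True, False)
    NON_FETCHING_NON_VALIDATING = Behaviour(False, False)


def declared_class(reads_external_entities, reports_validity_errors):
    """Map a processor's two behaviours to its class, if one exists."""
    wanted = Behaviour(reads_external_entities, reports_validity_errors)
    for cls in ConformanceClass:
        if cls.value == wanted:
            return cls
    # A processor that validates without reading external entities fits
    # none of the three classes.
    raise ValueError("no such conformance class")
```

A processor would then announce one class, for example
declared_class(True, False), instead of deciding silently per document.
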
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Other Operating Environments Will Have Trouble Keeping up with Linux's
Growth"
 - http://www.idc.com/Data/Software/content/SW033199PR.htm
   International Data Corporation bulletin