A call for reason

rev-bob at gotc.com
Tue Nov 30 18:16:55 GMT 1999


> >So as it stands, there will exist valid SML documents that are not well
> >formed XML and will therefore trigger a fatal error if given to an XML
> >parser.
> 
> I was talking about removing an error reporting requirement
> because I believe XML WG went overboard with the error
> reporting requirements which places a heavy burden on
> the performance of XML parsers.

In other words, the WG went overboard in specifying that when something is broken, 
the parser must say so?  I don't consider that overboard at all.

> IMHO, HTML crowd got burnt badly with HTML's forgivable parsers
> and over-reacted.  There are other ways to solve this sort of
> problems without resorting to high tariff on all.

I think the UA authors finally got sick of having to code bloated "forgiving" parsers 
because the HTML spec required a huge amount of tolerance, but I could be wrong.  
After all, this is touchy-feely 1999, where getting the right answer to 1+1 is not as 
important as how you FEEL about the answer you got.  Am I the only programmer left 
who *likes* getting syntax errors that stop production cold, because they give you 
an error you can fix quickly instead of a problem you may not even notice until it's too 
late?

Way back in the Dark Ages of my computer education, my first real computer science 
teacher told me something very valuable.  There are three sorts of errors in 
programming, each more serious than the last.  The first is a syntax error, and I love 
getting those - because they tell you exactly where the problem is and what's wrong.  
The compiler is supposed to halt compilation on those, because the output is going to be 
garbage anyway.  ("It's broke, so get it fixed.")  The second type is a run-time error - all 
the syntax looks right, but something critical breaks when the program runs.  (See GPF, 
memory allocation errors, etc.)  Here again, you get an error message, so you at least 
know what the problem is.  If you're lucky (or thought ahead), you even get some 
information that helps you track it down.  The third and most serious error is the logic 
error - where the program runs smoothly, but the results are incorrect.  (For instance, 
running a photo of a tree through a rendering system and getting an antelope.)  The 
syntax is right, the resources are cool, but your algorithm is screwed up...and *that's* 
what you have to figure out on your own.
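To make the three tiers concrete, here is a minimal sketch in Python (my choice of language for illustration, nothing to do with any XML parser).  The classify() helper and the four snippets are made up for this example; it only shows how much help you get at each tier.

```python
def classify(source, expected):
    """Say which kind of error a snippet that assigns `result` exhibits.

    Hypothetical helper for illustration only.
    """
    try:
        # Tier 1: the compiler refuses malformed source outright.
        code = compile(source, "<snippet>", "exec")
    except SyntaxError:
        return "syntax error"

    namespace = {}
    try:
        # Tier 2: the source is well formed but something breaks while running.
        exec(code, namespace)
    except Exception:
        return "run-time error"

    # Tier 3: nothing complained, so only a check against a known answer
    # can tell us the result is wrong.
    if namespace.get("result") != expected:
        return "logic error"
    return "ok"


print(classify("result = (1 + 1", 2))   # unbalanced parenthesis
print(classify("result = 1 + one", 2))  # undefined name at run time
print(classify("result = 1 * 1", 2))    # runs fine, wrong answer
print(classify("result = 1 + 1", 2))    # runs fine, right answer
```

Note that the first two tiers announce themselves, while the third is only caught because we happened to know the answer should be 2.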

These stringent error-reporting requirements that you complain about cover syntax errors, 
and sometimes run-time errors - and the WG is quite correct in requiring the parser to kick 
nonconforming documents out instead of trying to figure out what they meant to say.  If 
it's broken at that basic a level, it doesn't need to be set loose - make the author fix the 
errors first.  (And if it's program-generated code, that's an even better reason for this 
requirement - the error will show up in a log, and that lets the software author know 
there's a problem.)
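As a rough sketch of what "kick it out and log it" looks like in practice, here is Python's standard xml.etree.ElementTree parser refusing a document with a mismatched tag.  The check_well_formed() wrapper and the sample documents are my own assumptions for illustration; only fromstring() and ParseError come from the library.

```python
import xml.etree.ElementTree as ET


def check_well_formed(document):
    """Return a log-style message saying whether the parser accepted the input.

    Hypothetical wrapper for illustration only.
    """
    try:
        ET.fromstring(document)
    except ET.ParseError as err:
        # err.position is a (line, column) pair pointing at the problem.
        line, column = err.position
        return "rejected at line %d, column %d: %s" % (line, column, err)
    return "accepted"


print(check_well_formed("<doc><item>ok</item></doc>"))
print(check_well_formed("<doc><item>oops</doc>"))  # <item> is never closed
```

The second call produces a message with a line and column in it, which is exactly the sort of thing that belongs in a log where the author of the generating software will see it.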

Syntax errors are easy to fix if reported, but they can be damned near impossible to find 
with a mushy parser that lets 'em slide through.  All a tolerant parser does is consume 
CPU cycles and encourage sloppy coding - and neither of those does anybody any good.  
(Well, okay, maybe Intel and AMD benefit...but that's about all.)  The way to fix sloppy 
code is not to tolerate it.  Sure, utilities like HTML Tidy that fix existing 
documents are nice - but odds are, without a strong push towards intolerance, Tidy never 
would have attained its current level of prominence.
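For contrast, here is a small sketch of the "mushy" behaviour, using Python's standard html.parser module as a stand-in for a tolerant parser.  The TagLog class and unclosed_tags() function are hypothetical names of mine, and the bookkeeping is deliberately naive; the point is only that the parser itself never raises a complaint.

```python
from html.parser import HTMLParser


class TagLog(HTMLParser):
    """Naive tracker of start tags that never see a matching end tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Forget the tag once something closes it.
        if tag in self.open_tags:
            self.open_tags.remove(tag)


def unclosed_tags(markup):
    parser = TagLog()
    parser.feed(markup)   # no exception, even for broken markup
    parser.close()
    return parser.open_tags


print(unclosed_tags("<p><b>bold</p>"))  # the <b> is never closed
```

The broken markup goes straight through without an error; you only find out about the dangling tag because the sketch went to the trouble of counting for itself.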



 Rev. Robert L. Hood  | http://rev-bob.gotc.com/
  Get Off The Cross!  | http://www.gotc.com/
