Must XML be SGML compatible?

David Megginson ak117 at freenet.carleton.ca
Wed Oct 29 14:47:30 GMT 1997


Jarle Stabell writes:

 > [JS] I don't know the theory well enough to tell how the grammar
 > size influences the speed of the parsers built with (LA)LR[1]
 > tools myself, but I guess the typical XML parser won't be built by
 > such tools at all, because an XML parser probably is quite simple
 > to build "manually", and because one generally gets faster parsers
 > this way.  (Hopefully some of the implementors read this, they
 > could easily falsify my view)

In fact, the opposite is the case -- XML is designed so that it's
possible to use tools like Lex/Bison and JavaCC for rapid
application development, while SGML is not (really).  As far as I
know, fast full-SGML parsers like SP have never used (LA)LR[1] tools
during development.
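
A minimal sketch of what makes XML tool-friendly, assuming a toy subset
of the syntax (start tags, end tags, empty-element tags and character
data only; no attributes, comments, entities or DTD).  The names TOKEN
and check_well_formed are hypothetical, chosen for this illustration,
and the code is not taken from any real parser.  The point is only that
the tokens are regular and the nesting needs nothing beyond a stack:

```python
import re

# Toy tokenizer: one alternative for tags, one for character data.
# Hypothetical and deliberately incomplete (no attributes, comments, etc.).
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)\s*(/?)>|([^<]+)")

def check_well_formed(text):
    """Return True if every start tag is closed by a matching end tag."""
    stack = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            return False  # input falls outside the toy subset
        pos = m.end()
        closing, name, empty, chardata = m.groups()
        if chardata is not None:
            continue  # character data never changes the nesting
        if closing and empty:
            return False  # "</a/>" is not a tag
        if closing:
            # An end tag must match the most recently opened element.
            if not stack or stack.pop() != name:
                return False
        elif not empty:
            stack.append(name)
    return not stack
```

With this sketch, check_well_formed("<a><b/>text</a>") succeeds, while
SGML-style input with omitted end tags such as "<p>one<p>two" is
rejected, because nothing here consults a DTD to infer the missing tags.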

 > I would be *very* surprised if someone were able to write a general
 > SGML parser being as fast as the fastest XML parser (in, for
 > instance, 3 years' time).

I'd be interested to hear the opinions of others on this -- what
features of full SGML make it impossible to build a fast parser (as
opposed to building a parser quickly)?  Certainly, short references
and omitted tags require a little extra computation, but I cannot
imagine the results showing up as a noticeable slow-down.
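
To make the "little extra computation" for omitted tags concrete, here
is a rough sketch under strong simplifying assumptions: the ALLOWED
table is a hypothetical stand-in for a DTD (a real SGML content model is
a regular expression over elements, with OMITTAG declarations, not a
plain set of permitted children), and infer_end_tags is an illustrative
name, not part of any existing parser.  For each token, the extra work
is a lookup against the currently open element:

```python
# Hypothetical, much-simplified "DTD": which children each element may contain.
ALLOWED = {
    "ul": {"li"},
    "li": {"#text", "ul"},
}

def infer_end_tags(tokens):
    """Insert the end tags an SGML-style parser would have to infer.

    tokens: list of ("start", name), ("end", name) or ("text", data).
    Returns the fully tagged token list.
    """
    out, stack = [], []
    for kind, value in tokens:
        if kind == "end":
            # Close any elements whose end tags were omitted.
            while stack and stack[-1] != value:
                out.append(("end", stack.pop()))
            if not stack:
                raise ValueError("unmatched end tag: " + value)
            stack.pop()
            out.append((kind, value))
            continue
        child = value if kind == "start" else "#text"
        # Close open elements until one of them may contain the new child.
        while stack and child not in ALLOWED.get(stack[-1], set()):
            out.append(("end", stack.pop()))
        if kind == "start":
            stack.append(value)
        out.append((kind, value))
    # End of input closes whatever is still open.
    while stack:
        out.append(("end", stack.pop()))
    return out
```

Given the tokens for "<ul><li>one<li>two</ul>", the sketch closes the
first "li" when the second one starts and the second "li" when "</ul>"
arrives, so the per-token cost is a table lookup plus the occasional pop.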

 > You also state "to everyone's annoyance". This is exactly what I
 > mean: is SGML compatibility worth so much that we will instead
 > force upon perhaps millions of users, over the next 10-20 years,
 > syntactic design "flaws" which are well known to us today?

The limitation on look-ahead causes no problem in XML, since XML does
not allow tag omission or pernicious mixed content (the two places
where this annoyance shows up in full SGML).  In full SGML, the
restriction helps to ensure that parsers will be fast and that they
require a somewhat-predictable amount of memory.

 > <<<<
 > 1) Credibility: by tying itself to a well-established international
 >    standard (ISO 8879), XML can win over conservative users in
 >    important areas like financial services and EDI.
 > >>>> 

 > [JS] Yes. But I'm not old/wise enough to understand that doing some
 > minor syntactic "fixes" should scare those away as long as it will
 > be an international standard with the great ideas of SGML intact.

I am not an engineer, either (at least, I don't wear an iron ring made
out of bridge wreckage); you might find it interesting, though, to
read the comp.risks newsgroup for a few weeks to find out how
seemingly little, harmless changes can bring down big systems.

That is not to say that SGML and XML should not try to improve, but
rather, that the changes should be done carefully.  XML has already
anticipated some of the proposed changes for the next revision of
SGML.

 > <<<<
 > 2) Implementation: the XML standard will live and die partly based on
 >    the enthusiasm of early implementors; piggy-backing on SGML gives
 >    it a good, experienced implementor-base right from the start.  
 > >>>>

 > [JS] Yes. But to speak for one possible implementor (myself), I
 > would be much more enthusiastic about it if I believed it was as
 > well-designed as it could be.  I really believe in the "semantic"
 > beauty of SGML, the tree structure/groves, the separation between
 > document type and instance, DSSSL/XSL etc. and also the "general"
 > concrete syntax, but I also think the general user would be better
 > off if XML were *simplified* SGML, not only a well-defined
 > subset/fragment of it.

And this will, of course, continue to be a controversy, as more and
more people come to XML from areas other than SGML.  One of the
difficulties, though, is that it is rarely self-evident what
"well-designed" means -- one person's convenience feature can be
another's implementation nightmare.  The SGML-conformance requirement
has acted as a sort of sanity check on XML feature changes.


All the best,


David

-- 
David Megginson                 ak117 at freenet.carleton.ca
Microstar Software Ltd.         dmeggins at microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



