Simon St.Laurent SimonStL at
Wed Nov 26 13:35:47 GMT 1997

XML is the best opportunity I've yet seen to create a standard which handles 
documents (and other data) intelligently yet simply. Whatever XML's roots 
(which of course are SGML), XML has the opportunity to reach an extremely 
broad audience - an audience the size of the current (and future) HTML 
audience, not just the established SGML community.  

The terms of the XML discussion have always been framed in SGML, and are 
likely to continue to be for a considerable time to come.  While that has 
advantages, I don't think the concept of using XML as a Trojan Horse to 
introduce SGML proper to a larger audience is a good one.  I gave a seminar 
two weeks ago in Washington DC to the ACM - a place and an organization that I 
would tend to think of as friendly to SGML.  Of 50 people in the seminar 
(which was on Dynamic HTML), 15 had worked with SGML. Every time I brought up 
SGML (in connection with XML, CSS, and the DOM), I was greeted with questions 
like "Is that really necessary?" and "Are those SGML people trying to change 
_our_ world?"  These questions didn't just come from the HTML beginners; many 
of them came from the developers who had worked with SGML, some quite 
extensively.  At lunch the discussion quickly turned to XML, and I had to do a 
lot of convincing to get people 'past' SGML.

For public relations reasons, it seems like XML needs to be able to have it 
both ways.  Companies already using SGML and developing SGML tools need to be 
encouraged to accept XML - not as a replacement for SGML, but as something to 
take seriously. The larger non-SGML community, however, needs to be given XML 
as something new and different.  XML should not just carry in SGML's 
reputation as a complicated, slow-to-develop, and difficult-to-implement tool 
of the Federal Government.  XML evangelists need to be able to describe the 
problems that XML fixes and how it fixes them, without reference to enormous 
systems that SGML has created in the past.  

>So XML says it is SGML. Furthermore, the recent correction to SGML (WebSGML), 
>is in its next-to-final draft before release (it has already been voted)
>means that there should be no doubt that the national standards bodies 
>involved with ISO want SGML to be XML-accepting too.  I have attended ISO
>meetings on this, and the ISO people certainly do not see XML as something
>independent of SGML either.

XML says it is SGML.  Fine.  But should the future development of XML be aimed 
at gradually including SGML features, or should it be aimed at meeting the 
needs of the developing XML community?  I expect the XML community in six 
months to a year to be rather distinct from the SGML community and hopefully 
quite a bit larger.  This issue will grow; we'll see what the W3C and ISO do.

>The complexity of unadorned SGML and the generality of its toolkit approach
>is the thing that made it difficult.  The very thing that makes you rich makes
>you poor.

And conversely, the thing that makes you poor will make you rich. HTML took 
off because it was brilliantly simple.  (There were plenty of other factors, 
of course, but simplicity was key.)  SGML has done very well in sectors that 
were able to make the investment in learning SGML, developing in SGML, and 
creating systems around SGML.  XML has the opportunity to take its much 
simpler toolkit to a much larger audience.  Simplicity is key to reaching that 
larger audience; adding SGML features, even with an on/off switch, is likely 
to confuse new users of XML while still disappointing the SGML community.

>But if a company wants
>to use something more powerful at their back-end, why shouldn't they use
>a more powerful language nearer SGML if that serves their inhouse needs 
>better.  And why shouldn't Microsoft allow this in their parser?

If a company wants to use something more powerful, why don't they consider 
'real' SGML and get a parser designed for that instead of creating documents 
that are called XML but are no longer XML?  Using this suggestion effectively 
will require a new series of standards to define what features of SGML have 
been added to a set of documents so that people don't blindly run them through 
XML parsers with the switch set wrong.  Data interchange will be a mess, once 

>XML development has been an exhaustive analysis of every part of mainstream
>SGML.  And I think almost everyone on the SIG would agree that there are
>good reasons for almost all the non-intuitive parts of SGML.  However, the
>need to be straightforward (the #1 goal of XML) means that there is 
>a different cost/benefit trade-off for deciding what should go into the
>base language (compared to SGML in the early 1980s).

There is a completely different cost-benefit analysis.  XML is the grand 
opportunity to extend generalized markup to a far larger audience than exists 
today.  There may be good reasons for almost all the non-intuitive parts of 
SGML, but the fact remains that these non-intuitive features have been 
barriers to use and development.  After reading some of the ISO specs and too 
large a chunk of the SGML literature, it became quite clear to me why SGML 
never percolated down to small companies and developers.  It's too complicated 
to be used without considerable upfront investment.

>The English-using world already runs on SGML.  Computer chips, air
>transport, legal systems, the military, many stock markets, 
>much print media, diagnostics of office equipment, and (with HTML 4.0) 
>WWW.   Any claim that SGML is not good for what it has tried to do 
>are wrong, as far as the market has spoken. 

The market has spoken that SGML does a great job of managing enormous amounts 
of information.  It has also spoken that SGML presents enormous barriers to 
entry (steep learning curve, cost of development, etc.) that have kept a lot 
of people from using it.  SGML does a great job in many systems.  The "many" 
there, however, is a tiny select few compared to the many that a simpler 
syntax (i.e. XML) could reach.  The scale of those projects is very different 
from those XML makes possible.

>And, in any case, the distinction between SGML and XML people is entirely
>spurious.   If you use XML, you are an SGML person.

This distinction will grow as XML is adopted more widely.  Visit the high-end 
web development mailing lists and you'll find an incredible amount of 
hostility to SGML but a simmering interest in XML.  If you use XML, you are 
using SGML tools.  This does not make you an SGML person.  As you may have 
detected, I do have a certain amount of hostility toward SGML and SGML 
culture, while remaining very enthusiastic about XML.

>SGML is not the enemy.  The enemy is poorly described data that is no use,
>and systems that are inappropriately complicated (or simple) for their
>user requirements.  SGML is merely a toolkit for constructing markup 
>languages, which includes a lot of features that are not relevant 
>to delivering structured data over the Web. 

XML appears to be addressing the problems with SGML that have kept it from 
being used by a wider audience.  Poorly described data is the real enemy, of 
course.  Attacking that enemy in a larger sense requires reconsidering and 
refining the weapons we have used previously.  XML's simplicity will 
encourage a large number of people to describe their data properly, people who 
wouldn't have bothered with SGML.  

This is an improvement, and the SGML community deserves great credit for the 
effort they have poured into building a simple but useful toolkit, which 
avoided the byzantine complexity SGML proposals are known for.  XML is more 
than just SGML, however.  XML is going to bring a lot of 'bozos' into the 
field of markup, people who care about neither the history nor the theory and 
just want to get things done. A different attitude and different needs will 
very likely increase the demands for XML to find its own voice.

I could, of course, be dead wrong.  We'll know in a couple of years.

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer (January) / Cookies (February)
