XML complexity, namespaces (was WG)

Paul Prescod paul at prescod.net
Fri Mar 19 03:10:51 GMT 1999


Richard Goerwitz wrote:
> 
> I come from a small shop that does a lot of SGML work.  Trust me:  SGML
> is complex and intractable.  

<RANT>
This is way off topic but I must admit that these characterizations really
annoy me.

I can only speak anecdotally: I started using SGML while working for a
professor of English as an undergrad. A single programmer (not me) wrote a
pretty sophisticated application that converted SGML to HTML and RTF in a
couple of months -- almost exactly the same amount of time it would take
to do the same for XML. The process was almost identical too: you use a
parser from James Clark, pump the data into your favorite scripting
language and output it in the other language. The complexity of the input
syntax was and is irrelevant to solving that problem.
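
For anyone who hasn't seen that pipeline, here is a minimal sketch of its shape in Python. It is not the application described above. It assumes XML input and uses the standard xml.parsers.expat module (a binding to Expat, which James Clark originally wrote) as a stand-in for whichever parser you prefer. The element names and the TAG_MAP table are hypothetical, invented for the example.

```python
import xml.parsers.expat
from html import escape

# Hypothetical mapping from source element names to HTML tags.
TAG_MAP = {"doc": "div", "title": "h1", "para": "p", "emph": "em"}


def convert(source: str) -> str:
    """Turn a small XML document into an HTML fragment from parser events."""
    out = []

    def start(name, attrs):
        # Attributes are ignored in this sketch; unknown elements become <span>.
        out.append("<%s>" % TAG_MAP.get(name, "span"))

    def end(name):
        out.append("</%s>" % TAG_MAP.get(name, "span"))

    def text(data):
        # The parser hands us decoded text, so re-escape it for HTML output.
        out.append(escape(data))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = text
    parser.Parse(source, True)  # True marks this as the final chunk of input
    return "".join(out)


sample = "<doc><title>Hello</title><para>Tom &amp; <emph>Jerry</emph></para></doc>"
print(convert(sample))
```

The point of the sketch is that the handlers only ever see element names and text. Whether the angle brackets came from SGML or XML is the parser's problem, not the application's.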

If we were doing that now it would be much, much easier because we would
use Jade. That shows that technology improves and that hard things become
easier over time, which is pretty much unrelated to the distinction
between SGML and XML.

So anyhow the professor, my friend and I branched out and did some
consulting. So anecdotally I can say that two undergrads and an English
professor can figure out SGML and sell it to some Very Large Companies.

I wasn't expensive by consulting standards but compared to the other
undergraduates I billed out at a pretty high rate (not that I saw most of
that money!).

Was that because I was doing SGML? You bet. Was it because SGML was hard?
No. Almost everything I did then I would do today with XML in roughly the
same way.

I was expensive because SGML was fundamentally uncool and smart computer
science students could not be convinced to look at it. So the industry was
dominated by technical writers, lawyers, humanists and other people who
had the vision of where they wanted to go but usually not the technical
skills to get there.

The companies we worked with would never have looked at us twice if we
were working with SQL or CORBA because those technologies are cool. If we
were working with SQL we would have got "summer job" rates instead of
consultant rates. We weren't doing anything more difficult than everybody
else, but we were getting paid more (at a cost of some pride). Now it
rather annoys me to be uncool again because I made the mistake of
ingeniously recognizing the virtue of (okay, stumbling upon) generic
markup a little too early.

Yes, many things are easier today. Part of that is the progression of
time. Jade is better than Omnimark for converting to RTF, and modern SGML
editors are better than what we had a few years ago.

Another important part is XML. It's all of a sudden cool to do markup
because the average programmer feels like they could make a parser if they
had to, even though the average programmer is generally too smart to waste
time reinventing that wheel. It's cool because it is associated with the
Internet. It's cool because Microsoft likes it.

I know what Simula's inventors must feel like. Sun repackages Simula
twenty years late and it's treated as the second coming of Kernighan. Argh.

> Software that works with it [SGML] is scarce and
> often expensive, and too often doesn't work very well.  

That's often the case with emerging technologies. Software to work with
XML doesn't work so great yet either. The most sophisticated, solid
software I have that works with XML (e.g. Jade, Excosoft Documentor) was
all SGML software first. Do you have some counter examples?

> Just because a
> giant telco firm can muster the personnel to deal with SGML doesn't make
> it a particularly elegant solution, except by way of comparison with
> approaches that use non-standard or presentation-focused languages.

Elegance is pretty subjective but according to my jaded view neither SGML
nor XML is very elegant. The angle bracket syntax alone is annoying. The
strange dichotomy between elements and attributes is also odd. SGML and
XML make it possible to get at the structural heart of documents. That makes
some things very easy. It makes some other things that were previously
impossible hard, but possible. 

The syntactic differences between them have so little to do with the
complexity of making industrial strength applications that I can only
conclude that those who think that SGML implementation is "hard" and XML
implementation is "easy" haven't actually got around to implementing
anything complex yet.

Re: Schemas -- it is 10 years later. We can probably improve on DTDs by
about 50%. We should do so. It doesn't make sense to wait for schemas in
order to implement a new system, and it is also not the case that they
will "revolutionize" the use and practice of XML, but it IS the case that
they will probably give us some nice features that will make our lives a
little easier. Great!
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"A year ago, when Ernest Pecounis said he wanted to bring
Linux into the state agency he works for, there was a swell of
laughter from his colleagues. Guess who's laughing now."
 - http://www.zdnet.com/pcweek/stories/news/0,4153,393443,00.html

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1