XML complexity, namespaces (was WG)

Paul Prescod paul at prescod.net
Fri Mar 19 16:09:11 GMT 1999


David Megginson wrote:
> 
>  > I can only speak anecdotally: I started using SGML while working
>  > for a professor of English as an undergrad. A single programmer
>  > (not me) wrote a pretty sophisticated application that converted
>  > SGML to HTML and RTF in a couple of months -- almost exactly the
>  > same amount of time it would take to do the same for XML.
> 
> Actually, many such applications were often written in a few days or
> even a few hours.  

In defense of my friend, this one was pretty slick: it had a graphical UI
and used C++ for really high throughput. Actually, she was mostly a C++
bigot so that part isn't completely defensible. Ironically, I encouraged
her to learn Perl before I had attempted to do so myself. Imagine my
surprise.

> The interesting thing about SGML is that it was
> heavily used in two separate markets at extreme ends of the scale:
> 
> 1. academia, for large, low-budget projects using free software (like
>    Emacs, NSGMLS, Perl, and Jade) or cheap software (like WP7); and
> 
> 2. government/military/heavy-industry, for large, high-budget projects
>    using extremely expensive commercial software (like ArborText and
>    Omnimark).

True enough. I expect that there will be a certain amount of this with XML
also, however. You need a certain critical mass of problem complexity
before it makes sense to implement a generic markup-based solution to a
document processing problem. Despite what Chris Lilley says, it *still*
takes a text editor to get data into XML and a consultant (or internal
expert) to get it out. Properly structured XML requires transformations to
turn into beautiful print. XSL is easier than what we had three years ago
but it still isn't something your typical office user will learn. But
again, the difference is that XSL is cool so programmers flock to it.
Perl+SGML/Omnimark was not cool so people with the expertise were
expensive.

> In general, the academic projects (and there are hundreds of them)
> accomplished much more using much less (often just a single PC on a
> grad student's desk), but that is partly because they never had to
> become too user friendly -- the researchers would work directly with
> SGML markup, rather than hiding it behind $20K/seat GUI tools.  The
> gov/mil/industry projects spent most of the money trying to hide the
> SGML from view -- the processing itself has never been difficult, SGML
> or XML.

One of the hardest things with XML *or* SGML is making usable user
interfaces. XML doesn't make it any easier. In fact it retains some of the
SGML features that can do the most damage to an intuitive user interface
(consider internal entities in attributes).
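
To make the point about internal entities in attributes concrete, here is a
minimal sketch using Python's standard xml.etree.ElementTree. The document,
the entity name "co", and the element and attribute names are all invented
for illustration. It only shows that a typical parser hands the application
the expanded attribute value, so an editing interface built on top of it
never sees that the text came from an entity reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "memo", "owner" and the entity "co" are
# made-up names, declared in the internal DTD subset.
DOC = """<!DOCTYPE memo [
  <!ENTITY co "Example Corp">
]>
<memo owner="&co; legal">Draft</memo>"""

root = ET.fromstring(DOC)

# The parser has already replaced &co; with its replacement text.
# Nothing in the resulting tree records that an entity was used.
owner = root.get("owner")
print(owner)
```

An editor that loads the value this way and writes it back out would
silently replace the reference with literal text, which is one reason the
feature is awkward to present in a user interface.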

> Almost correct.  One expensive disadvantage of SGML (until WebSGML) is
> that it requires full DTD conformance at every stage of production; as
> a result, if your production chain consists of ten physical steps,
> writing out SGML at each stage, you *must* have DTDs for all of the
> intermediate steps.  

Here's the DTD I would use:

<!ELEMENT INTERMEDIATE_P ANY>
<!ELEMENT INTERMEDIATE_HEAD ANY>
<!ELEMENT INTERMEDIATE_TITLE ANY>
<!ELEMENT INTERMEDIATE_XREF ANY>
...

Actually, I tend to write real DTDs for intermediate steps if I can.
Intermediate steps can add errors too. I agree that sometimes the
cost/benefit ratio isn't there.
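
A permissive DTD of the ANY-content kind shown above can also be generated
rather than typed. The following is a minimal sketch, not a recommended
tool: the function name permissive_dtd and the sample document are
assumptions made for illustration. It collects the element names that
occur in one intermediate file and emits one ANY declaration per name.

```python
import xml.etree.ElementTree as ET


def permissive_dtd(xml_text: str) -> str:
    """Return one '<!ELEMENT name ANY>' line per distinct element name."""
    root = ET.fromstring(xml_text)
    # root.iter() walks the root element and all of its descendants.
    names = sorted({el.tag for el in root.iter()})
    return "\n".join(f"<!ELEMENT {name} ANY>" for name in names)


# Hypothetical intermediate document reusing the element names above.
SAMPLE = (
    "<INTERMEDIATE_HEAD>"
    "<INTERMEDIATE_TITLE>T</INTERMEDIATE_TITLE>"
    "<INTERMEDIATE_P>text <INTERMEDIATE_XREF/></INTERMEDIATE_P>"
    "</INTERMEDIATE_HEAD>"
)

dtd = permissive_dtd(SAMPLE)
print(dtd)
```

The sketch ignores attributes, which a validating parser would also expect
to find declared, and it only sees the element names present in the sample
it is given.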

> This one constraint can add $100K or more to a
> large enterprise SGML project, since DTD writers are expensive to hire
> (and a single, configured DTD becomes heavily obfuscated so that it
> can almost never be maintained in-house).

I'm surprised that you wouldn't allow the programmer who builds the
intermediate transformations to also build the intermediate DTDs. I
consider the DTDs to be part of the specification for what the program
does.
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"A year ago, when Ernest Pecounis said he wanted to bring
Linux into the state agency he works for, there was a swell of
laughter from his colleagues. Guess who's laughing now."
 - http://www.zdnet.com/pcweek/stories/news/0,4153,393443,00.html
