XML complexity, namespaces (was WG)

Marcelo Cantos marcelo at mds.rmit.edu.au
Wed Mar 24 01:17:20 GMT 1999


On Tue, Mar 23, 1999 at 06:53:00AM -0500, David Megginson wrote:
> In his message, in a part that I'm not quoting (I do respond to
> specific details below), Marcelo Cantos argues that it's not for us
> to decide whether both full SGML and XML can co-exist, and I agree
> -- I am simply predicting that the market might not find it
> worthwhile to continue developing two standards that are
> architecturally identical and differ even in the implementation
> details only in nit-picky ways.

Well, I guess we are all entitled to prognosticate.  My own personal
view is that SGML is useful enough (over and above XML) in enough
serious systems that it will not go away in the foreseeable future
(which, admittedly, isn't that long in this industry).

> Choose an arbitrary number for the cost of continuing to develop two
> standards rather than one -- say, US$100M/year (if all of the big
> enterprise vendors have to develop, test, debug, document, support,
> and maintain both full SGML and XML versions of their software, as
> well as donate employees' time to committee work) and unaccountable
> additional hours of free time donated by OSS writers.

I personally doubt that the maintenance of two standards will have any
noticeable impact on implementors.  Our internal libraries are, for
the most part, built and work nicely with either format.  Furthermore,
the major implementation effort involves the commonality, not the
variability between the standards.

As for the perspective of the standards architect, I can't make any
real judgements on how much work is involved there.  I would, however,
speculate that standards are driven by demand more than by economics.

> Do SGML-specific features like SHORTREFs, data attributes, and
> omissible tags sometimes make life simpler for implementors?  Of
> course they do.
> 
> Are the differences worth US$100M/year (or whatever number you
> pick)?  I don't know, and the decision is not ours to make, but the
> market will figure it out soon enough.  Whatever happens, there will
> certainly be money to be made from supporting the existing SGML
> installations, so there will be good justification for
> backwards-compatibility in some major tools.

And we still encounter new clients with new projects that are opting
for SGML because XML doesn't satisfy their needs.  The usual reason is
having to deal with legacy data.  But then one must ask: how soon do
you think legacy data will go away?

I should point out, however, that I am not arguing that SGML will
continue to dominate the market.  I believe that XML will increase
dramatically in use and will ultimately become the dominant player by
a wide margin.  What I disagree with is the notion that SGML has no
future role to play and will not be supported.

> Now, on to the specific points...
> 
> Marcelo Cantos writes:
> 
>  > > XML does nothing that SGML cannot do.
>  > 
>  > When developing the TOC management system for our document
>  > fragmenting toolkit, we chose XML to represent the TOC.  SGML was
>  > not an option, because we didn't know the content model in
>  > advance and couldn't build it automatically from the DTD's of the
>  > individual documents.
>  >
>  > Also, we couldn't use a homogeneous element tree with attributes,
>  > because we actually extracted structured content from the
>  > documents for insertion into the TOC (sure, we could have
>  > serialised the content into an SGML attribute, but that would
>  > have been a perverse and painful alternative to simply using
>  > XML).
> 
> There are work-arounds that you could have used in SGML, such as
> synthesised DTDs using ANY.  Both SGML and XML *can* do this, but in
> your case, XML makes it a little easier (as would WebSGML).  The
> differences are important to us, as SGML/XML implementors, but would
> not really concern the architect of a large system except to the
> point that they affected maintainability.

But would you then seriously suggest that maintenance is not a
significant component of a project's cost?

Of course SGML can do it, but the question boils down to whether it's
worth it.  We, as implementors, consider it far more cost effective to
maintain two standards (the cost is really quite minimal, IMHO) than
to insist on one or the other.
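
To make the TOC case concrete, here is a minimal sketch (not what we
actually built) of reading a well-formed XML TOC whose content model
is not known in advance.  The <toc>/<entry>/<title> vocabulary and the
function names are invented for illustration, and Python's standard
xml.etree.ElementTree module is simply a convenient stand-in for any
non-validating XML parser.  The point is only that no DTD is consulted
at any stage.

```python
import xml.etree.ElementTree as ET

# Hypothetical TOC: each title carries structured content lifted from a
# source document, so the inner element names vary from entry to entry.
TOC = """<toc>
  <entry href="doc1.sgm"><title>Using <cmd>grep</cmd> with <emph>care</emph></title></entry>
  <entry href="doc2.sgm"><title>H<sub>2</sub>O tables</title></entry>
</toc>"""


def title_element_names(toc_text):
    """Return, for each entry, the element names found inside its title."""
    root = ET.fromstring(toc_text)  # checks well-formedness only, no DTD
    return [[child.tag for child in entry.find("title")]
            for entry in root.iter("entry")]


def title_text(toc_text):
    """Return the plain text of each title, ignoring the inner markup."""
    root = ET.fromstring(toc_text)
    return ["".join(entry.find("title").itertext())
            for entry in root.iter("entry")]
```

Under these assumptions, title_element_names(TOC) reports whatever
elements happen to occur in each title, without any of them having
been declared anywhere.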

To say that SGML does everything XML does is to ignore the fact that
implementation details really do matter.  It is like saying that a
spreadsheet can do everything a word processor can.  Of course it can,
but that's not the point.

In any event, since the issue is whether XML will replace SGML, not
vice-versa, the "XML does nothing that SGML cannot do" comment is a
bit of a red herring.  The converse claim, taken up next, is far more
pertinent.

>  > > SGML does nothing that XML cannot do.
>  > 
>  > On several occasions I have had to import textual information,
>  > and have been able to treat the data as SGML with appropriate choice
>  > of shortrefs.
>  > 
>  > With XML I would have been forced to write an intermediate
>  > translation layer and would have consequently lost the originals
>  > (or been forced to store the original and transformed document,
>  > or add the extra layer to every access).
>  >
>  > True, they are not always adequate for the job, but I certainly
>  > would not have happily forgone them in my project because they
>  > wouldn't have been useful in someone else's project!
> 
> Or you could simply have defined a round-trip mapping --
> tab-delimited fields map to <item> elements map back to
> tab-delimited fields.  You could also, with XML or SGML, point into
> the original without altering it (HyTime provides good mechanisms
> for doing that in SGML or XML).

So what you are saying, effectively, is, why not add an extra layer,
and use it on every access?  I guess the simple answer is, I'd rather
not.

You are suggesting complicated solutions to something that was
inherently simple to solve!  Sure, we could have done all those
things, and it would have dramatically increased the workload.  We
would have had to add that extra layer, or bring in additional
technologies.
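
For the sake of argument, the kind of extra layer being proposed
might look something like the sketch below.  It is only an
illustration under assumed names: the <table>/<row> wrapper elements
and both function names are hypothetical (only <item> comes from the
suggestion above), and Python's xml.etree.ElementTree stands in for
whatever XML toolkit one would actually use.

```python
import xml.etree.ElementTree as ET


def tsv_to_xml(text):
    """Map each line to a <row> and each tab-delimited field to an <item>."""
    root = ET.Element("table")
    for line in text.split("\n"):
        row = ET.SubElement(root, "row")
        for field in line.split("\t"):
            ET.SubElement(row, "item").text = field
    return ET.tostring(root, encoding="unicode")


def xml_to_tsv(xml_text):
    """Map <row>/<item> elements back to tab-delimited lines."""
    root = ET.fromstring(xml_text)
    return "\n".join(
        # An empty field comes back from the parser as None, hence `or ""`.
        "\t".join(item.text or "" for item in row.iter("item"))
        for row in root.iter("row")
    )
```

Even this toy version has to be run on every access, and it ignores
the awkward cases (carriage returns, which an XML parser normalises
away, and control characters that XML cannot carry at all).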

Even from an abstract perspective, the solutions you are offering
cannot, by any stretch of the imagination, be considered to fall under
the "SGML does nothing that XML cannot do" premise.  In reality they
involve drawing on additional tools and technologies to make up a very
real shortfall in XML's capabilities.  This merely emphasises the fact
that SGML and XML are _not_ the same thing.

> Again, however, Marcelo is writing about implementation details,
> not about what SGML and XML are capable of representing in the
> abstract.  In this case, SGML makes life a little easier for a
> *very* experienced designer under highly specialised circumstances.

Actually, the tab-delimited stuff was one of the first problems I
encountered when starting to use SGML.  But such an answer would be
something of a diversion.  The real point is that SGML _is_ for
experienced designers under highly specialised circumstances.  If they
aren't working under such circumstances, then by all means use XML
(which is what most of our clients are, in fact, doing)!

> Lexically, SGML and XML differ in minor ways; logically, they are
> essentially identical.

And I reiterate, implementation matters.


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



