XML complexity, namespaces (was WG)

Richard L. Goerwitz richard at goon.stg.brown.edu
Thu Mar 18 14:26:54 GMT 1999


len bullard wrote:

> It isn't the complexity of XML.  Truth is, it isn't that complex.
> It is the fit.

Speaking of complexity, I just updated STG's web-available validator
to cope with namespaces.  I'm not claiming that I got it right on the
first pass.  But the updates should help those of you experimenting
with the Jan 14 spec:

  http://www.stg.brown.edu/service/xmlvalid/

Re namespaces:

After working with them now for a few months, I can't say I'm any more
impressed with namespaces than when I started.  Why?

  --  No matter what anyone says, they screw up validation.  --

    1) because DTDs aren't namespace-aware, and therefore
      a) don't know the difference between a defaulted element and one
         that simply has no namespace
      b) have no scoping mechanism to at least allow you to kludge
         namespace defaulting by restricting elements to one or another
         part of the syntax tree

    2) because namespaces require you to parse attributes and values
       fully before finishing element name processing; this is bad
       because it
      a) makes one-pass parsing more difficult, and requires retention
         of much more information during the parse
      b) makes for unexpected interactions with the DTD (which may
         provide default attributes for a given element, including
         an xmlns declaration, which changes the element's namespace)

    3) because inherited attributes are inimical to the whole DTD
       concept
      a) it was bad enough that we had to put up with xml:lang and
         such (which processing software must pass down the parse
         tree); now the XML standard itself has inherited attributes
         built in with namespaces
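
To make points 1 and 2 concrete, here is a minimal Python sketch of
namespace resolution for a single start tag.  It is an illustration under
stated assumptions, not how any particular parser (or the STG validator)
works: the function resolve_element, its parameters, and the namespace
URI are all made up for the example.

```python
# Illustrative sketch only: resolve_element and its parameters are
# hypothetical names, not part of any real parser API.

def resolve_element(qname, attrs, scope):
    """Resolve a start tag's name to (namespace URI or None, local name).

    scope maps prefix -> URI as inherited from enclosing elements; the
    empty prefix '' stands for the default namespace.
    """
    new_scope = dict(scope)          # declarations are inherited downward
    # Every attribute must be seen before the element name can be resolved.
    for name, value in attrs.items():
        if name == "xmlns":
            new_scope[""] = value
        elif name.startswith("xmlns:"):
            new_scope[name[len("xmlns:"):]] = value
    prefix, sep, local = qname.partition(":")
    if not sep:                      # unprefixed: use the default namespace
        prefix, local = "", qname
    elif prefix not in new_scope:
        raise ValueError("undeclared namespace prefix: " + prefix)
    uri = new_scope.get(prefix) or None   # '' or missing means no namespace
    return (uri, local), new_scope


NS = "http://example.org/ns"         # made-up namespace URI

prefixed, _ = resolve_element("h:p", {"xmlns:h": NS}, {})
defaulted, inner = resolve_element("p", {"xmlns": NS}, {})
bare, _ = resolve_element("p", {}, {})
inherited, _ = resolve_element("p", {}, inner)

print(prefixed == defaulted)   # True: same expanded name, different DTD names
print(defaulted == bare)       # False: yet both are just "p" to a DTD
print(inherited == defaulted)  # True: the declaration came from the parent

# A DTD-supplied default (point 2b): a parser that reads the DTD sees a
# namespaced element, one that skips the DTD does not.
dtd_defaults = {"xmlns": NS}
with_dtd, _ = resolve_element("p", dict(dtd_defaults), {})
print(with_dtd == bare)        # False
```

A DTD, by contrast, matches on the literal name in the tag, so it treats
"h:p" and "p" as unrelated element types and cannot tell the defaulted
"p" from the one with no namespace at all.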

I have no stake in this.  I'm not a W3C member, and we make no significant
use of XML here in my shop.  I'm basically just an interested observer.
And my observation is that namespaces screw up validation.

This is all very bothersome because validation is one of the key points
that separate XML from HTML, and potentially make it better.  With XML,
anyone can define their own HTML, so to speak, or another markup language
they find useful, and then simply publish a DTD with it.  There's none of
the chaos of HTML, which didn't even get a DTD until it was in wide use,
and which (despite the DTDs it now has) typically doesn't validate.  It's
to the point where the only people who can write effective HTML processing
software are outfits with armies of programmers hired to deal with error
recovery and proprietary extensions (both their own and their competitors').

With XML, we can potentially start out on the right foot, and avoid this
nonsense by using validation from the start.  Well-formedness is nice,
but it's not clearly enough defined (and anyway, many non-validating
processors find it necessary to at least grab attribute defaults, if not
also look for parameter entities and conditional sections).  Using it
alone could easily put us back into an HTML-like mess.

So the problem now is how to encourage validation despite the fact that
the W3C has apparently shot DTDs and itself in the foot with namespaces.

The answer, obviously, is to shed any pretense of DTDs being the basic
XML schema mechanism.  We could waffle for years, claiming that both the
DTD and some other mechanism are "standard".  But what's this supposed
to do to the complexity (remember complexity?) of our processing
software?

It's not like it's any harder to construct a schema mechanism that
offers a superset of what a DTD offers, and then provide simple
conversion tools.

Yes, SGML compatibility was an original goal.  But a lot of original
goals seem to have gone out the window.  Another one isn't going to make
any difference now.

The only problem with this scenario is that it will horrify the old SGML
community, which looks to me as if it's trying to kludge architectural
forms onto XML, maybe in efforts to save DTDs.

It's all getting rather bizarre.  Again, I say this as someone who has
gotten with the program, and implemented everything the W3C has put out
(and who works in an SGML shop).  My boss isn't leaning on me to hack
out EDI code, or to whip up an RDF engine.  I'm really just a
disinterested observer.

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard at goon.stg.brown.edu

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1