Call for unifying and clarifying XML 1.0, DOM, XPATH, and XML Infoset

Nils Klarlund klarlund at
Fri Jan 28 17:17:56 GMT 2000

Lars wrote in response to something I must have garbled:

> | In fact, even the XML quintessence, trees, is not a clear sell:
> | recursion and trees are a standard part of a computer science
> | curriculum, but these concepts are not easily swallowed by all.
> I think you have to elaborate on this; at least I have no idea what
> you are referring to here.

In my experience, undergraduate students sometimes struggle quite a
bit just to understand trees and their classical algorithms.  The
challenge is to convince the outside world (to take up the thread by
Megginson) that XML is an utterly practical interface to ideas that
seemed too abstract in school.

The XML gold standard that defines the canonical tree model should
be a technical propaganda piece: I would put a little XML document
(like the one from the Infoset) at the beginning, together with a cute
little drawing of the tree that the document represents (and the tree
represents the document, nothing more, nothing less, please).  Then I
would proceed by explaining that this tree is of the form
T=(V,E,<) etc., as taught in school, where "<" is the document order
on element and text nodes.  The model could then be explained in half a
page or so.  That would be the path of least resistance, I think.  And
it would help establish the legitimacy of XML in the academic
world, since then even computer science professors would be able
to understand it :).
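
To make the T=(V,E,<) picture concrete, here is a minimal sketch in
Python.  The little document, the function name build_tree, and the
choice to represent "<" by list position are all my own assumptions
for illustration; none of this is taken from the Infoset draft.

```python
import xml.etree.ElementTree as ET

# Hypothetical example document (not the one from the Infoset).
DOC = "<doc><title>Hello</title><para>Some <em>mixed</em> text</para></doc>"

def build_tree(xml_text):
    """Return (V, E): V lists element and text nodes in document order,
    E lists edges as (parent_index, child_index) pairs."""
    root = ET.fromstring(xml_text)
    V = []  # position in this list is the document order "<"
    E = []

    def visit(elem, parent):
        me = len(V)
        V.append(("element", elem.tag))
        if parent is not None:
            E.append((parent, me))
        if elem.text:  # text before the first child element
            V.append(("text", elem.text))
            E.append((me, len(V) - 1))
        for child in elem:
            visit(child, me)
            if child.tail:  # text following this child, owned by elem
                V.append(("text", child.tail))
                E.append((me, len(V) - 1))

    visit(root, None)
    return V, E

V, E = build_tree(DOC)
for index, node in enumerate(V):
    print(index, node)
print(E)
```

Node i precedes node j in document order exactly when i < j, and the
edge count is one less than the node count, as it should be for a tree.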

> | I am not qualified to comment on SGML itself, but even XML 1.0 does
> | appear to be suffering from over-conceptualization (too many
> | concepts that don't fit together too precisely).  As a simple
> | example, look at content models:
> | 
> | - a content model is not a model for content in general, but only two
> |   kinds of content, namely elements and character data, not processing
> |   instructions and not comments (incidentally, it could have been
> |   termed "markup model" as well I think, since markup is a more
> |   general concept than content)
> This criticism is undeserved, I think. Content models describe the
> allowed _structural_ content of an element. 

OK, there is another concept called "structural content".  That seems
to be an important one.  We should understand a "content model" to be
an abbreviation for "structural content model".

> Comments are not affected since they are not considered part of the
> document at all, which definitely makes sense. 

No, I don't think so.  Some of the XSLT programs that I have written
also transform the comments, which are certainly a part of the
documents that I write.  The distinctions you are making are valid in
your domain, but I think it is easy to argue that they are confusing
in general.
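
As an illustration that comments can be treated as ordinary nodes of
the document tree, here is a small sketch using Python's standard
ElementTree (the insert_comments option needs Python 3.8 or later).
The sample document and the helper names parse and comments are made
up for this example, and this is of course not XSLT, only the same
idea shown with a different tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical example document containing one comment.
DOC = "<doc><!-- reviewed by editor --><para>Body</para></doc>"

def parse(xml_text, keep_comments):
    """Parse xml_text; comment nodes enter the tree only if keep_comments is True."""
    builder = ET.TreeBuilder(insert_comments=keep_comments)
    parser = ET.XMLParser(target=builder)
    parser.feed(xml_text)
    return parser.close()  # returns the root element

def comments(root):
    """Collect the text of every comment node in the tree."""
    return [node.text for node in root.iter() if node.tag is ET.Comment]

print(comments(parse(DOC, keep_comments=True)))
print(comments(parse(DOC, keep_comments=False)))
```

Whether the comment belongs to "the document" is thus a choice the
tool makes, not something the text of the document settles.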

> As for mixed and element content I don't really see how that qualifies
> as two concepts. It's quite simply a means of saying where text is
> allowed. Admittedly the allowed forms of mixed content models is
> something of a special case, but that has its reasons and doesn't have
> anything to do with over-conceptualization.

Yes, it does, because you are trying to force SGML experiences upon
XML by minting them as more concepts.  I don't know what these
experiences are: neither the annotated XML 1.0 specification nor the
SGML Handbook I just lugged into my office offered me an explanation
that I could understand.  And whatever they are, they are probably
not relevant to XML as a general data exchange language.


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at
Archived as: or CD-ROM/ISBN 981-02-3594-1