What is XML for?

David Megginson david at megginson.com
Fri Jan 29 20:28:33 GMT 1999

Simon St.Laurent writes:

 > >The question that we're discussing is not whether the rich
 > >recursive and hierarchical structures that XML can model are
 > >useful (I know from eight years' experience that they are), but
 > >rather, whether XML itself -- that is, text files conforming to
 > >XML 1.0 -- should be used as the primary storage medium for large
 > >applications.
 > That may be what you're hearing, but I think your assumption that
 > XML is a simple file format - that XML documents must be stored in
 > enormously long serial text files - is limiting your perspective
 > too much.

We're just mixing up terms.  XML documents *are* serial text files --
that's all that they can be.  As soon as you slurp them up into
alternative storage, they're not XML documents any more.  They're just
as important, just as interesting, and just as useful, but they are no
longer in a format defined by the XML 1.0 specification, so literally
speaking, they're not XML.
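
To make the distinction concrete, here is a minimal sketch in Python (a language chosen only for illustration; nothing in this thread depends on it) using the standard xml.etree.ElementTree module. The sample document and the variable names are made up.

```python
import xml.etree.ElementTree as ET

# A hypothetical XML document in the literal sense: serial text.
xml_text = "<phrase><modifier>old</modifier><noun>man</noun></phrase>"

# Parsing slurps the text into an in-memory tree of Element objects.
root = ET.fromstring(xml_text)

# The tree is navigated as a data structure, not as a run of characters.
children = [(child.tag, child.text) for child in root]

# Only serializing the tree produces XML text again.
round_trip = ET.tostring(root, encoding="unicode")
```

Here `root` holds the same information as `xml_text`, but only `xml_text` and `round_trip` are in the format that XML 1.0 defines.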

In general, few high-speed, large-scale applications can afford
repeated passes through serial text files (or even random access
through reverse indices), so using XML (in the literal sense) for
primary storage is impractical; there are, of course, exceptions --
for example, small bits of XML can be stored as blobs in relational
databases.
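
As a rough sketch of the blob exception, assuming a relational database such as SQLite (via Python's standard sqlite3 module) and a hypothetical `notes` table invented for this example:

```python
import sqlite3

# Hypothetical schema: one row per small XML fragment.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body BLOB)")

# The fragment is stored as literal XML text, encoded as bytes.
fragment = b'<note lang="en">Call the <b>printer</b> on Monday.</note>'
conn.execute("INSERT INTO notes (id, body) VALUES (?, ?)", (1, fragment))

# The database finds the row by key; the XML comes back as opaque bytes.
stored = conn.execute("SELECT body FROM notes WHERE id = ?", (1,)).fetchone()[0]
conn.close()
```

The database does the indexing and retrieval; the XML is never scanned to find the row, which is why this arrangement only suits small fragments.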

Paul's point, however, is that what's exciting about XML is what's
exciting about working with recursive, hierarchical data structures in
general.  When people talk about non-textual XML, that's usually what
they're really talking about.

Personally, I have a strong LISP background going back to 1987 and an
SGML background going back to 1991.  I share Simon's excitement very
much -- the world is a fascinating place when we let our data break
out of tables.  Here, for example, is a simplified natural language
parse tree as a LISP list (the kind of thing we were playing with 12
years ago):

(sentence
 (noun-phrase
  (modifier "The")
  (modifier "old")
  (noun "man"))
 (verb-phrase
  (verb "sat")
  (prepositional-phrase
   (preposition "on")
   (noun-phrase
    (modifier "the")
    (noun "bench")))))

The AI movement in the 1970s and 1980s lived and breathed this
stuff.  It's easy to see how XML can provide an excellent language-
and system-independent representation of this structure, but the idea
of modelling information this way does not originate with XML, any
more than it originated with SGML or with LISP.
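
As a hedged illustration of that point, the parse tree above can be serialized as XML with a few lines of Python. This is only a sketch: the nested-tuple encoding, the helper name `to_element`, and the phrase-level labels (sentence, noun-phrase, verb-phrase, prepositional-phrase) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The parse tree as nested tuples: (label, word) for leaves,
# (label, child, child, ...) for phrases. Phrase labels are illustrative.
tree = (
    "sentence",
    ("noun-phrase", ("modifier", "The"), ("modifier", "old"), ("noun", "man")),
    ("verb-phrase",
     ("verb", "sat"),
     ("prepositional-phrase",
      ("preposition", "on"),
      ("noun-phrase", ("modifier", "the"), ("noun", "bench")))),
)

def to_element(node):
    """Recursively turn a (label, ...) tuple into an XML element."""
    label, *rest = node
    element = ET.Element(label)
    if len(rest) == 1 and isinstance(rest[0], str):
        element.text = rest[0]      # leaf: the word itself
    else:
        for child in rest:          # phrase: recurse into each child
            element.append(to_element(child))
    return element

xml_text = ET.tostring(to_element(tree), encoding="unicode")
```

The recursion mirrors the list structure one-for-one, which is the sense in which XML is one more notation for the same kind of tree.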

What Simon is saying, I think, is that he likes working with this kind
of information, and that he would like to refer to the general idea of
recursive, hierarchical data as "XML".  I suppose it's OK -- I still
want to call it "LISP" sometimes, but I'm afraid that people will

All the best,


David Megginson                 david at megginson.com
