What is XML for?

Fri Jan 29 21:35:38 GMT 1999

Tim Bray writes:

 > At 03:27 PM 1/29/99 -0500, David Megginson wrote:
 > >In general, few high-speed, large-scale applications can afford
 > >repeated passes through serial text files (or even random access
 > >through reverse indices), so using XML (in the literal sense) for
 > >primary storage is impractical; there are, of course, exceptions --
 > >for example, small bits of XML can be stored as blobs in relational
 > >databases.
 > 
 > Well, I'm not sure.  Perhaps it's just because my perceptions were
 > formed by working on the 500-MB deeply recursive Oxford English
 > Dictionary text; but I think that a high-performance repository
 > that could accurately mimic the data structures observed in XML
 > would be very useful in many (not all, obviously) applications.  I
 > think I hear both Megginson and Winer expressing doubt on that 
 > front.  I'm surprised.  -Tim

No need for surprise; in fact, I agree with Tim: a high-performance
repository would be a Very Good Thing.  

Either this thread or another, closely-related one started with a
complaint that parsing an XML file over and over again for each
request put too heavy a load on a server; there was then a suggestion
that the XML could be precompiled into memory, or even stored in an
RDBMS, at which last point a participant expressed regret that it was
no longer a pure XML solution.

My point, and (I think) Paul's, is that XML defines an external
representation for hierarchical information, not (except in very small
systems) an internal representation, but that many people are now
trying to use XML internally in inappropriate ways (such as parsing
the same static file several times each second).  My second point is
that the problem of storing, searching, and retrieving hierarchical
information long predates SGML and XML and is not XML-specific (though
it can come in an XML flavour if desired).
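
The parse-once idea can be sketched in a few lines.  This is only an
illustration, assuming Python's standard xml.etree.ElementTree parser;
the function name load_xml and the module-level cache are hypothetical
and are not anything proposed in this thread.  The file remains the
external representation, and the parsed tree in memory is the internal
one, rebuilt only when the file's modification time or size changes.

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical in-memory cache: path -> ((mtime_ns, size), parsed root).
_cache = {}


def load_xml(path):
    """Return the parsed root for `path`, re-parsing only if the file changed."""
    st = os.stat(path)
    stamp = (st.st_mtime_ns, st.st_size)

    entry = _cache.get(path)
    if entry is not None and entry[0] == stamp:
        # File unchanged since the last parse: reuse the tree in memory.
        return entry[1]

    # First request, or the file was rewritten: parse once and remember it.
    root = ET.parse(path).getroot()
    _cache[path] = (stamp, root)
    return root
```

A server that calls load_xml("catalog.xml") on every request (the file
name is made up) pays the parsing cost only after the file is edited,
which is probably enough for the small systems mentioned above; anything
larger is where a real repository comes in.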

A high-performance, XML-aware repository would be a good thing,
because it could round-trip generic XML without significant loss.  So
there.

All the best,

David

p.s. I followed the OED work from Waterloo very closely in the late
     1980s (Frank Tompa might remember me) while I was programming
     the search engine for the 30MB Dictionary of Old English corpus,
     an hour away in Toronto; I even went to the point of trying to
     write some of my own optimised Patricia-tree implementations
     using the OED's algorithm -- nice idea for static repositories,
     but very brittle otherwise.

-- 
David Megginson                 david at megginson.com
           http://www.megginson.com/
