Storing Lots of Fiddly Bits (was Re: What is XML for?)

Clark Evans clark.evans at manhattanproject.com
Thu Feb 4 04:13:25 GMT 1999


Jonathan Borden wrote:
> What are you trying to say here?  
> Are you criticizing objects?

You can't always treat a stream as an object.  If you do, you
lose significant power.

> Suppose I want to process the data using XSL? Is this conceivably an
> acceptable reason to use a DOM interface (assuming I don't actually want to
> convert my database to serialized XML itself).

I would see this as the last thing you would want to do. 
However, I don't have XSL experience, so someone
with real-world experience would be a better spokesperson.

DOM requires the entire stream to be read before the
document object is returned and processing can begin.
Not only does this chew significant memory for very large
streams, but it also delays the first byte of output until
the whole input has been parsed.  In the worst case, it turns a
perfectly simple problem into an "impossible" one
where the memory requirements and time delay make
the solution useless.

If the stream is only going to be "filtered", why read
the entire thing into memory before starting the 
transformation process (in this case filtering)?
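
A rough sketch of such a filter, assuming Python's standard
xml.sax module.  The names DropElementFilter and filter_stream
are made up for illustration.  Events are forwarded to the
output as they arrive, so nothing has to be held in memory
beyond a depth counter:

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator


class DropElementFilter(xml.sax.ContentHandler):
    """Forward SAX events to `out`, skipping one named element and its subtree.

    Hypothetical helper: `out` is any ContentHandler (here an XMLGenerator).
    """

    def __init__(self, out, drop_name):
        super().__init__()
        self.out = out
        self.drop_name = drop_name
        self.skip_depth = 0  # > 0 while we are inside a dropped subtree

    def startElement(self, name, attrs):
        if self.skip_depth or name == self.drop_name:
            self.skip_depth += 1
            return
        self.out.startElement(name, attrs)

    def endElement(self, name):
        if self.skip_depth:
            self.skip_depth -= 1
            return
        self.out.endElement(name)

    def characters(self, content):
        if not self.skip_depth:
            self.out.characters(content)


def filter_stream(source, sink, drop_name):
    """Copy XML from `source` to `sink`, dropping every `drop_name` element."""
    writer = XMLGenerator(sink, encoding="utf-8")
    xml.sax.parse(source, DropElementFilter(writer, drop_name))


if __name__ == "__main__":
    src = io.BytesIO(b"<log><debug>noise</debug><info>keep</info></log>")
    dst = io.StringIO()
    filter_stream(src, dst, "debug")
    print(dst.getvalue())
```

The parser reads the source in chunks, so output for the
first elements can be written before the end of the input
has been seen.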

> Certainly XSL is best served by a DOM representation if 
> the data is presented via a DOM interface. 

I would speculate to the contrary, and would think that
driving XSL with SAX would be a far better choice.

> The other option is to serialize everything.

No.  The option is to move to event-based processing
of streams.  You can then model with "event objects".
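
One possible reading of "event objects", sketched in Python.
This is an assumption about the shape, not a standard API:
the Event class and the events() function are hypothetical
names, and ElementTree's iterparse stands in for a SAX driver
because it hands events to the caller one at a time:

```python
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(frozen=True)
class Event:
    """A single parse event, modelled as a small immutable object."""
    kind: str                   # "start" or "end"
    tag: str
    text: Optional[str] = None  # only filled in on "end" events


def events(source) -> Iterator[Event]:
    """Yield Event objects lazily while the source is being parsed."""
    for kind, elem in ET.iterparse(source, events=("start", "end")):
        if kind == "start":
            yield Event("start", elem.tag)
        else:
            yield Event("end", elem.tag, elem.text)
            # Drop the element's text and children once it has been reported.
            # The emptied element itself stays attached to its parent.
            elem.clear()


if __name__ == "__main__":
    for ev in events(io.BytesIO(b"<a><b>hi</b></a>")):
        print(ev)
```

The consumer sees a sequence of objects and can still use
ordinary object-oriented code on each one, without waiting
for a whole document tree.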

> This makes no sense unless the DOM implementation is sub-optimal.

No.  It's a computational complexity issue.  A DOM holds
the whole document in memory, so its space grows with the
size of the stream, whereas a single-pass transformation
(XML->HTML) can emit output as it reads.  For a decent-sized
stream, no DOM implementation will even come close to an
implementation using SAX.  Crunch some numbers.

If you're still not convinced, read Abelson, Structure and
Interpretation of Computer Programs, ISBN 0-07-000484-6,
Section 3.5.1, page 317.  There he talks about:

	"severe inefficiency with respect to both time and space".

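The book makes the point in Scheme by asking for the second
prime in a large interval.  A loose Python analogue, using a
list for the eager version and a generator for the stream
version, is sketched below.  The function names are mine, not
the book's:

```python
from itertools import islice


def is_prime(n):
    """Trial division; slow but enough for a demonstration."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def second_prime_eager(lo, hi):
    """List-style: build the whole interval, test every number, then index."""
    candidates = list(range(lo, hi + 1))             # entire interval in memory
    primes = [n for n in candidates if is_prime(n)]  # every element is tested
    return primes[1]


def second_prime_lazy(lo, hi):
    """Stream-style: test numbers one at a time and stop at the second prime."""
    primes = (n for n in range(lo, hi + 1) if is_prime(n))
    return next(islice(primes, 1, None))


if __name__ == "__main__":
    # The lazy version stops a few numbers past 10,000; the eager one
    # would test all of the roughly one million candidates first.
    print(second_prime_lazy(10_000, 1_000_000))
```

Both functions return the same answer.  The difference is how
much of the input each one has to touch before it can answer,
which is the same trade-off as DOM versus SAX.
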
Best,

Clark Evans

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



