Storing Lots of Fiddly Bits (was Re: What is XML for?)
keshlam at us.ibm.com
Wed Feb 10 23:31:52 GMT 1999
I think folks are reading much more into the DOM than they should. Can we
step back from religion to programming practice for a moment?
The DOM is an API for random access to structured documents. It is only an
API. It may be wrapped around any back-end storage representations you
consider appropriate. If you have a random-access model of your document,
putting a DOM interface on it gives folks a standard way of accessing it.
Note that the DOM is defined in terms of interfaces rather than classes;
there doesn't have to be a 1:1 mapping between the two, as long as when
folks ask for an Element (for example), it behaves like an Element. The
fact that it also behaves like a Document, and/or a Swing TreeNode, and/or
whatever other behavior your implementation cares to add to it, doesn't
matter to the DOM.
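A minimal Java sketch of the interface-versus-class point. It assumes the JAXP factory (javax.xml.parsers) from the standard Java library as the way to get a parser, which is an illustrative choice and not something the DOM itself specifies; the class name InterfaceNotClass and the sample markup are hypothetical. The calling code names only org.w3c.dom interfaces, and the concrete class behind them is whatever the installed implementation happens to supply.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class InterfaceNotClass {
    // Parse a string with whatever DOM implementation JAXP locates at run time.
    // (Assumption: JAXP is used here only as a convenient, standard entry point.)
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<memo><to>xml-dev</to></memo>");
        Element root = doc.getDocumentElement();

        // What the DOM promises: the object behaves like an Element.
        System.out.println(root.getTagName());

        // What the DOM does not promise: which class that object is, or
        // which other interfaces the same object may also implement.
        System.out.println(root.getClass().getName());
        System.out.println(doc instanceof org.w3c.dom.traversal.DocumentTraversal);
    }
}
```

The last two lines print implementation details that can differ from one DOM to the next; nothing in the calling code should depend on them.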
There are off-the-shelf data models that implement the DOM; my own (which
XML4J is moving toward) is one instance thereof. These are offered as a
convenience, just as the default models behind Sun's MVC-based Swing
widgets are offered as a convenience. If you don't already have a data
model with a DOM API, and one of these suits your needs, you can plug it in
and run. If an off-the-shelf DOM _doesn't_ do what you need, it may allow
you to subclass and extend it. Or you may want to plug in someone else's.
Or your own. Using DOM as a standard API around the model gives you the
freedom to swap that component without changing your other code. (In
practice that theory still snags in a few places where the DOM Level 1 spec
is incomplete, but Level 2 should close the gaps.)
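As a rough illustration of swapping the model without touching other code, the Java sketch below keeps the application logic (countElements) written purely against org.w3c.dom.Node, and confines the choice of implementation to one factory argument. The JAXP DocumentBuilderFactory is assumed as the plug-in point for illustration; the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class PortableDomCode {
    // Application logic: uses only the Node interface, so it runs
    // unchanged on any DOM implementation.
    static int countElements(Node node) {
        int count = (node.getNodeType() == Node.ELEMENT_NODE) ? 1 : 0;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countElements(child);
        }
        return count;
    }

    // The only place that decides which DOM implementation is used.
    // Passing a different factory swaps the model behind the API.
    static Document load(DocumentBuilderFactory factory, String xml) {
        try {
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Document doc = load(factory, "<a><b/><c><d/></c></a>");
        System.out.println(countElements(doc)); // prints 4 (a, b, c, d)
    }
}
```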
If your application doesn't have a random-access model of the document, the
DOM isn't relevant. You _can_ use it, but you can also use SAX or other
solutions. Pick the approach that suits your needs. A good parser should be
able to yield both DOM and SAX, equally smoothly. IBM's XML4J is moving in
that direction, though early versions were very DOM-centric.
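For contrast, a streaming sketch in Java: the same kind of element count done with SAX callbacks, with no document model built at all. It assumes the SAX2-style DefaultHandler and the JAXP SAXParserFactory from the standard Java library; the class name StreamingCount is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class StreamingCount {
    // Count elements as the parser reports them; nothing is kept in
    // memory beyond the running total.
    static int countElements(String xml) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><c><d/></c></a>")); // prints 4
    }
}
```

If all an application needs is a single pass like this, there is no random-access model to wrap, and the DOM buys it nothing.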
As others have said: DOM performance depends on what kind of model the DOM
API is wrapped around. There will not be any single "best" DOM
implementation, since different applications have different needs. Some
DOMs will specialize in performance, perhaps tuned for particular tasks.
Others will specialize in minimal code size (perhaps for fast download in an
applet), or minimal storage use for the document model (for handling large
documents on constrained machines). Still others will be wrapped around
existing models (databases and so on), provided for compatibility with
DOM-based application code even if that isn't the best possible way to
access this particular model. You really can't make any statements about
DOM performance without saying precisely which implementation you're
talking about... and as with other software components, you'll pick the one
that suits the task you want to solve.
The Document Object Model is just a tool -- as is XML, for that matter.
Decisions to use or not use it should be made precisely the same way
decisions to use or not use XML are made. If it fits your problem, using it
gives you a place to plug in other off-the-shelf solutions. If it doesn't,
use something else.
There's nothing wrong with SAX (though it too needs another turn of the
evolutionary crank, in my opinion), but SAX is a stream rather than a
model. The two really aren't in competition with each other any more than
sed is in competition with vi -- they're each good in their own target
domain, and there are even times when using one to generate the other is
the right answer.
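One way to sketch "using one to generate the other" in Java is to replay an existing DOM tree as a stream of SAX events. The example assumes the identity Transformer from javax.xml.transform (with DOMSource and SAXResult) as the bridge; that is one convenient route in the standard Java library, not the only one, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DomToSax {
    // Build a DOM from a string (the model side).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    // Walk the DOM and feed it to a SAX handler (the stream side),
    // recording element names in the order the events arrive.
    static List<String> replayAsEvents(Document doc) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        };
        try {
            // A Transformer created with no stylesheet copies its input
            // to its output unchanged.
            TransformerFactory.newInstance()
                    .newTransformer()
                    .transform(new DOMSource(doc), new SAXResult(handler));
        } catch (Exception e) {
            throw new IllegalStateException("could not replay document", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(replayAsEvents(parse("<a><b/><c/></a>")));
    }
}
```

The reverse direction (a SAX stream building a DOM) is what a DOM-producing parser typically does internally, so existing SAX-based code can consume a model, and a model can be built from a stream, without either API displacing the other.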
Reality is fractal. Absolutes are almost always false.
Joe Kesselman / IBM Research
Unless stated otherwise, all opinions are solely those of the author.