Storing Lots of Fiddly Bits (was Re: What is XML for?)

Paul Prescod paul at prescod.net
Wed Feb 3 10:04:05 GMT 1999


"Borden, Jonathan" wrote:
> 
>         You have missed the point here. If I put a DOM interface onto a SQL Server
> or Oracle or ODI or Poet database, I am hardly using an API to the
> serialization structure. When people say this, they mean that the DOM
> API/interface is used against the native datastore. The utility of this
> would demonstrate itself in a distributed environment where something like
> XQL was used as a query language. If we are in the relational db world,
> ODBC/SQL 92 provides an interface onto disparate databases. Not all
> information is stored on relational dbs. The DOM interface aims to provide
> the same database and vendor neutrality and interoperability that ODBC or
> JDBC provides for tabular data. If I am using a DOM interface, it frankly
> doesn't matter what the serialization format is, I am interacting directly
> with data through an interface.

You are interacting with data through an interface that was designed to
provide access to the abstract data model of a *serialization*. In other
words you are treating your database as if it were the result of parsing
an XML document. You've put an "elements and attributes" interface on data
that is much more complex than elements and attributes. If it were not
more complex than elements and attributes we would not need stuff like
XLink, HyTime, Namespaces and RDF to even *attempt* (and fail) to
represent it. The XML data model, whether a grove or a DOM, is the "Forrest
Gump" of representations for your data. Sending a dumbed-down message by
Forrest Gump is good: he will relate it faithfully. Installing him as the
only conduit for information is bad. You'll have to dumb down too much
information and spend too much energy re-assembling it on the other side.
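
To make the "dumbing down" concrete, here is a minimal sketch using
Python's standard xml.dom.minidom. The employee element and its hired
attribute are hypothetical examples, not taken from any real schema. It
only illustrates that a typed value has to be flattened to a string on the
way in and re-assembled by the consumer on the way out.

```python
from datetime import date
from xml.dom.minidom import Document

doc = Document()
emp = doc.createElement("employee")   # hypothetical element name
doc.appendChild(emp)

hired = date(1998, 5, 1)

# DOM attribute values are strings, so the typed value is flattened here.
emp.setAttribute("hired", hired.isoformat())

# What comes back out is just text; the type information is gone.
raw = emp.getAttribute("hired")
print(type(raw).__name__, raw)

# The consumer must know, out of band, that this string is an ISO date
# and rebuild the original object itself.
restored = date.fromisoformat(raw)
print(restored == hired)
```

The conversion in each direction is trivial for one date. The cost shows up
when every property of every object needs its own agreed string encoding.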

>         I wouldn't suggest that the DOM replace ODBC, yet I'm quite sure that those
> experienced using a variety of systems with disparate data types and data
> usages will appreciate that certain types of data are best expressed in tree
> format. Such data scenarios might best be interfaced with via the DOM.

You just need an API for "tree formats". Just ask your DBMS vendor to
provide some tree-structured API. It doesn't matter if that API is the DOM
because making it the DOM does *not buy you anything* as a programmer.
From a programming point of view there is no benefit to working with a
consistent API where everything is dumbed-down to a textual model. You
might as well dumb everything down to an "object model." (see below)

If you buy this, then guess what the hype will be in three years: "These
new fangled data bases have this really cool feature, dude. You call it
with a SQL9X query and it can return like OBJECTS! Everything in the
world can be expressed as objects! Lists of objects. Lists of objects.
Trees of objects. Directed graphs of objects. Arbitrary graphs of objects.
It like unifies everything as objects. It's Zen, man. They call it 'JDBC'
and it's totally wicked."

The *only benefit* of unifying things as DOMs is reusing software that was
originally supposed to work with XML (i.e. XSL implementations). If you
are writing new software it makes NO SENSE to do it through a DOM
interface unless your data source is *XML*. 

Otherwise, you should just define a "tree node" interface and have your
various objects implement it. You will get all of the benefits of the
DOM with none of the costs (i.e. how the hell do you represent complex
properties of objects???). If you want some good hints about what a "tree
node" interface looks like, take a look at the grove abstraction.
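
As a rough sketch of what such a "tree node" interface might look like
(in Python, purely for illustration): the names TreeNode, children,
properties and walk, and the Employee and Department classes, are all
hypothetical. They are not taken from the grove specification or from any
DBMS vendor's API. The point is only that properties keep their native
types while generic tree code still works.

```python
from abc import ABC, abstractmethod
from datetime import date
from typing import Any, Iterator, Mapping, Sequence, Tuple


class TreeNode(ABC):
    """Hypothetical minimal tree interface: children plus typed properties."""

    @abstractmethod
    def children(self) -> Sequence["TreeNode"]:
        ...

    @abstractmethod
    def properties(self) -> Mapping[str, Any]:
        ...


class Employee(TreeNode):
    def __init__(self, name: str, hired: date) -> None:
        self.name = name
        self.hired = hired

    def children(self) -> Sequence[TreeNode]:
        return ()  # an employee is a leaf in this sketch

    def properties(self) -> Mapping[str, Any]:
        # The date stays a date; nothing is flattened to text.
        return {"name": self.name, "hired": self.hired}


class Department(TreeNode):
    def __init__(self, title: str, staff: Sequence[Employee]) -> None:
        self.title = title
        self.staff = list(staff)

    def children(self) -> Sequence[TreeNode]:
        return self.staff

    def properties(self) -> Mapping[str, Any]:
        return {"title": self.title}


def walk(node: TreeNode, depth: int = 0) -> Iterator[Tuple[int, TreeNode]]:
    """Generic depth-first traversal that works for any TreeNode."""
    yield depth, node
    for child in node.children():
        yield from walk(child, depth + 1)


dept = Department("R&D", [Employee("Ada", date(1998, 5, 1))])
for depth, node in walk(dept):
    print("  " * depth + type(node).__name__, dict(node.properties()))
```

Any code written against walk() is reusable across every object that
implements the interface, which is the reuse benefit usually claimed for
the DOM, without forcing properties into strings.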

>         XSL transforms can be applied directly to DOM representations, rather than
> serialized XML documents. This yields the possibility that serial transforms
> be applied within 'DOM space' (assuming the XSL transform output is a DOM
> structure rather than a serialized string). The act, thus, of web page
> generation from a database can be automated via XSL rather than, say ASP or
> perl scripts. Is this useful? Sometimes it is.

First, I don't believe that publishing databases to the Web is considered
a hard problem. Report writers, CGIs and application servers have been
doing it for a long time. So if all you are claiming is that the DOM
provides us with a small ease-of-use gain in solving an already solved
problem, then I can buy that. But I hear much grander claims for these DOM
interfaces. Nobody is saying: "solve simple problems slightly more
elegantly." They are rather saying: "unify your enterprise."

Second, even *XSL* is not best served by a DOM representation. James Clark
wrote an xsl-list article about that but I can't find it now. Remember
that the DOM was invented as an extension of "DHTML." It's only half
"there." 

But if I grant that some well-thought-through API for XSL trees could
exist (i.e. Jade's grove API) then I would propose that it only be used as
an optimization in a system where it would otherwise make sense to pass
around serializations of text documents. i.e. the DOM is okay for skipping
a layer of message passing. It is not okay as a "universal API" for "all
of the data in an organization."

To bastardize JWZ: "Sometimes people have a hard data unification problem.
One part of their organization speaks a very different language (at the
data model and object model level) than another part. They might think 'I
can unify these with XML or the DOM.' Now they have two problems." 

Their second problem is that they didn't understand the really hard
problem in their organization. Data model unification is *easy* (cast to
java.lang.Object or org.w3c.dom.Node). Data model *rationalization* is very
difficult. And I don't think that there are many shortcuts.
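
A small sketch of the difference, in Python for illustration. The two
record types, their field names and the amount_in_cents helper are all
hypothetical, invented here to stand in for two parts of an organization
that model the same thing differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SalesOrder:          # hypothetical: the sales department's model
    customer: str
    total_dollars: float


@dataclass
class BillingInvoice:      # hypothetical: the billing department's model
    account_id: int
    amount_cents: int


# "Unification": everything is an object, so one container holds them all.
unified: List[object] = [SalesOrder("Acme", 12.50), BillingInvoice(42, 1250)]


# "Rationalization": someone still has to decide what the fields mean and
# how they relate, and write that mapping down for each type.
def amount_in_cents(record: object) -> int:
    if isinstance(record, SalesOrder):
        return round(record.total_dollars * 100)
    if isinstance(record, BillingInvoice):
        return record.amount_cents
    raise TypeError(f"no agreed meaning for {type(record).__name__}")


print([amount_in_cents(r) for r in unified])
```

Putting both records in one list took one line. Everything inside
amount_in_cents is the part that no common base type or common node
interface does for you.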

>         Are the DOM interfaces the best for all situations, clearly not. However if
> a significant percentage of people can agree to use them a significant
> percentage of the time, this is a big win.

That's not going to happen. The DOM will NOT be a core tool for the
majority of OO programmers this year, next year, or ever. Programmers will
try it and increasingly find that, unless they are doing XSL styling for
the Web or print, the DOM is not a core tool. "Old-fashioned" OO can
provide the same benefits.

 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did,
but she did it backwards and in high heels."
                                               --Faith Whittlesey

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



