Groves, the next big thing (Re: ANN: XML and Databases article)

Michael Champion mike.champion at
Thu Sep 9 17:19:30 BST 1999

>Groves are going to turn out to be like Linux, which began with a very
>few people who had a vision that turned out to work.

I'll start by making it clear that I've been on the W3C DOM working group
for more than two years, so you know where my biases lie ... Nevertheless, I
speak as someone who has spent a fair amount of time wrestling with defining
abstract data models and APIs for XML documents and considering what the
grove paradigm offers.

There are three basic reasons why the DOM is not more groves-like. First, as
someone pointed out earlier in the "XML and Databases" thread, not enough
people outside the hard-core SGML/XML community understand the groves
paradigm, so there was no general familiarity that we felt we should
leverage.  Second, the available documentation for groves (at least a
couple of years ago) was mainly the DSSSL spec, which is very difficult for
non-specialists to make sense of.  Third, there was a widespread
perception that the groves model implies, in DOM terms, that "every
character is a node", and people concerned about implementing the DOM API
felt strongly that this would lead to unacceptable footprint and run-time

Groves may or may not become the next Linux; if this is going to happen, two
obstacles must be overcome.

Most importantly, someone is going to have to write a *clear* statement of
the paradigm, its power, why it's "the next big thing", etc.  Sort of a "Groves
for Dummies", or the "Grove Manifesto".  I can't stress enough the
importance of writing this for a fairly general audience. I recall
conversations a couple of years ago with very smart technical marketing
people at large companies who are now big players in the XML world; the
level of frustration they expressed at trying to make sense of things
like the DSSSL spec was quite memorable!  I have not read the recent
attempts by groves advocates to offer tutorials, etc., so forgive me if
this has already been done. I frankly doubt it, because if a clear and
compelling case for the groves paradigm has been made, it hasn't come to the
world's attention.

Also, even if the "grove paradigm" is a fundamentally more powerful way of
looking at XML and other types of data than what is in wide use today, it's
unlikely to be adopted unless there is a clean migration path from familiar
APIs like ODBC/SQL, the W3C DOM, the (forthcoming?) JCP XML data binding
spec, etc.  One of the most eye-opening aspects of my experience on the DOM
WG has been to understand that most users of Web scripting languages, Visual
Basic, etc. know very little about computer science.  I began my DOM
"career" assuming that everyone who would be using such APIs understood tree
and graph data structures and would understand how nicely they represented
the types of things we were talking about.  I was quickly set straight by my
colleagues from companies with larger customer bases:  ARRAYS are about as
sophisticated a data structure as the typical Web scripter or VB programmer
can handle.  [I *know* that someone reading this wants to flame me, but rest
assured that I don't like this notion any more than you do, and just about
every conceivable counter-argument has been raised, and very reluctantly
dismissed, by the DOM WG already.] So, I would *love* to see someone define
a grove API that extends the DOM, and/or to see the grove paradigm cleanly
incorporated into the Java Community Process XML data binding, and/or to see
a repository-friendly API that builds from ADO or JDO and incorporates
groves concepts.  But don't expect the typical consumer of XML APIs to be
impressed by a groves API that offers a "new paradigm" that assumes that the
reader understands graph theory and data structures and builds up from there
with little reference to existing efforts.

Mike Champion

