ANN: XML and Databases article

Thu Sep 9 18:40:07 BST 1999

Ken North <ken_north at compuserve.com> wrote at 9 Sep 99, 9:08:

> Finally, the days when SQL DBMSs stored only columns of number and
> characters are gone. Some still do, but that is not a defining
> characteristic. Most of the major SQL vendors moved to a universal server
> model that supports rich types as well as traditional tabular data. For
> storing an XML document, you have the option of decomposing it or storing it
> as a whole.

Right, but the price is implementing stored procedures. I don't know
how good the Oracle 8i interfaces are, but with Informix it's no fun.
We did it, and abandoned it. The other problem is that you can store
XML, but you have to explicitly model the relations (parent, sibling,
child, ... and all the other axes you need) with foreign keys. The
number of "metatables" with structural information quickly outnumbers
the useful data. Again, we tried it, benchmarked it and cancelled it.
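
To make the foreign-key point concrete, here is a minimal sketch of
such a decomposition. It is not the Informix schema we tried; the
table name `node`, its columns, and the helpers `shred` and `children`
are hypothetical, and SQLite merely stands in for a real RDBMS. Note
that two of the five columns exist only to carry structure.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: one row per element, structure kept in foreign keys.
SCHEMA = """
CREATE TABLE node (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES node(id),  -- parent/child axis
    prev_id   INTEGER REFERENCES node(id),  -- preceding-sibling axis
    tag       TEXT NOT NULL,
    text      TEXT
);
"""

def shred(xml_text, conn):
    """Decompose an XML document into rows, numbering nodes in document order."""
    conn.executescript(SCHEMA)
    next_id = 0

    def insert(elem, parent_id, prev_id):
        nonlocal next_id
        next_id += 1
        node_id = next_id
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
            (node_id, parent_id, prev_id, elem.tag, (elem.text or "").strip()),
        )
        prev = None
        for child in elem:
            # Each child records its parent and its preceding sibling.
            prev = insert(child, node_id, prev)
        return node_id

    insert(ET.fromstring(xml_text), None, None)
    conn.commit()

def children(conn, tag):
    """Child axis as a self-join: tags of all children of elements named `tag`."""
    rows = conn.execute(
        "SELECT c.tag FROM node c JOIN node p ON c.parent_id = p.id "
        "WHERE p.tag = ? ORDER BY c.id",
        (tag,),
    )
    return [r[0] for r in rows]
```

Every further axis (ancestor, following, ...) needs either another
column, another table, or a chain of self-joins like the one in
`children`, which is where the overhead comes from.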

Is there a paper, page or software you are thinking of that can store
XML along with its structural information in an RDBMS, be it classic
or OO-relational?

Currently we are investigating our PDOM, which just maps between a 
W3C-DOM and persistent streams of serialized Java objects (paper will 
be at OOPSLA99). Instead of modeling structure we try to scale by 
means of intelligent, specialized cache strategies. Results are 
promising.
Guido Moerkotte and C-C Kanne from the University of Mannheim are 
working on intelligent clustering strategies, which try to keep trees 
in clusters suited for typical XML access patterns. They also have 
means to cluster better when given semantic user constraints on 
granularity.
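
As a rough illustration of the "serialized objects plus cache" idea
mentioned for the PDOM above, here is a minimal sketch. It is not the
PDOM's actual design: Python's `pickle` stands in for Java
serialization, an in-memory `BytesIO` stands in for the persistent
stream, plain LRU stands in for the specialized cache strategies, and
the class names `NodeStore` and `CachedStore` are hypothetical.

```python
import io
import pickle
from collections import OrderedDict

class NodeStore:
    """Append-only stream of serialized nodes plus an offset index."""

    def __init__(self):
        self.stream = io.BytesIO()  # stands in for a file on disk
        self.offsets = {}           # node id -> byte offset in the stream
        self.reads = 0              # counts deserializations, for illustration

    def put(self, node_id, node):
        self.stream.seek(0, io.SEEK_END)
        self.offsets[node_id] = self.stream.tell()
        pickle.dump(node, self.stream)

    def get(self, node_id):
        self.reads += 1
        self.stream.seek(self.offsets[node_id])
        return pickle.load(self.stream)  # reads exactly one pickled object

class CachedStore:
    """LRU cache in front of a NodeStore; only misses touch the stream."""

    def __init__(self, store, capacity):
        self.store = store
        self.capacity = capacity
        self.cache = OrderedDict()

    def get(self, node_id):
        if node_id in self.cache:
            self.cache.move_to_end(node_id)  # mark as most recently used
            return self.cache[node_id]
        node = self.store.get(node_id)
        self.cache[node_id] = node
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict the least recently used
        return node
```

A cache tuned for XML would presumably replace the LRU policy with one
that knows about tree traversal order; the sketch only shows where
such a policy would plug in.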

While we currently work with the DOM, the results could easily be
applied to Groves. However, giving up the DOM would mean losing lots
of handy middleware. Our XQL processor has to work on a well-defined
data structure, and the DOM is sufficient for now. Paul Prescod, in
his Grove tutorial, precisely points out the DOM's weaknesses, but
there is nothing to replace it right now.

We discussed using the Infoset as a data model, but it's hard to
write an optimizing XQL processor against a model where half the
features are either optional or not present (schema stuff).
Especially the fact that you can't tell how many children a node will
have (e.g. when some are optional, or only present when validating) is
not what I call a well-defined model for queries.
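
One way to see the child-count problem, as a small sketch: whitespace
between child elements. Without a DTD a parser cannot know whether
that whitespace is ignorable, so it reports it as text nodes, while a
validating parser can flag it. Whitespace is only one illustration
chosen here; the `drop_whitespace` flag merely simulates the two views
and the helper `child_count` is hypothetical.

```python
from xml.dom import minidom

# Two <item/> children, pretty-printed with whitespace between them.
DOC = "<order>\n  <item/>\n  <item/>\n</order>"

def child_count(xml_text, drop_whitespace):
    """Count children of the root, optionally skipping whitespace-only text."""
    root = minidom.parseString(xml_text).documentElement
    kids = [
        n for n in root.childNodes
        if not (drop_whitespace
                and n.nodeType == n.TEXT_NODE
                and not n.data.strip())
    ]
    return len(kids)
```

Here `child_count(DOC, False)` gives 5 and `child_count(DOC, True)`
gives 2 for the same document, so a query like "third child of order"
has no stable answer unless the model pins this down.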

	++im

--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius at gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)