XML Database

Jean-Claude Wippler jcw at equi4.com
Fri Sep 25 18:01:35 BST 1998


> We sometimes encounter a client who believes that making
> XML (SGML) documents means making a database. What they believe is that
> if they make their documents in XML, they will easily be able to find
> information in them and to manage them.
> We are having a hard time making them understand that if they need a
> database, they have to develop a database system.
> Then they lose interest in XML (SGML). What they want is an easy and
> versatile database system.

As others have said, XML is an interchange format - i.e. the stuff that
matters when you either transport information, or when you wish to file
it without knowing/caring how it is going to be used in the future.

It's also a "markup language", of course.

> Do you have any good ideas for showing the advantages of XML as a
> database format to this kind of person?

Raw speed.

> Is there any experience or information regarding XML (SGML) databases?

Very, very tentatively, I'd say: yes, I've been doing some work in this
area to explore the field.  I'm the author of a cross-platform storage
manager for structured data, called "MetaKit".  Right now, there are
interfaces for C++, Tcl, and Python (the library is written in C++).

I wrote a small utility called "mk4xml", which reads any XML document
into a flattened tree-structure (using "expat") and saves that as a
MetaKit datafile.  As an experiment, I wrote a small Tcl script called
"ot_conv.tcl" which takes such a MetaKit datafile, generated by mk4xml
from the "ot.xml" (Old Testament) document, and converts it into a
nested datastructure that matches this specific document's DTD.
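
To give a rough idea of what "flattening" means here, below is a minimal
Python sketch.  It is not the mk4xml source: it uses Python's standard
"xml.parsers.expat" binding, keeps the rows in a plain list instead of a
MetaKit datafile, and ignores attributes.  The function name flatten_xml,
the row layout (id, parent, tag, text) and the tag names in SAMPLE are
all assumptions made for illustration.

```python
import xml.parsers.expat


def flatten_xml(xml_text):
    """Parse xml_text and return a flat list of element rows.

    Each row is a dict with: id (its position in the list), parent (id of
    the enclosing element, or -1 for the root), tag, and text.
    """
    rows = []
    stack = []  # ids of the elements that are currently open

    def start(tag, attrs):
        # Attributes are ignored in this sketch.
        parent = stack[-1] if stack else -1
        rows.append({"id": len(rows), "parent": parent, "tag": tag, "text": ""})
        stack.append(len(rows) - 1)

    def end(tag):
        # Element is complete: trim surrounding whitespace from its text.
        row = rows[stack.pop()]
        row["text"] = row["text"].strip()

    def chars(data):
        # Character data belongs to the innermost open element.
        if stack:
            rows[stack[-1]]["text"] += data

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return rows


# Hypothetical miniature document; the real ot.xml uses its own DTD.
SAMPLE = (
    "<ot><book><name>Genesis</name>"
    "<chapter><v>In the beginning</v></chapter></book></ot>"
)

for row in flatten_xml(SAMPLE):
    print(row["id"], row["parent"], row["tag"], repr(row["text"]))
```

Every element becomes one row that points at its parent, so the tree
can be stored in an ordinary table and rebuilt later.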

You can see some of this in motion at:
but as you'll see this is proof-of-concept stuff at this point...
there's not even a decent index page there yet.  There's a "summary.tcl"
script which collects some stats on files generated by "mk4xml".

The results are interesting:

    File sizes:
	ot.xml			3.9 MB
	mk4xml ot.xml result:	4.1 MB
	ot_conv.tcl result:	3.1 MB

    Access speeds:
	ot.xml		(take your pick, parse in seconds up to minutes)
	mk4xml ot.xml result:	opens in 1.4 sec  (on a fast PII/400)
	ot_conv.tcl result:	opens in 60 mSec  (on same system)

    Access method:
	ot.xml			SAX/DOM, usually linear scan
	mk4xml ot.xml result:	random access by element/subelement/...
	ot_conv.tcl result:	random access by book, chapter, verse
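
The difference in access method is perhaps easiest to see in a small
sketch.  The Python below is not ot_conv.tcl and does not use MetaKit;
it only illustrates the idea of turning generic flat rows into a
structure shaped like the document, so that a verse can be reached
directly by book, chapter and verse.  The row layout, the tag names
("book", "name", "chapter", "v"), the function name nest and the
placeholder verse texts are all assumptions.

```python
# Flat rows as a generic flattening pass might produce them:
# (id, parent id, tag, text).  Contents are placeholders.
FLAT = [
    (0, -1, "ot", ""),
    (1, 0, "book", ""),
    (2, 1, "name", "Genesis"),
    (3, 1, "chapter", ""),
    (4, 3, "v", "text of 1:1"),
    (5, 3, "v", "text of 1:2"),
    (6, 1, "chapter", ""),
    (7, 6, "v", "text of 2:1"),
]


def nest(rows):
    """Build {book name: {chapter no: {verse no: text}}} from flat rows."""
    # Group rows by parent id, keeping document order.
    children = {}
    for rid, parent, tag, text in rows:
        children.setdefault(parent, []).append((rid, tag, text))

    def kids(rid, tag):
        # Children of element rid that carry the given tag.
        return [c for c in children.get(rid, []) if c[1] == tag]

    books = {}
    root_id = children[-1][0][0]
    for book_id, _, _ in kids(root_id, "book"):
        name = kids(book_id, "name")[0][2]
        chapters = {}
        # Chapters and verses are numbered by position, starting at 1.
        for cno, (chap_id, _, _) in enumerate(kids(book_id, "chapter"), 1):
            chapters[cno] = {
                vno: text
                for vno, (_, _, text) in enumerate(kids(chap_id, "v"), 1)
            }
        books[name] = chapters
    return books


bible = nest(FLAT)
print(bible["Genesis"][2][1])  # direct lookup, no scan over all rows
```

The conversion cost is paid once; after that a lookup touches only the
book, chapter and verse that were asked for, rather than walking the
whole document.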

> Are there any good XML databases available on a commercial basis?

That is such a general question that I do not feel qualified to answer it.

> If there is such a database, how much will it cost?

My first reaction would be: could be anything.  The current market for
XML software runs from free to 5-digit dollar amounts, and commercial
businesses being what they are, the rule in a new field like XML is very
likely to be "anything you can get away with"... oops, that's not a nice
thing to say, let me rephrase that as "what the market will bear".

MetaKit is free for non-commercial use, with binaries available for
Unix, Windows, Mac, and VMS.  Royalty-free source code licenses are
available for commercial use, at a price level which seems to cause some
people not to take it seriously... see the website for details.  Be your
own judge.

Feel free to contact me for further details.  At this stage, suggestions
and comments are most welcome - as I said, it's just a sneak preview...


Jean-Claude Wippler    MetaKit home page - http://www.equi4.com/metakit/
Equi4 Software         "Portable database software for a changing world"

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
