Is XML dead already or what? Was: RE: What is XML for?

Sat Jan 30 22:34:19 GMT 1999

At 04:34 PM 1/29/99 -0800, Ed Howland wrote:

>I am aware of others who have implemented XML based e-commerce solutions. I
>only know of one actual methodology that seems to work for one of them. They
>store the data in XML but use an RDBMS to index it to some level. Then they
>parse the final file (or files) and merge them into a resultant XML stream.
>This is then formatted with CSS (XSL presumably not yet ready for
>prime-time) for presentation. Until I can find a better solution, this is
>the short term fix I'm going for.

What's wrong with this methodology? It uses each technology for what it's
best at: XML for standards-based data representation, RDBMSs for high-speed
indexing and querying, CSS (or some other style mechanism) for dynamically
presenting the result of a query or other process.
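
To make the division of labor concrete, here is a minimal sketch of the
approach Ed describes, assuming Python with its standard sqlite3 and
xml.etree.ElementTree modules. The product records, the table name
product_index, and the functions build_index and query_to_xml are all
hypothetical, invented for illustration; they are not taken from any of the
systems mentioned above.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical records: the XML fragments remain the authoritative data.
DOCUMENTS = {
    1: '<product id="1"><name>Widget</name><price>9.95</price></product>',
    2: '<product id="2"><name>Gadget</name><price>24.50</price></product>',
    3: '<product id="3"><name>Gizmo</name><price>4.25</price></product>',
}

def build_index(documents):
    """Copy a few selected fields of each fragment into a relational index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product_index "
        "(doc_id INTEGER PRIMARY KEY, name TEXT, price REAL)"
    )
    conn.execute("CREATE INDEX idx_price ON product_index (price)")
    for doc_id, xml_text in documents.items():
        elem = ET.fromstring(xml_text)
        conn.execute(
            "INSERT INTO product_index VALUES (?, ?, ?)",
            (doc_id, elem.findtext("name"), float(elem.findtext("price"))),
        )
    return conn

def query_to_xml(conn, documents, max_price):
    """Select documents through the index, then merge them into one XML result."""
    rows = conn.execute(
        "SELECT doc_id FROM product_index WHERE price <= ? ORDER BY price",
        (max_price,),
    ).fetchall()
    result = ET.Element("results")
    for (doc_id,) in rows:
        # The index only locates documents; content comes from the XML itself.
        result.append(ET.fromstring(documents[doc_id]))
    return ET.tostring(result, encoding="unicode")

conn = build_index(DOCUMENTS)
merged = query_to_xml(conn, DOCUMENTS, 10.0)
print(merged)
```

The merged stream is ordinary XML, so a stylesheet (CSS or otherwise) can be
applied to it without knowing that a database was involved in producing it.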

Until we have a new form of data storage technology that combines the
structure optimization of XML and object-oriented databases with the
indexing and retrieval speed of relational databases, I don't see any other
obvious way to solve the problem of large-scale, high-volume information
retrieval. And don't forget the need for full-text indexing, which is yet
another technology dimension.

I've consistently said, for example, that the best way to manage
HyTime-style hyperlinks at scale is using relational databases.  HyTime
provides an implementation-independent *data model* that, if implemented,
ensures a rich and flexible set of linking and addressing information. But
because the data model is fundamentally a set of lookup tables, it seems
obvious to me that relational databases are, at least today, the best
implementation technology. Nothing in the abstract model requires that it
be implemented literally (that is, there's no requirement in the standard
that you actually hold the hyperlinking information in memory as a grove,
only that you be able to provide access to it as though it were a grove).
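
As a rough illustration of what "a set of lookup tables" can look like, the
sketch below stores links and their anchors in two relational tables, again
assuming Python's sqlite3 module. The schema (link, anchor), the role names,
the sample documents, and the function links_touching are hypothetical and
greatly simplified; this is not HyTime syntax and does not attempt to cover
the HyTime model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE link (link_id INTEGER PRIMARY KEY, link_type TEXT);
    CREATE TABLE anchor (
        link_id    INTEGER REFERENCES link(link_id),
        role       TEXT,   -- which end of the link this anchor plays
        doc        TEXT,   -- document holding the anchored element
        element_id TEXT    -- ID of the anchored element
    );
    -- The index is what makes "who points at this element?" fast.
    CREATE INDEX idx_anchor_location ON anchor (doc, element_id);
""")

conn.executemany("INSERT INTO link VALUES (?, ?)", [
    (1, "citation"),
    (2, "see-also"),
])
conn.executemany("INSERT INTO anchor VALUES (?, ?, ?, ?)", [
    (1, "source", "manual.xml", "sec-2"),
    (1, "target", "spec.xml", "clause-7"),
    (2, "source", "spec.xml", "clause-7"),
    (2, "target", "glossary.xml", "term-grove"),
])

def links_touching(conn, doc, element_id):
    """Return (link_id, link_type, role) for each link anchored on the element."""
    return conn.execute(
        """
        SELECT l.link_id, l.link_type, a.role
        FROM anchor AS a
        JOIN link AS l ON l.link_id = a.link_id
        WHERE a.doc = ? AND a.element_id = ?
        ORDER BY l.link_id
        """,
        (doc, element_id),
    ).fetchall()

hits = links_touching(conn, "spec.xml", "clause-7")
print(hits)
```

Because the links live outside the documents, the same query answers both
"where does this element link to?" and "what links to this element?", which
is the kind of access the abstract model asks for without dictating storage.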

I also take it as a given that commercial relational database tools are as
optimized as it's possible to make them, for the simple reasons that the
technology is mature and customer pressure for maximum performance is very
high.

I've always liked an implementation approach that starts with a very
general, hopefully standardized data model and then implements that model
in whatever way is most productive in the short term, secure in the
knowledge that by reflecting the more general model, you will be able to
re-implement without danger of data or function loss (assuming your new
implementation implements at least as much of the general model as your
current system).  Starting with an implementation-specific data model
always runs the risk that you've over-optimized somewhere. You can
certainly see this in all the commercial SGML databases, where they failed
to reflect the general SGML data model and inadvertently made it impossible
to add features required by that model without significant rework.

Note that I'm *not* saying you have to implement the *entire* model, just
that whatever you do implement needs to be consistent with the more general
model and provide built-in paths to the larger model.  It also can't omit
those things that are fundamental to or required by the model.
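
One way to picture this, as a sketch only: define the general model as an
abstract interface, implement it however is convenient today, and migrate
later using nothing but the model's own operations. The example assumes
Python; the DocumentStore interface, both implementations, and the migrate
function are hypothetical names invented for illustration, and property
values are assumed to be strings.

```python
from abc import ABC, abstractmethod
import sqlite3

class DocumentStore(ABC):
    """Hypothetical general model: documents with named string properties."""

    @abstractmethod
    def put(self, doc_id, properties): ...

    @abstractmethod
    def get(self, doc_id): ...

    @abstractmethod
    def ids(self): ...

class DictStore(DocumentStore):
    """Short-term implementation: everything held in memory."""

    def __init__(self):
        self._data = {}

    def put(self, doc_id, properties):
        self._data[doc_id] = dict(properties)

    def get(self, doc_id):
        return dict(self._data[doc_id])

    def ids(self):
        return sorted(self._data)

class SqliteStore(DocumentStore):
    """Later implementation: the same model held in a relational table."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE prop (doc_id TEXT, name TEXT, value TEXT, "
            "PRIMARY KEY (doc_id, name))"
        )

    def put(self, doc_id, properties):
        self._conn.execute("DELETE FROM prop WHERE doc_id = ?", (doc_id,))
        self._conn.executemany(
            "INSERT INTO prop VALUES (?, ?, ?)",
            [(doc_id, name, value) for name, value in properties.items()],
        )

    def get(self, doc_id):
        rows = self._conn.execute(
            "SELECT name, value FROM prop WHERE doc_id = ?", (doc_id,)
        ).fetchall()
        if not rows:
            raise KeyError(doc_id)
        return dict(rows)

    def ids(self):
        rows = self._conn.execute(
            "SELECT DISTINCT doc_id FROM prop ORDER BY doc_id"
        )
        return [row[0] for row in rows]

def migrate(src, dst):
    """Copy everything across using only operations the model defines."""
    for doc_id in src.ids():
        dst.put(doc_id, src.get(doc_id))

old = DictStore()
old.put("doc-1", {"title": "What is XML for?", "status": "draft"})
old.put("doc-2", {"title": "Groves"})
new = SqliteStore()
migrate(old, new)
```

Note that this SqliteStore cannot represent a document with no properties at
all, so it implements slightly less of the model than DictStore does. That
is exactly the kind of gap that needs to be known and documented before
relying on a migration being lossless.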

I also believe, and my experience so far supports my belief, that it is in
almost all cases better to prefer generality and flexibility of
implementation over optimal performance. The only place this doesn't hold
is when the short-term performance requirements are so onerous that only a
maximally-optimized solution will do. But even then, you can define and
document the gap between the ideal model and what you've actually
implemented so that when the technology catches up with the performance
requirements, you know which parts are short-term optimization hacks that
you can generalize.  

Generality and flexibility help ensure that the initial integration task of
building the system is as efficient as possible and that the long-term
maintenance costs of the system are minimized, if for no other reason than
that the definition of the governing abstract model is explicit and
available, rather than being hidden in the code and whatever documentation
(if any) the original developers created.

Cheers,

E.
--
<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 75202.  214.953.0004
www.isogen.com
</Address>
