Opinions requested

W. E. Perry wperry at fiduciary.com
Sat Mar 6 17:19:49 GMT 1999


Marcelo Cantos wrote:

> Thank you, Walter, for the erudite response.  I am left in a bit of a
> quandary as to how or even whether to respond.  This is in large part
> due to the fact that, while your post was in response to mine, it is
> not immediately clear to me whether you are addressing my comments
> specifically or rather the general theme of this thread.

Thank you for your kind words. I will confess that much of my post was addressed to the
general theme of the thread.

> On this point, I must make it quite clear that SIM is _not_ an XML
> front end to a data store.  It is an XML (etc.) document repository.

My naive reading of the SIM materials on your website led me to the same conclusion. I am glad
to have your confirmation of it. As a document repository SIM may more nearly compete with the
'grove minder' paradigm than with what I characterize as an XML database.

> One additional, crucial point is that SIM _is_ extensible (though I
> will qualify this presently).  It can be defined to accept markup to
> any degree of strictness or laxity (within the bounds of
> well-formedness or validity, of course).  It can be set up to accept
> any and all markup and do _something_ intelligent with it.  It can
> also be configured to make stringent demands (well in excess of the
> DTD, both with respect to strictness and complexity of constraints) of
> its inputs.

Granted. It is simply that I (perhaps perversely) have defined an XML database engine as one
which implements XML markup. My XML database engine is driven by the markup and must rework
the effective schema and re-cast its processing behavior in sync with changes to the document
instance markup.
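
To make "driven by the markup" concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the "effective schema" is nothing more than the set of child elements and attributes observed for each element name; the function name effective_schema and the sample trade documents are hypothetical and are not taken from any actual engine.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def effective_schema(xml_text, schema=None):
    """Merge the structure seen in one document instance into `schema`.

    `schema` maps an element name to the child element names and attribute
    names observed so far (a deliberately crude, hypothetical model).
    """
    if schema is None:
        schema = defaultdict(lambda: {"children": set(), "attributes": set()})
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        entry = schema[elem.tag]
        entry["attributes"].update(elem.attrib)            # attribute names
        entry["children"].update(child.tag for child in elem)
    return schema


# Two hypothetical instances of "the same" document type that differ in markup.
first = "<trade><price currency='IDR'>100</price></trade>"
second = "<trade><price>100</price><broker id='HK1'/></trade>"

schema = effective_schema(first)
schema = effective_schema(second, schema)   # schema is reworked by the new instance
print(sorted(schema["trade"]["children"]))  # ['broker', 'price']
```

The second instance introduces a broker element that the first did not have, and the recorded schema grows to match it without any design-time declaration.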

> Now, by way of qualification, SIM does not provide free-form runtime
> extensibility (runtime from the administrator's perspective, not
> ours).  Rather it provides the application developer with the
> requisite tools to define, at design time, what structures will be
> supported.  For instance, you cannot, with SIM, perform queries such
> as, "find me all sections containing subsections with an attribute of
> security="public" and at least one paragraph with fewer than four
> words in it."  The semantic complexity of such a query is beyond the
> scope of our product.  However, if one were to know in advance that
> queries about the minimum paragraph length in public subsections will
> be commonplace in the particular application one is developing, then
> SIM could, at design time, be told to create an appropriate index and
> then the above query could, indeed, be performed.
>
> In short, SIM _is_ extensible, but the extensibility is bound somewhat
> earlier than runtime.  In practice, clients never complain about this
> quality.  In fact, it is usually a benefit rather than a hindrance,
> for the same reason that compile-time type checking is a good thing
> to have in a programming language.
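
For concreteness, the example query quoted above can be written as a brute-force scan over a parsed document. This is only a sketch: the element names section, subsection and para, the id attribute and the sample document are assumptions made for illustration, and it says nothing about how SIM itself indexes or evaluates queries.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; all element and attribute names are assumed.
DOC = """
<doc>
  <section id="s1">
    <subsection security="public">
      <para>Too short here</para>
      <para>This paragraph has more than four words.</para>
    </subsection>
  </section>
  <section id="s2">
    <subsection security="private">
      <para>Short</para>
    </subsection>
  </section>
  <section id="s3">
    <subsection security="public">
      <para>Every paragraph here is long enough.</para>
    </subsection>
  </section>
</doc>
"""


def short_public_sections(root, limit=4):
    """Return ids of sections that contain a public subsection holding at
    least one paragraph with fewer than `limit` words."""
    hits = []
    for section in root.iter("section"):
        for sub in section.findall(".//subsection[@security='public']"):
            word_counts = (len("".join(p.itertext()).split())
                           for p in sub.iter("para"))
            if any(count < limit for count in word_counts):
                hits.append(section.get("id"))
                break  # one matching subsection is enough for this section
    return hits


root = ET.fromstring(DOC)
print(short_public_sections(root))  # ['s1']
```

The scan visits every paragraph at query time. The design-time alternative described in the quoted text would instead precompute an index (for instance, the minimum paragraph length per public subsection) so that the same question can be answered without walking the document.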

All of these are commendable design decisions. They are not, IMHO, realizations of the unique
qualities and potential of XML. On that, reasonable people may differ.

> I also take issue with Walter's remark that an XML database should be
> manipulated by and defined through the medium of XML.  This sounds
> analogous to suggesting that relational databases should be defined
> and manipulated by markup.

No, by relational schema, as you acknowledge in the next line.

>  Now, it is true that relational schema
> are, themselves, typically stored as relations (one will, for example,
> find a ".TABLES" table, a ".FIELDS" table, a ".INDEXES" table, etc.
> inside a database).  However, it seems to me patently absurd to
> suggest that SQL (whether DML or DDL) be expressed in terms of tuples
> and relations.  Now, while it does not seem likewise absurd to suggest
> that XML queries and data definition constructs be defined as XML, the
> truth of such a suggestion is anything but self-evident.  Why should
> one not use an SQL-like language to define and query XML databases?
> There may or may not be merit in such an approach, but it seems no
> more or less appropriate than a query/data definition language cast in
> XML.  Indeed, many of the query language position papers at W3C do not
> use XML syntax.  Data definition and query languages are
> meta-constructs.  They are not part of the data, but rather operate on
> the data and structures.  This suggests that while it may be possible
> to fold the system in on itself by expressing meta-structure as data,
> it would be unwise to proceed down this path in _a priori_ fashion.
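
The observation that relational schema are themselves stored as relations can be seen in a small sketch using SQLite through Python's standard sqlite3 module. SQLite is chosen only as a convenient example; its catalog table is named sqlite_master rather than the ".TABLES" and ".INDEXES" names used in the quoted text, and the trades table and its index are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The definitions are issued in SQL DDL, not as tuples...
conn.execute("CREATE TABLE trades (id INTEGER PRIMARY KEY, currency TEXT)")
conn.execute("CREATE INDEX trades_currency ON trades (currency)")

# ...yet the resulting schema is stored as rows in an ordinary relation,
# which can be queried like any other table.
rows = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()
for row in rows:
    print(row)
# ('table', 'trades', 'trades')
# ('index', 'trades_currency', 'trades')

conn.close()
```

The catalog is data, while the language that creates and queries it is not itself expressed as rows, which is the distinction the quoted paragraph draws.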

By following the path indicated by just such an a priori judgment I arrived at the conclusions
which I have shared with you. I am implementing the resulting design and, I suppose, the
almighty market will render the final verdict.

> The serious user of XML does not have a heterogeneous collection of
> vaguely defined documents with a motley crew of DTDs and well-formed
> markup.

That is exactly what I (and my customers, once we re-state their documents in various legacy
forms as XML) have to deal with. We process settlements of cross-border trades and the
regulatory reporting required by multiple overlapping legal jurisdictions. If I have advice of
a trade execution in the customary form used in, say, Djakarta, and the interested parties to
whom I must report it are a UK fiduciary, a Swiss depot bank, a US money manager and a Hong
Kong broker, as well as the various regulators which the involvement of each of those parties
entails, I must (in my opinion) drive the entire process from a properly marked-up document
which succinctly expresses the facts of the transaction reported. That document, received by
each of the interested parties, must be instantiated in the system--and I would hope the
database--of each in a form which may well require re-writing the schema upon which it will be
realized.

>  Most users have a well defined data set for which they want
> to define efficient structures for storage and retrieval (if they
> aren't interested in efficiency then their problem isn't particularly
> interesting -- any tool will do).  In the few cases where they do have
> arbitrary structure to deal with, more often than not they are only
> interested in the content and are likely to throw the structure away.

As I hope the use case fragment above illustrates, users may have very well defined
structures, well-suited to their specific needs. Those structures, however, may not
accommodate the instance documents which they receive as input data and which, in the
real-world examples I am familiar with, may exhibit differences of data structure on each
occasion.

Respectfully,

Walter Perry



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



