Topic Maps on SQL

Steven R. Newcomb srn at techno.com
Thu Nov 26 01:54:19 GMT 1998


[Tim Bray:]

> You know, this goes straight to the core of a deep issue.  Where
> I have often felt out of sync with the grove/property-set evangelists
> is that I perceive syntax as fundamental and any particular
> data model as ephemeral.  I guess I must admit that I don't believe in 
> "information standards that are independent of lexical/syntax
> representation".

Far from being independent of each other, the interchange syntax and
the grove representation should be two mutually interdependent sides
of the same coin.  Information in XML's interchange syntax is
practically useless to an application until it is parsed, but it's
easy to preserve and highly interchangeable.  XML information that has
been parsed (and converted
into objects, for example) is very useful, but it is platform-specific
and not interchangeable.  Today we have the two sides, but no coin.  A
property set for XML would define, formally and inescapably, all the
relationships between the two sides; it would be the missing coin.
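
To make the "coin" idea concrete, here is a minimal Python sketch.  It
is not the ISO property set notation and not anything the W3C has
defined: the PROPERTY_SET table, the node classes "element" and "text",
and the helper names build_grove and conforms are all hypothetical,
chosen only for illustration.  It declares which node classes exist
and which properties each must carry, builds a grove-like tree from
the syntax, and checks that tree against the declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy "property set": each node class and the properties
# a node of that class must carry.  Real property sets are far richer.
PROPERTY_SET = {
    "element": ("gi", "attributes", "content"),
    "text": ("data",),
}

def build_grove(elem):
    """Map one parsed element (the syntax side) to a grove-like node."""
    content = []
    if elem.text:
        content.append({"class": "text", "data": elem.text})
    for child in elem:
        content.append(build_grove(child))
        if child.tail:
            content.append({"class": "text", "data": child.tail})
    return {
        "class": "element",
        "gi": elem.tag,                    # generic identifier (element name)
        "attributes": dict(elem.attrib),
        "content": content,
    }

def conforms(node):
    """True if the node and all its content match the toy property set."""
    props = PROPERTY_SET.get(node["class"])
    if props is None or set(node) != {"class", *props}:
        return False
    return all(conforms(child) for child in node.get("content", []))

grove = build_grove(ET.fromstring('<p id="x">Hello <b>world</b>!</p>'))
print([n["class"] for n in grove["content"]], conforms(grove))
```

Because the declaration is explicit, a node of an undeclared class
(a comment, say) is simply rejected by conforms rather than being
left to each application's judgment.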

One simple and ineluctable reason why we need a property set for XML
is so that we can know for sure what a node is, so that we can count
nodes, so that XPointers can work reliably.  If the data model of XML
is "ephemeral", XPointers will not work reliably across dissimilar
applications; what an XPointer means will be a matter of opinion.
Unless we know precisely what everybody else regards as the
addressable components of an information resource, we can't reliably
express the addresses of those components.  The grove paradigm and the
property set formalism were developed in order to solve this very
problem, among others, for the SGML family of standards.  
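
The counting problem can be shown with a short Python sketch.  The
two "models" below are assumptions made up for this illustration, and
the "n-th child" address is a stand-in, not real XPointer syntax.
Both models read the same document with the standard xml.dom.minidom
parser; they differ only in what they agree to call a node.

```python
from xml.dom import minidom

DOC = "<doc>\n  <!-- note -->\n  <a/>\n  <b/>\n</doc>"

def children_all(node):
    """Model A: every child counts (whitespace text, comments, elements)."""
    return list(node.childNodes)

def children_elements(node):
    """Model B: only element children count."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]

def describe(node):
    """Give a short label for a DOM node."""
    if node.nodeType == node.ELEMENT_NODE:
        return "element:" + node.tagName
    if node.nodeType == node.COMMENT_NODE:
        return "comment"
    if node.nodeType == node.TEXT_NODE:
        return "text"
    return "other"

def resolve(model, parent, n):
    """Resolve the hypothetical address 'n-th child' (1-based) under a model."""
    kids = model(parent)
    return describe(kids[n - 1]) if 0 < n <= len(kids) else None

root = minidom.parseString(DOC).documentElement
print(resolve(children_all, root, 2))       # comment
print(resolve(children_elements, root, 2))  # element:b
```

The same address, "second child of doc", lands on a comment under one
model and on the b element under the other.  Neither answer is wrong
until a formal model says which things are nodes.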

The same intellectual tools will work perfectly for XML, although the
result of a consensus-seeking W3C process might be an XML property set
that doesn't resemble the standard SGML property set very much, if at
all.  But that fact only dramatizes the expressive power of the
property set formalism.  

(And, speaking only for myself, I can face that particular outcome
with equanimity.  It's vital to have a model and to express it so
formally and inescapably that conformance will be easy to verify, and
so that the meaning of an XPointer cannot be the subject of a dispute.
There could be many kinds of differences between the XML property set
and the SGML property set that would not threaten the viability of
XML.)

As I see it, the only alternatives to creating and adopting an XML
property set are:

(1) To create a new formalism pretty much like the property set
    formalism, and to use it to say exactly what the nodes are, and
    which syntactic constructs they correspond to.  I think this is an
    acceptable choice, because the only downside is that some
    redundant work is done.  (It's just another case of "Not Invented
    Here".)

(2) To use EXPRESS to say exactly what the nodes are, and which
    syntactic constructs they correspond to.  EXPRESS is a much
    bigger, much deeper language than the property set formalism, and
    my impression is that it will take only a minor amount of
    adaptation to use EXPRESS for this purpose.  I do not see EXPRESS
    as the solution of "least necessary complexity" for today's
    pressing problems, but I agree with Eliot that EXPRESS or
    something very much like it will ultimately take over for (and/or
    be merged with) the property set formalism.  I think the EXPRESS
    solution is an acceptable choice, because the only downside (if we
    can call it a downside) is that we all have to learn EXPRESS,
    which is a mighty interesting and extremely powerful schema
    language.  (I love the side effects of this particular choice.)

(3) To use some other existing formalism to say exactly what the nodes
    are, and which syntactic constructs they correspond to.  I don't
    know what the candidate formalisms might be.

(4) To formalize no coin, and to continue to have only the two sides.
    This would be to ignore the issue and, most likely, to let the
    biggest honking syndicate on the Internet force its interpretation
    (which it may attempt to keep secret) of what are the classes and
    properties of XML nodes on all of us.  The blackest scenario is a
    war of attrition during which addressing will remain unreliable,
    and after which we will have substantial amounts of legacy data
    that contain many off-by-n and other addressing errors.  This is
    an unacceptable and irresponsible choice.  Frankly, I don't see
    how the XLink/XPointers people could possibly ever finish their
    work without a formal model of parsed XML, so I see this as by far
    the least likely choice.

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn at techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



