XML-Data: advantages over DTD syntax?

Mon Sep 29 21:17:13 BST 1997

At 11:46 AM 9/29/97 -0700, Derek Denny-Brown wrote:

>>specifically say that a duck does certain things differently), and (3)
>>references to generic animals can also point to ducks.  To put this in
>>traditional OO terms, Duck inherits data, behavior, and type from Animal. In
>>SGML, it can't inherit behavior, but it can inherit data and type.
>>[snip]
>
>One thing which Henry Thompson's presentation at HyTime '97 brought forth
>in my mind was SGML's lack of support for (3) above.  Architectural forms
>do little or nothing to rectify this, although AF could provide a solution
>if used in an environment which supports simultaneous views of the source and
>AF instances with links between the two.  

I'm not sure I follow you.  If you have an architecture-aware search
engine, then you should be able to do a query of the form "find all
elements derived from the form 'animal'", which will include both 'animal'
elements and 'duck' elements.  How is this not 3?  Or do I misunderstand
Henry's requirement?

Something in the system has to know that a duck is a kind of
animal--architectures convey this information as clearly as any other
method, so I don't see how they can't satisfy the requirement.
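
A minimal sketch of the kind of query described above, in Python. It assumes a hypothetical convention in which an element is derived from the form 'animal' if its element type is 'animal' or if a form attribute (here called 'zoo', an invented name) gives that form. The sample document and the function name are illustrative only, not taken from any real architecture engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical convention: the 'zoo' attribute names the architectural form
# an element is derived from. An element whose type is the form name itself
# is treated as derived from that form without needing the attribute.
FORM_ATTR = "zoo"

DOC = """
<farm>
  <animal name="generic"/>
  <duck zoo="animal" name="Donald"/>
  <tractor name="Deere"/>
</farm>
"""

def derived_from(root, form):
    """Return all elements derived from the architectural form `form`."""
    return [
        el for el in root.iter()
        if el.tag == form or el.get(FORM_ATTR) == form
    ]

root = ET.fromstring(DOC)
matches = [el.tag for el in derived_from(root, "animal")]
print(matches)
```

The query picks up both the 'animal' element and the 'duck' element, and ignores the 'tractor', which is all that requirement (3) appears to ask for.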

>                                          Part of the problem is that AF's
>do little, if anything, to make life easier when I want to build a DTD which
>extends an existing DTD.  I have to copy the existing DTD and modify it and
>then add the AF meta-info which maps the new DTD back to the old.  But now
>I have a completely different DTD, from the point of view of _all_ existing
>SGML software.  Sure I can map my documents to the original, but I can not
>see it as both... I must either remove all value added by my modified DTD,
>or abandon existing options based on the original DTD, since the new
>document is not conforming to the original DTD.  Obviously, since I put the
>time into building the new DTD, I think there is some significant value
>added, but I can not leverage the value added while at the same time
>leveraging the use of the existing DTD as a base architecture.

Again, I don't follow you.  Either you really have a completely new DTD and
you have to define the processing for it completely or you have a DTD
derived from an architecture *and* you have architecture-aware processors
that let you apply the architecture-specific processing to your new
documents, leaving only the new stuff to be defined.  How do architectures
not do this? How would the XML-Data proposal do this any better? In both
cases, it's a function of the processing code both providing the methods
for the base classes and the processing system understanding the derivation
hierarchy.

You can also use the trick of defining the architecture such that its
declarations (and in particular, the parameter entities used to configure
and modularize it) can also be used to create declarations for documents
derived from the architecture.  In essence, you combine architectural
derivation with the sort of clever modularization typified by the TEI and
Docbook declaration sets.

Your comments suggest that you are confusing *parsing* with *processing*.
Parsing is not an issue, because the document is either valid to its DTD or
it isn't, and is either valid with respect to the governing schema or isn't.
Whether or not the document is valid doesn't affect how it is *processed*
after parsing, which is purely a function of methods applied to types, not
parsing, and is entirely independent of how the type information got
associated with the data (whether by the architecture syntax or the
interpretation of some XML-Data document).
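
To illustrate the point, here is a rough Python sketch in which the same method is applied to the same type regardless of how the type information was obtained. The 'zoo' attribute, the supertype table, and all function names are assumptions made for the example; the table only stands in for whatever a schema interpreter would produce.

```python
import xml.etree.ElementTree as ET

# Methods are attached to types, not to parsing or to any schema syntax.
METHODS = {
    "animal": lambda el: f"feed {el.get('name')}",
}

def type_from_attribute(el):
    # Architecture style: the instance carries the form name in a
    # (hypothetical) 'zoo' attribute; fall back to the element type.
    return el.get("zoo", el.tag)

# Schema style: an external table, standing in for the result of
# interpreting some separate schema document.
SCHEMA_SUPERTYPES = {"duck": "animal"}

def type_from_schema(el):
    return SCHEMA_SUPERTYPES.get(el.tag, el.tag)

def process(el, type_of):
    """Look up the method for the element's type and apply it, if any."""
    method = METHODS.get(type_of(el))
    return method(el) if method else None

duck = ET.fromstring('<duck zoo="animal" name="Donald"/>')
via_attribute = process(duck, type_from_attribute)
via_schema = process(duck, type_from_schema)
print(via_attribute, via_schema)
```

Both calls yield the same result, because the processing depends only on the type that was resolved, not on the mechanism that resolved it.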

>This is exactly what OO Inheritance allows a programmer to do.  You need
>an extra attribute? Easy!  With AF's I either see the document as the new
>DTD or I can not see the attribute... value lost either way.

This is only true if you define your processing in terms of architectural
instances derived from documents, but clearly, that is not the way
architectures are intended to be used in the general case.  The
architecture provides part of the processing and an architecture-aware
processor must be able to associate architecture-specific processing with a
document, but it's not an all-or-nothing proposition.  I must always be
aware of the document's architectural nature as well as its base nature
unless the only processing I care about at the moment is that defined by
the architecture.
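
A small Python sketch of what "not all-or-nothing" might look like in practice. It assumes a hypothetical 'animal' architecture signalled by an invented 'zoo' attribute, and a derived 'duck' type that adds a 'volume' attribute; the handler names and tables are made up for the illustration.

```python
import xml.etree.ElementTree as ET

def animal_processing(el):
    # Processing defined by the (hypothetical) 'animal' architecture.
    return [f"animal:{el.get('name')}"]

def duck_processing(el):
    # Processing defined only for the derived type, using its extra attribute.
    return [f"quack-volume:{el.get('volume')}"]

ARCH_METHODS = {"animal": animal_processing}
BASE_METHODS = {"duck": duck_processing}

def process(el, form_attr="zoo"):
    """Apply architectural processing, then any element-type-specific processing."""
    results = []
    arch = ARCH_METHODS.get(el.get(form_attr, el.tag))
    if arch:
        results += arch(el)
    base = BASE_METHODS.get(el.tag)
    if base:
        results += base(el)
    return results

duck = ET.fromstring('<duck zoo="animal" name="Donald" volume="11"/>')
result = process(duck)
print(result)
```

The processor works on the original element throughout, so the inherited processing and the added attribute are both available at once; no separate architectural instance has to be generated first.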

The XML-Data proposal (to the degree I understand it) and architectures
appear to convey exactly the same information about a schema and a
document's derivation from it.  The fact that the XML-Data syntax appears
to be more "object-oriented" must be a red herring because in both cases
you are providing a purely declarative data description, not the definition
of active methods.  The only way in which XML-Data might appear to be
object-oriented is XML-Data-specific semantics for generating complete
declarations from XML-Data specifications based on implication rules, but
these will either be effectively identical to features in the AFDR syntax,
such as multiple attlists for the same element type, or facilities of
limited utility, such as content model implication (which can be managed
pretty well with parameter entities).  In other words, I don't see that
it's possible for anything like XML-Data to provide significantly more
assistance in creating and managing declaration sets and meta-DTDs than you
already get with the AFDR and normal SGML facilities.

This is why confusing architectures with object-oriented programming
approaches is so dangerous: they are not the same thing and thinking that
they are leads to erroneous conclusions and unrealistic expectations (such
as that content models can be somehow inherited in any but the most trivial
ways).

Note too that when you have DTD-less documents, problems of DTD syntax
munging go away because you don't have any DTD syntax to mung.  Any munging
is managed by the creators of derived schemas.  This is one of the beauties
of XML--it frees us from the need to conflate schema definition with the
definition of the parsing rules for document instances.

Cheers,

E.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/