Namespaces, Architectural Forms, and Sub-Documents

Paul Prescod papresco at
Wed Feb 4 18:05:50 GMT 1998

David Megginson wrote:
> It seems to me that when you want to embed large contiguous structures
> from different document types in an XML document, each different
> namespace should be its own sub-document, referenced as a binary
> entity (or using whatever other mechanisms are available in XML-Link).
> Good tools and protocols should make it possible to create, transmit,
> and process compound documents as if they were single files.  This
> will be necessary anyway for supporting multimedia.


Making my five-line formula into a different document with a different
document type is *not easy*. It is a royal pain in the butt, which is
why almost nobody does it. I have seen the CALS table model merged with
dozens of DTDs and have never once seen someone take the opposite
approach of making CALS tables "subdocuments."

We can imagine a theoretical universe in which the tools are so good
that this is easy, but if we are imaginative in this way, we can paper
over any design flaw in SGML or XML with the claim that "the tools can
handle it." If XML or SGML were designed to be manipulated only through
tools, that would be acceptable. But they were not: they were designed
to be written in text editors and, surprisingly enough, a huge number of
people do that.

> Here are some general guidelines:
> * Architectural forms are most suitable for applications where
>   multiple inheritance is required, or where elements belonging to a
>   different document type are scattered throughout a document.

I agree with the former. I disagree with the latter. A simple modules
proposal handles the latter nicely.

> * Sub-documents are most suitable for applications where all of the
>   element belonging to a different document type are rooted in a
>   single subtree.

Subdocuments have many problems, including:
 * typing convenience (separate files...yuck)
 * element type constrainability (how do I specify a SUBDOC root element
   type in a content model?)
 * "content model communication" (how do I pass a %cell; content model
   into my table subdoc?)
 * modularity (subdocs must be declared at the top of the document, an
   annoying non-local maintenance issue)
 * ID linkage (even for simple links I must use some more advanced
   linking strategy)
 * semantics (i.e. SUBDOC needs VALUEREF or something else on top of
   it)

That does not mean that they are never useful. There are some hard
problems where they are very useful. But for the *simple problem* of
embedding MATH in HTML (for example) they are overkill, as are
architectural forms. *KEEP SIMPLE THINGS SIMPLE*
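
To make the "simple problem" concrete, here is a rough sketch of what
inline embedding with "prefix:name" element type names could look like
to a non-validating parser. Everything in it is an assumption for
illustration: the math:frac, math:num and math:den names are made up
(they come from no real math DTD), and the colon convention is only one
possible syntax. It uses Python's standard expat binding with namespace
processing turned off, so the prefixed names arrive as plain strings.

```python
import xml.parsers.expat
from collections import Counter

# Hypothetical compound document: HTML-like markup with a small formula
# typed inline. The "math:" element names are invented for this sketch.
DOC = """<html>
<body>
<p>The ratio is
<math:frac><math:num>a</math:num><math:den>b</math:den></math:frac>
as expected.</p>
</body>
</html>"""


def count_by_prefix(text):
    """Count start tags per prefix; unprefixed names are counted under ''."""
    counts = Counter()

    def start(name, attrs):
        # Split "prefix:local" on the first colon; names without a colon
        # belong to the host document type.
        prefix, sep, _local = name.partition(":")
        counts[prefix if sep else ""] += 1

    # No namespace separator is given, so expat reports raw element names.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(text, True)
    return counts


counts = count_by_prefix(DOC)
```

The formula stays in the same file, in the same text editor, and a tool
that only cares about one vocabulary can pick out its elements by
prefix. This says nothing about validation, which is the part a modules
proposal would still have to define.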

> "namespace:gi" element type names are unsuitable for several reasons:
> 1) The complexity of namespaces is exposed to the author rather than
>    hidden in the DTD (as it is, optionally, with architectural forms).

As my paper pointed out, we now live in a universe where the person
creating the DTD is often the author. You live in a world where people
pay you to hide things in DTDs. Most of the people on the Web don't have
a David Megginson or a Paul Prescod to do that for them. Their problems
are still real.

> 2) Multiple inheritance is not possible (X can be a kind of Y or a
>    kind of Z, but not both).

Many people do not want multiple inheritance and, as my paper pointed
out, it makes some problems much more difficult to understand and solve.

> 3) Standard DTD-based validation is not possible, and it is more
>    difficult to create DTD-driven authoring tools.

I think you are totally wrong here. As a programmer, I could implement
modules in an SGML editor in MUCH less time than it would take me to
implement architectural forms.

> 4) Both architectural forms and sub-documents can be fully supported
>    under the existing spec by _both_ validating and non-validating XML
>    parsers: no changes necessary.  Furthermore, they will also remain
>    compatible with SGML tools.

That's great for today. But for tomorrow, ISO has already undertaken to
change SGML. Do you propose that they should not add anything to SGML
that is not compatible with existing tools? My position is that the very
point of a revision is to make things easier and more powerful and that
this is thus the perfect opportunity to make this common problem easier
to solve, even if it breaks some old tools.

> Why are people worried about writing specs to solve a problem that
> already has good, working, available solutions?

Because the good, working solutions are solutions to much harder
problems and make simple jobs needlessly difficult. 


xml-dev: A list for W3C XML Developers.