Namespaces, Architectural Forms, and Sub-Documents

David Megginson ak117 at
Wed Feb 4 22:52:40 GMT 1998

Paul Prescod writes:

 > Making my five-line formula into a different document with a different
 > document type is *not easy*. It is a royal pain in the butt, which is
 > why almost nobody does it. I have seen the CALS table model merged with
 > dozens of DTDs and have never once seen someone take the opposite
 > approach of making CALS tables "subdocuments."

You have stated a good, general rule of thumb; in this case, however,
it is important to remember that a central component of simplicity is
consistency (by the way, I _have_ seen CALS tables as SGML
subdocuments, but one of my dreams in XML is never to hear the words
"CALS table model" again).

XML documents may (and perhaps, usually will) contain non-XML objects
such as wordprocessor documents, spreadsheets, MPEG clips, Java
applets, audio sequences, and many others -- to date, thankfully, no
one has proposed uuencoding any of these and dumping them inline between
a start and end tag.

Why should we treat an equation marked up in XML differently than an
equation marked up in Microsoft Word?  It seems easier (from a user's
perspective) to treat everything as objects, rather than defining one
special case.  Object-oriented programming has proven the value of
encapsulation, and the compound-document idiom is standard on millions
of desktops already, so we can hardly argue that subdocuments are an
unfamiliar approach.

I am a big fan of pragmatism on the implementation side, as people
might have noticed from my postings on the design of AElfred; on the
standards side, though, I wouldn't want to cripple a spec just to work
around a temporary problem that will have to be solved anyway for
non-XML objects.  SGML people will remember unfortunate features like
SHORTREF, DATATAG, and OMITTAG -- included a little over a decade ago,
likewise, for the sake of making things easy and working around
temporary deficiencies in the available tools.  XML is popular mainly
because it has finally banned all of these.

 > Subdocuments have many problems including 
 >  * typing convenience (separate files...yuck)

(See comments above).

 >  * element type constrainability (how do I specify a SUBDOC root element
 > type in a content model?)

Use HyTime (just joking).  Seriously, I cannot see that this is a
worse case than not being able to use a DTD at all.  The general idea
of compound documents (Netscape with plug-ins, OLE documents, Andrew
documents, or otherwise) is that you can plug in any object -- I had
imagined that this was the goal of namespaces as well.  In XML you can
constrain the placement of pointers to external objects, at least.

 >  * "content model communication" (how do I pass a %cell; content model
 > into my table subdoc)

You're thinking of CALS here.  I'd suggest that we move away from the
older SGML model of heavily parameterised DTDs (as from heavily
#IFDEF'ed C header files): remember that one of the arguments for the
namespace model is to reuse stylesheets and other processing
specifications -- if a table model can vary its content unpredictably,
then you will not be able to reuse stylesheets anyway.  Again,
encapsulation is a big win, and it keeps things easy.

That said, if you _really_ need to pass a %cell; content model to a
subdocument, you can always include the same file of entity
declarations in both the parent and the child.  I'd recommend against
it, but it's possible if you want to do it.

 >  * modularity (subdocs must be declared at the top of the document, an
 > annoying non-local maintenance issue)

Only if you use an entity/notation mechanism.  You could just as
easily use a URL/MIME approach:

  <object url="formula1.xml" mime="text/xml"/>

The question of how to include external objects is a separate debate,
and subdocuments can swing easily from either vine.
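
As a rough illustration of the URL/MIME approach, here is a minimal
Python sketch of how a processor might find the external objects in a
parent document and decide what to do with each one. The element and
attribute names come from the example above; the sample document, the
second object, and the function names are assumptions made up for the
illustration, not part of any spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical parent document. Only the <object url="..." mime="..."/>
# shape is taken from the message; everything else is illustrative.
PARENT = """<chapter>
  <para>Consider the following formula:</para>
  <object url="formula1.xml" mime="text/xml"/>
  <object url="clip.mpg" mime="video/mpeg"/>
</chapter>"""


def list_objects(xml_text):
    """Return (url, mime) pairs for every <object>, in document order."""
    root = ET.fromstring(xml_text)
    return [(el.get("url"), el.get("mime")) for el in root.iter("object")]


def describe(url, mime):
    """Say how a processor might treat one object, based on its MIME type."""
    if mime == "text/xml":
        return f"parse {url} as a separate XML document"
    return f"hand {url} to an external handler for {mime}"


for url, mime in list_objects(PARENT):
    print(describe(url, mime))
```

Note that the parent is parsed without ever opening formula1.xml: the
XML equation and the MPEG clip are located the same way, and only the
dispatch step differs.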

 >  * ID linkage (even for simple links I must use some more advanced
 > linking strategy)

HREFs would work fine -- HTML people are already used to

  <a href="book.html#chapter3">

so we should have no confusion here.  Furthermore, you have the
advantage that your document's validity does not depend on its child
objects (this is very important for document management in large,
multi-author systems -- if subdocuments are atomic, then a change by
one author to a table, for example, will not make the containing
chapter invalid).  Again, as in programming, encapsulation will be a
big win in the medium term.
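
To make the HREF idea concrete, here is a small Python sketch of
resolving a link of the form document#id into an element of a separate
subdocument. The in-memory DOCS table, the file name book.xml, and the
choice to treat an attribute literally named "id" as the link target
are all assumptions for the sake of the example; a real system would
fetch the document and take ID attributes from its DTD.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Hypothetical in-memory store standing in for files or a web server.
DOCS = {
    "book.xml": (
        '<book><chapter id="chapter3">'
        "<title>Tables</title></chapter></book>"
    ),
}


def resolve(href):
    """Split href into document and fragment, then look up the target.

    Returns the root element when there is no fragment, the first
    element whose "id" attribute equals the fragment otherwise, or
    None if no such element exists.
    """
    url, fragment = urldefrag(href)
    root = ET.fromstring(DOCS[url])
    if not fragment:
        return root
    for el in root.iter():
        if el.get("id") == fragment:
            return el
    return None


target = resolve("book.xml#chapter3")
print(target.tag, target.findtext("title"))
```

The linking document is never parsed together with its target, so a
broken or changed subdocument shows up as a failed lookup (None here)
rather than as a validity error in the parent.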

 >  * semantics (i.e. SUBDOC has need VALUEREF or something else
 > on top of subdoc)

I expect that XLL will provide mechanisms for expressing the 'embed'

All the best,


David Megginson                 ak117 at
Microstar Software Ltd.         dmeggins at
