Another look at namespaces

Tim Bray tbray at textuality.com
Thu Sep 16 18:37:10 BST 1999


At 08:30 AM 9/16/99 -0700, Andrew Layman wrote:
>With all due respect to Simon St. Laurent, I believe that Tim Berners-Lee
>was correctly precise when he wrote "the document corresponding to the
>namespace URI becomes the place where the namespace-author can put
>*definitive* information about the intent of the namespace."

I and many others disagree, for reasons best expressed in Jon Bosak's
summary of the issues.  Among other things, I don't believe that most
interesting namespaces *have* definitive information, but have semantics
that are communicated via some messy combination of schemas, stylesheets,
prose documentation, and running code.

>While there are many processes that can be applied to a document, and
>correspondingly many specifications of those processes, there can be, for a
>given term in a namespace, at most one correct *definition*.  

I disagree.  I think this is a strong and surprising claim, and I would
like to see some real-world supporting evidence.

>As background, and to help make clear some of the thinking on this subject,
>the following is the text of a message I posted a few weeks ago to the XML
>Plenary:

As further background, here is my follow-up, from the plenary:

At 04:01 PM 9/8/99 -0700, Andrew Layman wrote:
>Tim Bray wrote that, although a schema may be somehow associated with a
>namespace, the "meaning" of markup will be determined in a number of ways
>such as style sheets, or procedural code, or maybe the schema.  I believe
>this understates the importance of the schema.  A schema makes a
>contribution to the Infoset. It does this by providing default values and --
>under some recent proposals -- by indicating type information, which may be
>considered also a form of default value.  Elements defined by a schema, when
>used in an instance document in a validating processor, will have these
>default values available, and this fact is pertinent to the author of the
>document.  This means that an element is incompletely read if the schema for
>it is not read. 

Well, to some extent both Andrew and I are guilty of guessing-the-future,
because this discussion is only meaningful in the context of the existence
of a new, more ambitious, kind of schema.  Andrew's claim (I suspect he'd
agree) is certainly not true in the context of DTDs.

Now, let us remember that one of the big selling points of XML is that it 
allows you to process a document standalone without recourse to *any* 
external resources, and I suspect that this style of processing will 
remain very important regardless of how ambitious schemas get.  In 
particular, the class of XML apps that need to consult the DTD to do what 
they do is vanishingly small - the only ones I know of are authoring
facilities.

So, the prediction here that "an element is incompletely read if the schema
for it is not read" represents a HUGE shift in the central XML processing
paradigm.  Not that I'm claiming it's wrong - but it's also not self-
evidently true on the basis of the evidence before us.  

>Unlike the various processings that may or may not be applied to a document,
>a cited schema is part of the information conveyed by the document. When a
>namespace has an associated schema, that schema is part of the input to any
>further complete processing of the referencing document.  

This is an assertion with which I completely disagree, and which really
needs some supporting evidence.  Right at the moment the overwhelming 
paradigm is that once an application *recognizes* which tag-set is in play, 
most of the business logic is then more or less hard-wired into the 
software.  I think we can all agree that less hardwiring and more 
rule-driven processing is a good thing, but getting there from here is a 
huge task, and one in which the IT professions have been engaged throughout 
my embarrassingly lengthy career in this profession, and we're not there 
yet.  
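
To make the "recognize the tag-set, then run hard-wired logic" pattern
concrete, here is a minimal sketch in Python.  The namespace URIs, element
names, and handler functions are all invented for illustration; they are
assumptions, not anything defined in this thread.  Note that the namespace
URI is used purely as an identifier to pick a handler and is never fetched.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, used only as opaque identifiers.
INVOICE_NS = "http://example.invalid/ns/invoice"
MEMO_NS = "http://example.invalid/ns/memo"


def handle_invoice(root):
    # Business logic the programmer wrote after reading the documentation.
    amounts = [float(e.text) for e in root.iter(f"{{{INVOICE_NS}}}amount")]
    return f"invoice total {sum(amounts):.2f}"


def handle_memo(root):
    subject = root.findtext(f"{{{MEMO_NS}}}subject", default="(no subject)")
    return f"memo about {subject}"


# The dispatch table is the only place the tag-set is "recognized".
HANDLERS = {INVOICE_NS: handle_invoice, MEMO_NS: handle_memo}


def namespace_of(tag):
    # ElementTree spells qualified names as "{namespace-uri}local-name".
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


def process(xml_text):
    root = ET.fromstring(xml_text)  # no schema or DTD is consulted here
    handler = HANDLERS.get(namespace_of(root.tag))
    if handler is None:
        raise LookupError(f"unrecognized tag-set: {root.tag}")
    return handler(root)
```

Everything the program "knows" about the vocabulary lives in the handler
bodies, which is exactly the hard-wiring described above.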

"Any further complete processing?"  Wow.  That's a different universe.
Unless you mean something very strong by the word "complete" here that
would not be expected to apply in the majority of application scenarios.

>A schema is not on
>par with other forms of processing that might be specified, say by style
>sheets or procedural code. It is prior to other processing.

Once again, that's an assertion that is empirically not true in the context
of today's technology.  I accept the possibility that it might become
increasingly true a couple of technology generations from now - but it
is certainly not a foregone conclusion.  And I remain convinced that there
will be a hugely important class of lightweight performance-critical
applications that just do not have bandwidth to do schema processing
at run-time.

>For specific processing, to make use of a
>namespace one needs to know what names are in it, which are not, and what
>the meaning is for each name.  Such information may be obtained by various
>means such as by reading a schema, by reading a text document, or from
>conversations with other people, but in all cases the meaning is not created
>by the processing code; the meaning informed the programmer who wrote the
>code.  For a namespace to be reliably useful, there must be a document
>defining its contents and their meaning.  A schema is, for many namespaces,
>that document.

Agreed absolutely.  But the current flow is that the *programmer* comes
to understand the meaning, by reading the prose documentation and the
schema, and then encodes that knowledge, along with a bunch of contextually
dependent business rules, in software.  This paradigm actually works 
reasonably well at the moment.  

So concretely, what are we arguing over?  I *think* we're arguing over
whether schemas are special enough that the namespace URI should be used
to fetch them.

My position is that it would indeed be desirable to be able to use the
namespace URI to fetch schemas, but that there are many other things 
the fetching of which is of at least comparable importance: human-readable
documentation, stylesheets, Java classes, schemas in a different dialect.  
(For example, once XML schemas become available, will Microsoft tell all
its customers that are now using the variant-XML-data-schemas that come
with IE5 to discard them, and instantly drop support in the software?  I
doubt it.  What then does the namespace URI point at?)  To do these things 
effectively, we need a level of indirection and a place to put some metadata 
concerning what associated resources are available.  Which is why I keep 
coming back to the desirability of a packaging framework for all these 
things, the development of which is in the continued-XML-work package
currently before the W3C membership.
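
As a rough sketch of what such a level of indirection might look like, the
Python below models a packaging record that maps a namespace URI to a list
of associated resources, each tagged with a kind.  No packaging format
exists yet, so the `Resource` type, the kind labels, and every URL here are
hypothetical, chosen only to show the shape of the idea.

```python
from dataclasses import dataclass

# Hypothetical namespace URI and resource locations, for illustration only.
ORDER_NS = "http://example.invalid/ns/order"


@dataclass(frozen=True)
class Resource:
    kind: str  # e.g. "documentation", "stylesheet", "schema:<dialect>"
    href: str


# One package per namespace: the metadata says what is available,
# rather than the namespace URI pointing at any single resource.
PACKAGES = {
    ORDER_NS: [
        Resource("documentation", "http://example.invalid/order/spec.html"),
        Resource("schema:xml-schema", "http://example.invalid/order/order.xsd"),
        Resource("schema:xml-data", "http://example.invalid/order/order.xdr"),
        Resource("stylesheet", "http://example.invalid/order/order.xsl"),
    ],
}


def find_resources(namespace_uri, kind_prefix):
    """Return the hrefs of resources whose kind starts with kind_prefix."""
    package = PACKAGES.get(namespace_uri, [])
    return [r.href for r in package if r.kind.startswith(kind_prefix)]
```

With this shape, two schema dialects can coexist for one namespace: a
caller asking for "schema" gets both entries, while one asking for
"schema:xml-data" gets only the older dialect.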

Hmm, is it being proposed that the schema should do double duty as
a packaging document as well?  This is conceivable; we'd need to think
about whether it represents optimal design, but there's nothing impossible
on the face of it. -Tim




xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)




