Another look at namespaces

Wed Sep 15 22:39:00 BST 1999

Paul Prescod (paul at prescod.net) on Mon, 30 Aug 1999 08:29:53 -0400 said,

> There are two separate issues here.

Yes.

> #1. In 1999 xHTML can either have three namespaces or one. If it has
> three then there will be a one-to-one relationship between namespaces
> and grammars so that a programmer can know the grammar of the input
> data. Programmers that want to treat the three as one will have to do so
> by issuing some command to their namespace processor or (better) by
> using a namespace processor that recognizes an embedded instruction.

[...]

> I think that people are really concerned more about the precedent than
> today's issue.

Yes. It is a question of how namespaces are used.

I have no passionate personal concern as to whether the XHTML spec defines
3 languages or one, but the HTML 4.0 spec defined three, and XHTML was
required to be as direct a mapping from HTML 4.0 to XML as possible.  You
can dispute (and many do) whether defining both "strict" and "transitional"
is useful (where by "strict" read "http://www.w3.org/TR/.../strict" or
whatever the URI for the strict namespace is).

Aside from that discussion, let's just use Strict and Transitional as
examples of the case where one does have two languages.

It is important to realize that these are *different* languages.
If you take a Transitional document and re-label it as a Strict document
it can be invalid: invalid by specification, whether that specification is
expressed in English, a DTD or a schema.

But equally important is what folks are getting at, I think, when they say
they are the "same" language: that Strict is a *subset* language of
Transitional.  That means that if you take any valid Strict document, and
transform it by changing nothing except the namespace URI from Strict to
Transitional, then it *will* always be a valid document in every sense.

This applies whether you define validity using just a spec or with the help
of a DTD or schema.
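
As a minimal sketch of that re-labelling, assuming Python's standard
xml.etree.ElementTree and two placeholder URIs (STRICT_NS and
TRANSITIONAL_NS are made up for illustration; they are not the real XHTML
namespace URIs), changing nothing except the namespace might look like:

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder URIs, for illustration only.
STRICT_NS = "http://example.org/xhtml/strict"
TRANSITIONAL_NS = "http://example.org/xhtml/transitional"


def relabel_namespace(root, old_ns, new_ns):
    """Move every element named in old_ns into new_ns, in place.

    Only element names are touched; content and attributes are unchanged.
    (Namespace-qualified attributes are not handled in this sketch.)
    """
    old_prefix = "{" + old_ns + "}"
    for elem in root.iter():
        if isinstance(elem.tag, str) and elem.tag.startswith(old_prefix):
            elem.tag = "{" + new_ns + "}" + elem.tag[len(old_prefix):]
    return root


strict_doc = ET.fromstring(
    '<html xmlns="%s"><body><p>Hello</p></body></html>' % STRICT_NS
)
relabel_namespace(strict_doc, STRICT_NS, TRANSITIONAL_NS)
```

Going this way (Strict to Transitional) is safe by the subset property.
Running the same function in the other direction gives no such guarantee.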

> #2. In 2000, 2001, 2002, etc. there will be new versions of XHTML. Some
> (probably all) of these will be backwards incompatible as every version
> of HTML has been backwards incompatible: a document conforming to the
> new vocabulary/grammar can break code expecting the old
> vocabulary/grammar.

Yes, absolutely, leaving aside again the question of XHTML itself and
treating it as an example.

> It is *vital* that a) there be a way to announce this
> backwards-incompatibility and b) there be an infrastructure that allows
> a mapping from new to old. The namespace is the obvious way to do the
> former. We have no good mechanism for the latter.
>
> As I've said, this is also necessary for e-commerce and every other XML
> application.

Yes.

> If we develop this mechanism now then the first wave of XHTML software
> will be automatically ready for XHTML 2.0 (not to mention e-commerce). I
> can understand the wish to delay the problem but it just means that we
> cause a train wreck later on. I am deathly afraid, however, that if we
> set a precedent of pretending that these three variants are "one
> language" we will continue down that path as we develop more and more
> incompatible new versions.

Yes.  We cannot afford to be fuzzy. We must be able to use compatibility
where it exists (easily and simply) but absolutely not assume it where we
have no reason to.

This principle of mandatory extension is something which we tried for
years to get into HTTP because we needed it for putting in new mandatory
features. In HTML originally I specifically stated that any unknown tag
must be processed as though it was replaced by its content.
In other words, for all x not in HTML,

                    a<x>b</x>c --> a b c                    (1)

preserved HTML-validity.   That was a good thing and a bad thing: it
allowed experiments galore.  It also, alas, allowed total ambiguity to
arise when e.g. <table> was introduced by Raggett and Netscape and others
at the same time.   Namespaces fix that directly.   Equally seriously, it
allowed no way of saying "If you don't understand this feature then you
cannot process this document."  This is a very important requirement for
any language IMHO.
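
Rule (1) can be sketched roughly as follows, assuming Python's standard
html.parser module. KNOWN_TAGS is a tiny made-up vocabulary standing in
for "HTML", and strip_unknown is a name invented for this example, not
part of any spec:

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the set of tags the receiver understands.
KNOWN_TAGS = {"p", "em", "b"}


class UnknownTagStripper(HTMLParser):
    """Re-emit markup, replacing each unknown element by its content."""

    def __init__(self, known):
        # convert_charrefs=False so that entities are passed through as-is.
        super().__init__(convert_charrefs=False)
        self.known = known
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.known:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if tag in self.known:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.known:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def strip_unknown(markup, known=KNOWN_TAGS):
    parser = UnknownTagStripper(known)
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


result = strip_unknown("<p>a<blink>b</blink>c</p>")  # "<p>abc</p>"
```

Note what the sketch cannot do: nothing in the input lets the author mark
an unknown tag as mandatory, so the receiver always carries on.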

In this case, the only information which a Transitional-capable receiver
of information needs to be able to process a Strict document is

                     "Strict is-a-subset-of Transitional"

This gem could be
- programmed into an xHTML-specific application or
- picked up from a schema or
- picked up from the document itself

The last case is useful where you have an in-house subset of
a language but you want generic browsers to know that the
language can be processed as xHTML.
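
A sketch of how a receiver might use such facts, wherever they were picked
up from. The URIs and the SUBSET_OF table are invented for illustration;
the only thing relied on is that is-a-subset-of is transitive:

```python
# Hypothetical namespace URIs, for illustration only.
STRICT = "http://example.org/xhtml/strict"
TRANSITIONAL = "http://example.org/xhtml/transitional"
INHOUSE = "http://example.org/acme/memo"

# Each entry reads: key is-a-subset-of every namespace in the value.
SUBSET_OF = {
    STRICT: {TRANSITIONAL},
    INHOUSE: {STRICT},
}


def can_process(doc_ns, understood, subset_of):
    """True if doc_ns is understood, or is (transitively) a subset of
    some namespace the receiver understands."""
    seen = set()
    stack = [doc_ns]
    while stack:
        ns = stack.pop()
        if ns in understood:
            return True
        if ns in seen:
            continue
        seen.add(ns)
        stack.extend(subset_of.get(ns, ()))
    return False


# A generic Transitional-capable browser meets an in-house document.
browser_ok = can_process(INHOUSE, {TRANSITIONAL}, SUBSET_OF)
# A Strict-only tool meets a Transitional document.
strict_tool_ok = can_process(TRANSITIONAL, {STRICT}, SUBSET_OF)
```

The relation only runs one way: browser_ok comes out true, while
strict_tool_ok comes out false.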

As you say, this is really important for version change.
"v2 is a subset of v1" is a rare case in which any v2 document
can be interpreted as a v1 document.

What happens when a version one program meets a version 2 document?

a) Halt. "Don't understand".  This happens with typical word processors.
   Sometimes this is the behavior you need.
b) Know that it is a subset of what you know, and process it.
c) Parse it and ignore new features. This can be done by
   using a separate namespace for the new features, and by
   saying in some way (which we really need) that it is ok
   to ignore (replace with contents, or remove: state which)
   elements in that namespace for the purposes of that document.
d) Convert it into a v1 document.

[(b) and (c) are actually special cases of (d)].
In the last case, the application only innately "understands" the v1
language, but the schema includes [pointers to] rules which
allow a valid v1 document to be deduced from the valid v2
document.  These are rules e.g. of  the form (1), or
a set of "Optional/mandatory" flags for the elements.
Or maybe an XSLT mapping.  Or a set of inference rules
for RDF.

(d) requires that the application-specific processor have
at its disposal a nonspecific translation processor of some
sort.
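
Option (c) is the easiest of these to sketch. The following assumes
Python's standard xml.etree.ElementTree; the function name drop_namespace,
the mode flag and the urn:example URIs are all invented for this example.
How a document would actually declare a namespace ignorable is exactly the
missing piece, so here it is simply passed in as an argument:

```python
import xml.etree.ElementTree as ET


def _append_text(parent, index, text):
    """Attach text immediately before child position `index` of parent."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        prev = parent[index - 1]
        prev.tail = (prev.tail or "") + text


def drop_namespace(parent, ns, mode="unwrap"):
    """Ignore descendant elements in namespace ns, in place.

    mode="unwrap": replace each such element with its contents.
    mode="remove": delete each such element together with its contents.
    The element passed in as `parent` is itself never dropped.
    """
    prefix = "{" + ns + "}"
    i = 0
    while i < len(parent):
        child = parent[i]
        if isinstance(child.tag, str) and child.tag.startswith(prefix):
            unwrap = mode == "unwrap"
            kids = list(child) if unwrap else []
            lead = child.text if unwrap else None
            tail = child.tail
            del parent[i]
            _append_text(parent, i, lead)
            for offset, kid in enumerate(kids):
                parent.insert(i + offset, kid)
            _append_text(parent, i + len(kids), tail)
            # Do not advance: promoted children are examined next.
        else:
            drop_namespace(child, ns, mode)
            i += 1
    return parent


SOURCE = ('<p xmlns="urn:example:v1" xmlns:n="urn:example:v2">'
          'a<n:x>b</n:x>c</p>')

unwrapped = drop_namespace(ET.fromstring(SOURCE), "urn:example:v2", "unwrap")
removed = drop_namespace(ET.fromstring(SOURCE), "urn:example:v2", "remove")
```

With mode="unwrap" this is rule (1) restricted to a single namespace; with
mode="remove" the content of the new feature is dropped as well.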

I would like to see a simple way of defining subset languages
introduced with all speed, to allow (a) and (b) for all
XML applications.  I'd like to see experiments with (c) and (d)
too, in particular an early way to define optionalness of
namespaces in XML in a particular document.

Underlying this is a philosophy about the meaning of a document.
As I see it, a document is from the author to the reader, and its job
is to convey the intent of the author. When an author uses a
defined language, then the author refers to the language-author's
definition of terms in the language. When I write you an XML
document, then what I mean is what I say, interpreted according
to the specifications of the language - the namespace - I use.

Of course on the web we don't expect machines to "understand"
information in an AI way. We use a version of "understand"
which means to be able to translate it into something which it
has innately been programmed to process, while preserving
in that translation the intent of the author.  This may seem a heavy way
of defining HTML, but think about the international exchange of invoices.
The web of meaning between languages is defined by the
set of these transformations.

The namespaces spec was adamant that you could use namespaces
without having to dereference the namespace URI.
However, as we define languages for talking about languages
(XML and RDF schemas for example, even style sheets)
the document corresponding to the namespace URI becomes
the place where the namespace-author can put *definitive*
information about the intent of the namespace.
And this is not mandatory - but is very useful!

For example, you can run an xHTML document through any
DTD you like if it suits your purposes, but if you want to
check whether it is valid xHTML then you should use
the xHTML schema which corresponds to the namespace URI.
You might of course have a local copy and not actually
need to go onto the net to use live HTTP.

The exciting use of namespaces in schemas will be when we
have schema documents which contain the syntactic
constraints in xml-schema language, but themselves
use other namespaces to express the
entity-relationship model of data (rdf-schema),
legally appropriate presentation (link to style sheet),
financial implication (link to mapping to FSTC
echecks or whatever).  The functionality which we
will be able to build into new languages will grow
with the richness of the languages to which we can map.

>  Paul Prescod

Tim Berners-Lee
not in any official capacity.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1