Namespaces hate validation!

Wed Jan 6 18:27:07 GMT 1999

Murray Maloney wrote:

> On what basis can you claim that IBTWSH is a *true namespace*
> while also stating that it does not use prefixes AND asserting
> that it is compatible with "Namespaces in XML"?

By convention, the IBTWSH parts of documents use the namespace-default
mechanism so that IBTWSH document fragments can be passed straight
to HTML renderers.  This is accomplished by giving the IBTWSH
container element (whose element type depends on the enclosing
DTD) an "xmlns" attribute with value "".

> Well, based on my own experience and discussions with some of
> the world's most prolific and well-respected DTD designers,
> I have to say that it is a lot harder to design a DTD that
> combines two DTDs, especially if you are compelled to use
> "Namespaces in XML".

So you think using a variant of the CALS table model in HTML,
or the HTML 4.0 table model in xmlspec, made things harder,
not easier, for the designer of those DTDs?

> Too true. The point is that there is no such thing in XML
> as a "global attribute". Therefore, there should not be
> "global attributes" in "Namespaces in XML".

Your reasoning is bad.  'There is no such thing as a namespace
in XML 1.0; therefore there should be no namespaces in "Namespaces
in XML"' would be equivalent.  Global attributes are a conventional
use of the legal character ":" in attribute names; that's all.

> Tim's algorithm should
> deal with them somehow. In other words, the algorithm is
> not complete in this respect (and others).

Tim's algorithm does deal with them.  Prefixes on global attributes
are handled just like prefixes on elements.  It's the *absence*
of a prefix that's handled differently for elements and attributes:
for elements, an absent prefix means "use the current URI mapped to the
null prefix"; for attributes, it means "use the URI of the current
element".
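
The two rules can be sketched in a few lines of Python.  This is only
an illustration of the resolution as I have just described it, not code
taken from Tim's algorithm: the function names, the prefix_map
dictionary (with "" standing for the null prefix), and the example.org
URIs are all made up for the sketch.

```python
def resolve_element(qname, prefix_map):
    """Resolve an element type name to a (URI, localname) pair."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        # No prefix: use the URI currently mapped to the null prefix
        # (None if no such mapping is in scope).
        return (prefix_map.get(""), qname)
    return (prefix_map[prefix], local)


def resolve_attribute(qname, prefix_map, element_uri):
    """Resolve an attribute name to a (URI, localname) pair."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        # No prefix: use the URI of the element the attribute sits on.
        return (element_uri, qname)
    # A prefixed (global) attribute is handled just like an element.
    return (prefix_map[prefix], local)


# Hypothetical prefix-to-URI map in scope at some element.
prefix_map = {"": "http://example.org/default", "x": "http://example.org/x"}

element = resolve_element("x:table", prefix_map)
# element == ("http://example.org/x", "table")
plain = resolve_attribute("border", prefix_map, element[0])
# plain == ("http://example.org/x", "border")
```

Only the no-prefix branch differs between the two functions; the
prefixed branch is the same lookup in both.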

> Ok, so we never actually validate the received document instance.
> Instead, we end up validating a derived document.

Just so.  We preprocess the DTD of the namespaced document, and
we preprocess the instance, to create a derived DTD and a derived
instance for which ordinary SGML/XML validation will work.
(We also need some outside information to give us the
prefix-to-URI map for the DTD.)

I don't say this is clean, and I protested against it mightily.
But it does *work*.

> So, I have to run several processes on the document that
> I am trying to validate before I actually get to test it
> against a DTD?

Well, one process on the document and another on the DTD.

> And these processes, presumably, are guaranteed
> to produce a DTD against which the instance is valid, right?

Not at all.  DTD conversion does not look at the instance,
and instance conversion does not look at the DTD.  The two derived
entities are not joined together until the final (ordinary) validation
is done, which may succeed or fail.

The only thing necessary is to make sure that the two processes
generate the same unique prefixes.  This could be done, e.g.,
by generating a CRC of the URI, converting to hex, and using
*that* as the prefix.  The original prefixes appearing in the
document are *ex hypothesi* worth nothing.
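
A minimal sketch of that idea in Python, using the CRC-32 from the
standard zlib module.  The function name and the leading "n" are my own
assumptions; the letter is there only because an XML name cannot begin
with a digit, and a hex CRC might.

```python
import zlib


def derived_prefix(uri):
    """Derive a prefix from a namespace URI alone.

    Both the DTD converter and the instance converter can call this
    independently and arrive at the same prefix for the same URI.
    """
    crc = zlib.crc32(uri.encode("utf-8")) & 0xFFFFFFFF
    # "n" + eight hex digits; the "n" is a hypothetical choice that
    # keeps the result a legal XML name.
    return "n" + format(crc, "08x")


# The URI is a made-up example.
prefix = derived_prefix("http://example.org/ibtwsh")
```

A 32-bit CRC is not guaranteed to be unique across URIs, so a real
tool would need to detect collisions or use a longer digest.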

> So I am never going to encounter a document that isn't valid
> -- ultimately -- within this process?

Not so.  The final validation may fail for the same reasons any
validation may fail.

> What's the point of that?

For one thing, a namespace-aware processor, given
the derived instance, will treat it exactly as it would treat the
original instance, provided that the processor does not depend
on the presence or absence of "xmlns:*" or "xmlns" attributes
in particular elements.  In other words, the element types and
attribute names will be resolved to the same (URI, localname) pairs.
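
To make the instance half concrete, here is a rough sketch using
Python's xml.etree.ElementTree.  It is an assumption-laden illustration,
not the actual preprocessor: convert_instance, rewrite_name, and the
CRC-based derived_prefix are hypothetical names, and unprefixed
attributes are simply left alone.  The DTD half is not shown.

```python
import xml.etree.ElementTree as ET
import zlib


def derived_prefix(uri):
    # Hypothetical scheme: "n" + CRC-32 of the URI in hex.
    return "n" + format(zlib.crc32(uri.encode("utf-8")) & 0xFFFFFFFF, "08x")


def rewrite_name(name, seen):
    """Turn ElementTree's "{uri}local" form into "prefix:local"."""
    if not name.startswith("{"):
        return name  # no namespace attached; leave untouched
    uri, local = name[1:].split("}", 1)
    prefix = derived_prefix(uri)
    seen[prefix] = uri  # remember which declarations are needed
    return prefix + ":" + local


def convert_instance(xml_text):
    """Return a derived instance whose prefixes depend only on the URIs."""
    root = ET.fromstring(xml_text)
    seen = {}
    for elem in root.iter():
        elem.tag = rewrite_name(elem.tag, seen)
        attrs = list(elem.attrib.items())
        elem.attrib.clear()
        for key, value in attrs:
            elem.set(rewrite_name(key, seen), value)
    # Declare every derived prefix once, on the root element.
    for prefix, uri in seen.items():
        root.set("xmlns:" + prefix, uri)
    return ET.tostring(root, encoding="unicode")


# Made-up document mixing a default namespace and a prefixed one.
source = ('<doc xmlns="http://example.org/a" xmlns:b="http://example.org/b">'
          '<b:item b:id="1" kind="x">text</b:item></doc>')
derived = convert_instance(source)
```

Parsing source and derived with a namespace-aware parser yields the
same names for every element and attribute, even though no original
prefix or "xmlns" declaration survives in the derived text.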

> So, perhaps we should not design facilities for which it is
> so hard to write tools -- especially when those facilities
> break compatibility with XML validity.

It's not so hard; it's just that code to parse DTDs is currently
embedded in full XML parsers, and I would like it to be broken
out so that I can get access to DTD information.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan at ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/