Layers, again (was Re: fixing (just) namespaces and validation)

Simon St.Laurent simonstl at simonstl.com
Wed Sep 8 17:04:20 BST 1999


Amazing what proposing to fix namespaces on this list can bring out.
David's presented an astounding set of assumptions that I've never heard
voiced so explicitly before.  In many ways, however, I think these
assumptions underlie many of the _problems_ in XML, not the promise of XML.

This set of assumptions also contradicts the quote I began my proposal with
from James Clark's XML Namespaces document
(http://www.jclark.com/xml/xmlns.htm), which complains about missing parts
in DTDs rather than any layering problems. It also doesn't match the Note
Web Architecture: Extensible Languages
(http://www.w3.org/TR/NOTE-webarch-extlang), and only sort of fits the
picture (not so much the text) in the Note RDF Architecture
(http://www.w3.org/TR/NOTE-rdfarch). 

At 08:31 AM 9/8/99 -0400, David Megginson wrote:
>David Carlisle writes:
>
> > Yes, agreed, it wasn't really a criticism. The fact remains that at
> > the current time the `problem' is that there is no standard way of
> > getting from one layer to the other.
>
>Sure there is -- at least, the Namespaces REC defines pretty clearly
>what Namespace declarations and prefixes do.

I think you've read considerably more into the Namespaces REC than I have,
as far as _when_ that 'namespace doing' takes place.  While it does discuss the
appearance of qualified names in DTDs and makes certain comments regarding
the non-reliability of attribute defaulting in non-validating parsers, it
doesn't go further.  It doesn't specify explicitly that Namespace
processing is performed as a layer between the application and the parser,
or that all parser operation must be completed before namespace processing
begins.

More on this below, as we get into the details of this 'layering'.

> > That is, if I have a namespace aware application that really
> > doesn't mind what prefix is used in a document instance, there is
> > no convenient standard way of supplying a DTD against which a set
> > of documents to be used with that application may be validated.
>
>But that's not a problem of getting from one layer to another; it's
>simply a problem of applying an operation to a layer.  Here's one
>layered view:
>
>
>Layer 1: octets
>Validate with: (custom code)
>
>Layer 2: Unicode characters
>Validate with: (regular expression)
>
>Layer 3: XML
>Validate with: DTD
>
>Layer 4: Namespaces
>Validate with: (XML Schemas, eventually)
>
>Layer 5: RDF
>Validate with: RDF schema
>
>Layer 6: Application
>Validate with: (local business rules)
>
>
>Here's another layered view:
>
>
>Layer 1: octets
>Validate with: (custom code)
>
>Layer 2: Unicode characters
>Validate with: (regular expression)
>
>Layer 3: XML
>Validate with: DTD
>
>Layer 4: Namespaces
>Validate with: (XML Schemas, eventually)
>
>Layer 5: XHTML
>Validate with: (built-in XHTML processing rules)
>
>Layer 6: Application
>Validate with: (local business rules)
>
>
>My applications have no problem at all getting from layer 3 to layer 4
>in either example, because the path is fairly well defined; it just
>happens that there is also a convenient schema formats for applying
>structural validation or guided authoring to layer 3, but that's a
>separate operation applied to the layer, not part of the layer
>itself.  Many layers do not have a standard validation technique yet.

The problem in both of these examples is that you treat XML itself as
monolithic, and DTD validation as a tool that can only be used at the time
of parsing.  As a result, we have multiple levels of checking that have to
be redundant if they're done at all.  Check against schemas, DTDs, _and_
RDF? And then throw application rules on top of that? Forget it.  These
'layers' are pretty much a guarantee that developers either need to make an
investment in large quantities of aspirin - or pick one tool and stick to it.

If I thought that schemas would be here soon, or that RDF really was the
answer to all of these, I wouldn't be pushing on DTD validation.  DTDs do
seem to be a good answer - in the short term for many projects, in the long
term for a subset of projects - to the need for structural checking.  It
doesn't seem that ridiculous to want to 'validate' the results of a
transformation (generated via XSL or the DOM) or to want to 'validate' a
document against a DTD structure while taking into account namespaces.
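
To make the second of those concrete, here is a rough sketch in Python of a
structural check that runs after parsing and after namespace resolution, so
the prefix used in the instance doesn't matter.  The namespace URI, the
content model table, and the names q and check_structure are all invented
for illustration, and the table is a crude stand-in for a DTD content model
(exact child sequences only), not DTD validation as XML 1.0 defines it.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, made up for this example.
NS = "http://example.com/ns/order"

def q(local):
    """Build the expanded name ElementTree uses: {namespace-uri}local-name."""
    return "{%s}%s" % (NS, local)

# Invented content model: each element maps to the exact child sequence allowed.
CONTENT_MODEL = {
    q("order"): [q("item"), q("total")],
    q("item"): [],
    q("total"): [],
}

def check_structure(element):
    """Return a list of error strings; an empty list means the tree fits."""
    expected = CONTENT_MODEL.get(element.tag)
    if expected is None:
        return ["undeclared element %s" % element.tag]
    errors = []
    actual = [child.tag for child in element]
    if actual != expected:
        errors.append("%s: expected %s, found %s" % (element.tag, expected, actual))
    for child in element:
        errors.extend(check_structure(child))
    return errors

# Same structure, different prefixes: the check looks only at expanded names.
doc_a = '<o:order xmlns:o="http://example.com/ns/order"><o:item/><o:total/></o:order>'
doc_b = '<order xmlns="http://example.com/ns/order"><item/><total/></order>'
# Missing the item element, so the structure is wrong whatever the prefix.
doc_c = '<o:order xmlns:o="http://example.com/ns/order"><o:total/></o:order>'

for doc in (doc_a, doc_b, doc_c):
    print(check_structure(ET.fromstring(doc)))
```

The first two documents come back with no errors even though one uses a
prefix and the other a default namespace; the third reports a single
problem on the order element.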

Because XML 1.0 was written so that everything from character checking to
entity replacement to attribute defaulting to structural inspections (DTD
and otherwise) is all performed by one monolithic 'parser', we haven't
been able to describe XML processors with any level of granularity.  When I
talk about layers (for instance, in
http://www.simonstl.com/articles/layering/layered.htm), it's layering for
the sake of breaking things into the smallest usable components, not for
the sake of piling on more and more mostly redundant processing.  Your
layer 3 is way too thick.
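
As a rough illustration of what I mean by small components, here is a sketch
in Python where decoding, parsing, and structural checking are separate
stages that can be chained or used on their own.  Every name here (decode,
parse, required_children, pipeline) and the memo vocabulary are assumptions
made up for the example, and the check is far weaker than a DTD; the point
is only that the same check can run on a freshly parsed document or on a
tree built in memory.

```python
import xml.etree.ElementTree as ET

def decode(octets, encoding="utf-8"):
    """Stage: octets to characters."""
    return octets.decode(encoding)

def parse(text):
    """Stage: characters to an element tree (well-formedness only)."""
    return ET.fromstring(text)

def required_children(required):
    """Build a checking stage; `required` maps a tag to child tags it must have."""
    def stage(root):
        for element in root.iter():
            present = {child.tag for child in element}
            missing = [t for t in required.get(element.tag, []) if t not in present]
            if missing:
                raise ValueError("%s is missing %s" % (element.tag, missing))
        return root
    return stage

def pipeline(*stages):
    """Chain stages so each one receives the previous stage's result."""
    def run(value):
        for stage in stages:
            value = stage(value)
        return value
    return run

check = required_children({"memo": ["to", "body"]})

# The check as the last stage of ordinary parsing...
from_octets = pipeline(decode, parse, check)
tree = from_octets(b"<memo><to/><body/></memo>")

# ...and the very same check applied to a tree that never went through a parser.
built = ET.Element("memo")
ET.SubElement(built, "to")
problem = None
try:
    check(built)
except ValueError as exc:
    problem = str(exc)
print(tree.tag, problem)
```

Nothing about the checking stage depends on where the tree came from, which
is the property a monolithic parser can't offer.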

If we treat validation as a process with its own life, outside of the Rube
Goldberg machine known as an XML processor, we might be able to solve a lot
of problems that currently look very difficult much more simply.
Namespaces included.

Simon St.Laurent
XML: A Primer (2nd Ed - September)
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)