schema v. validation spec, content models considered bad (was Re: XHTML & Schemas)

Rick Jelliffe ricko at allette.com.au
Fri Sep 3 07:14:13 BST 1999


From: David Megginson <david at megginson.com>

>I know that it's the wrong thing to have the document point to its
>stylesheet, and right now, I'm leaning towards considering a schema as
>a specialized stylesheet (or a stylesheet as a kind of schema).

This comes down to the difference between a schema and a validation
specification.

The latter is a function that is applied to a document: some useful
kinds of validation languages can be implemented using stylesheet
languages.

A schema language can be more than a validation specification: notably
it can give rules for building new documents.  These rules could
potentially involve many more issues.  Validation is sometimes regarded
as returning a truth value (though in real life validators return
information about errant tags, so validation in practice is not a mere
function returning a truth value), but a schema language could include
more information about how to build the document: for example, that
certain subelements should be automatically inserted by an editing tool
when the user inserts an element.  This kind of building-schema may be
closer to what a forms language does, but it is schematic information
that does not have anything to do with validation.
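
To make the distinction concrete, here is a minimal sketch in Python.
The element names (list, item, para) and both rule tables are invented
for illustration; they do not come from any real DTD or schema language.
The validation side returns a report about errant tags rather than a
bare truth value, and the building side carries information that no
validator would ever consult.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules, invented for this sketch only.
ALLOWED_CHILDREN = {"list": {"item"}, "item": set()}  # validation side
AUTO_INSERT = {"list": ["item"]}                      # building side


def validate(root):
    """Return a list of error messages; an empty list means 'valid'."""
    errors = []
    for parent in root.iter():
        # Elements with no entry are treated as allowing no children.
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                errors.append(f"<{child.tag}> not allowed in <{parent.tag}>")
    return errors


def new_element(tag):
    """Create an element plus the subelements an editor would pre-insert."""
    elem = ET.Element(tag)
    for child_tag in AUTO_INSERT.get(tag, []):
        elem.append(new_element(child_tag))
    return elem
```

Calling validate() on a parsed document gives the list of complaints,
while new_element("list") hands an editing tool a list element that
already contains one item.  The second table is schematic information,
yet it plays no part in deciding validity.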

(I think my email on this was lost this week so I'll repeat:)  this
XHTML namespace issue is exactly the same one that I was trying to point
out in that "XML Namespaces are dead" thread: to tie a namespace to a
schema puts the cart before the horse.  A name is different from a use
of the name; a schema is just one use of a name.  On the WWW, the
application with the highest claim to attach a schema to a namespace is
the browser and not a schema tool.  For new, independent, controlled
schemas, this is not so critical; but the more a language has variants
and subsets and additional requirements imposed by various
implementations, the less a namespace URI can invoke a particular
schema.

When we have an evolving family of variants (both the DTDs of HTML and
the implicit DTDs of each of the tools that accept HTML documents), it
is an incorrect analysis to think that these DTDs (or the namespace URIs
for strict, transitional, etc.) represent schemas in the general sense.
They merely represent the tightest validation specifications possible
for the document.  They do not show what schema the user had in mind
when creating a document, or what schema future users should use.  Look
at the "tidy" tool: it makes this process explicit.  The DTD given is
not the schema used to build the document but merely the strictest
possible validation specification.
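
As a rough sketch of what "tightest validation specification" means
(this is not how tidy is implemented, and the variant names and element
sets below are made up for illustration), picking a DTD after the fact
amounts to walking the variants from strictest to loosest and stopping
at the first one the document happens to satisfy:

```python
# Hypothetical variants, ordered strictest first.  Each maps a label to
# the set of element names that variant permits.
VARIANTS = [
    ("strict", {"p", "em"}),
    ("transitional", {"p", "em", "font"}),
]


def tightest_variant(tags_used):
    """Return the label of the strictest variant that permits every tag."""
    for name, permitted in VARIANTS:
        if set(tags_used) <= permitted:
            return name
    return None  # no known variant accepts this document
```

The label that comes back says only which specification the finished
document passes.  It says nothing about which rules the author was
following while writing it.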

There is another issue here too, b.t.w.: a lot of this problem comes
from the idea that a schema involves a content model, that is, that the
parent determines the children.  A "parent model" paradigm would resolve
this issue to a large extent; SGML/XML is too reliant on simple
automata/grammar theory in this.  The "content model" works against
extensibility; a "parent model" allows extensibility: it would say "the
element myhtml:blink is allowed in html:p elements" without having to
rewrite the DTD for html:p.
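
A minimal sketch of the two paradigms side by side, in Python.  Only
myhtml:blink and html:p come from the example above; html:em and
html:strong are assumed here just to give the content model something
to list, and the table layout is one possible encoding, not a proposal
for syntax.

```python
# Content model: each parent lists the children it accepts.
CONTENT_MODEL = {
    "html:p": {"html:em", "html:strong"},
}

# Parent model: each child lists the parents it may appear in.
# An extension adds an entry here and leaves CONTENT_MODEL untouched.
PARENT_MODEL = {
    "myhtml:blink": {"html:p"},
}


def is_allowed(parent, child):
    """A child is allowed if either side of the relationship permits it."""
    chosen_by_parent = child in CONTENT_MODEL.get(parent, set())
    chosen_by_child = parent in PARENT_MODEL.get(child, set())
    return chosen_by_parent or chosen_by_child
```

With this, is_allowed("html:p", "myhtml:blink") holds even though the
declaration for html:p never mentions myhtml:blink: the child has
chosen its parent.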

In other words, IMHO some of the namespace problems that we can see with
XHTML flow from the content-model paradigm of DTDs (and XML Schemas).
Many people have found DTDs inappropriate for some kinds of jobs (Tim
Bray has been very consistent in commenting on this) but the schema
proposals have missed the mark: the problem will not be relieved by
using instance syntax or sugar-coating architectural forms (as
archetypes) or adding inheritance or open/closed content models
(excellent though all these may be).  Extensibility requires that
sometimes a child must be able to choose its parent, just as much as a
parent can choose its child.

Rick Jelliffe


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)