ATTN: Please comment on XHTML (before it's too late)

Paul Prescod paul at prescod.net
Sun Aug 29 21:26:46 BST 1999


David Megginson wrote:
> 
> For those of you who haven't noticed, XHTML has gone to Proposed
> Recommendation (PR) status at the W3C:
> 
>   http://www.w3.org/TR/xhtml1
> 
> Unlike the last XHTML Working Draft, this PR has reverted to defining
> *three* separate XHTML Namespace URIs (transitional, strict, and
> frameset) with the threat of more HTML Namespaces in the future.

I've changed my mind on this issue since this morning (good thing I
didn't post then).

Names name things. The namespaces spec goes out of its way to be vague
about the exact nature of the referents (which may or may not be a good
thing). But someone, somewhere, must decide what the referents really
are, at least for some particular problem domain.

--

In XHTML the referents are conceptual objects known as element types.
Element types have the following properties:

 * names
 * lists of attributes and attribute types
 * content models
 * semantics
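
As a rough illustration of the four properties above, here is a minimal
sketch in Python. The class name ElementType, its field names, and the
sample values are all hypothetical, and the content model is simplified
to a flat set of permitted child names (real DTD content models are more
expressive).

```python
from dataclasses import dataclass


@dataclass
class ElementType:
    """Illustrative model of an element type; not taken from any spec."""
    name: str
    attributes: dict          # attribute name -> attribute type
    content_model: frozenset  # permitted child element names (simplified)
    semantics: str            # informal description of meaning


# Two toy element types that share a name but differ in other properties.
strict_p = ElementType(
    name="p",
    attributes={"class": "CDATA"},
    content_model=frozenset({"em", "strong"}),
    semantics="paragraph",
)
loose_p = ElementType(
    name="p",
    attributes={"class": "CDATA", "align": "(left|center|right|justify)"},
    content_model=frozenset({"em", "strong", "font"}),
    semantics="paragraph",
)

same_name = strict_p.name == loose_p.name  # the names agree
same_type = strict_p == loose_p            # the element types do not
```

The point of the sketch is only that agreeing on the name does not
settle whether two referents are the same element type.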

It is extremely rare that you can make a stylesheet, query, computer
program or other process without caring about all four of these things.
If you make a stylesheet based on the HTML 4.0 strict DTD and HTML 5.0
strict allows a different content model, then your stylesheet may very
well crash. HTML strict documents may not be compatible with your HTML
loose stylesheet. For instance, if your stylesheet is not explicitly 

We all intuitively know that for most purposes HTML 4.0 strict and HTML
4.0 loose are the same thing. This can be inferred in two ways:

 #1. The stylesheet (/program/query/...) creator might "just know" that
their stylesheet works for both. If they "just know" then there should
be some way for them to state that knowledge in their stylesheet
(/program/query/...).

 #2. It might be a universal invariant that every document that
conforms to HTML 4.0 strict also conforms to HTML 4.0 loose. In this
case it can be safely "cast" as you might cast an unsigned integer to
a signed integer or an integer to a float.
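
The "safe cast" in #2 can be sketched as a subset check. This is a toy
model, assuming each DTD is reduced to a mapping from element name to
the set of children it permits; the function name is_safe_upcast and the
sample data are hypothetical and do not reproduce the real HTML 4.0
DTDs.

```python
def is_safe_upcast(narrow, wide):
    """Return True if everything `narrow` permits is also permitted by `wide`.

    Both arguments map an element name to the set of child element names
    allowed inside it (a deliberate simplification of content models).
    """
    return all(
        name in wide and children <= wide[name]
        for name, children in narrow.items()
    )


# Toy stand-ins for the strict and loose DTDs.
strict_dtd = {"body": {"p", "div"}, "p": {"em", "strong"}}
loose_dtd = {
    "body": {"p", "div", "center"},
    "p": {"em", "strong", "font"},
    "center": {"p"},
}

strict_to_loose = is_safe_upcast(strict_dtd, loose_dtd)
loose_to_strict = is_safe_upcast(loose_dtd, strict_dtd)
```

With this toy data the check holds in the strict-to-loose direction
only, which is the asymmetry the integer-to-float analogy describes.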

The right place to state this second type of knowledge is not clear to
me -- in the schema? In some sort of namespace declaration? In a shared
database? In the schema seems simplest...

There is a third way but we are still a ways from implementing it: it
might be known that there is a *transformation* that can turn every
document conforming to HTML loose into a reasonable HTML strict document
(e.g. wrap all text nodes in <P> nodes). Then for all intents and
purposes HTML loose is "as good as" HTML strict from a program's point of
view.
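
A minimal sketch of such a transformation, using Python's standard
xml.etree.ElementTree. The function name wrap_bare_text and the sample
document are hypothetical, and the sketch only handles text sitting
directly inside the element it is given; it does not group inline
elements with adjacent text, so it is far from a real loose-to-strict
converter.

```python
import xml.etree.ElementTree as ET


def wrap_bare_text(body):
    """Wrap text that sits directly inside `body` in new <p> elements."""
    new_children = []
    # Text before the first child lives in body.text.
    if body.text and body.text.strip():
        p = ET.Element("p")
        p.text = body.text.strip()
        new_children.append(p)
    body.text = None
    # Text after each child lives in that child's .tail.
    for child in list(body):
        tail = child.tail
        child.tail = None
        new_children.append(child)
        if tail and tail.strip():
            p = ET.Element("p")
            p.text = tail.strip()
            new_children.append(p)
    body[:] = new_children
    return body


loose_doc = ET.fromstring("<body>Hello<hr/>world<p>already wrapped</p></body>")
wrap_bare_text(loose_doc)
child_tags = [child.tag for child in loose_doc]
```

After the call, the two bare text runs have become <p> children and the
existing <p> is left alone.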

 Paul Prescod

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)