david-b at pacbell.net
Wed Sep 1 18:47:58 BST 1999
Paul Prescod wrote:
> David Brownell wrote:
> > > Second, note that an XHTML browser *does* need to worry about what
> > > variant of HTML was used. The browser must decide which of its implicit
> > > stylesheets to apply. Each stylesheet has hard-coded knowledge of
> > > content models embedded within it. With HTML this is not a big deal
> > > because we have become used to presuming that all HTML will be "loose".
> > Perhaps you could clarify this for me: why would an XHTML content
> > model imply a stylesheet? The need doesn't naturally follow; maybe I
> > missed something in one of the hundreds of earlier messages! Code
> > handling the "frameset" model handles "transitional" and "strict" as
> > simple subsets -- right?
> But not vice versa. A program written for strict will not work with a
> frameset or transitional document.
There's a notion lurking here that I've not heard before. Namely
that this artifact (too many namespace URIs :-) may be a nudge to
actively discourage people from writing software that handles any
part of XHTML other than the "strict" bits.
Otherwise, people would normally do what they do now: build software
that accepts the whole "frameset" package. After all, most HTML out
there is close to that ruleset ... it's got frames, HTML-3.2isms,
and so on. It's the "strict" HTML-4.0isms (thead, tbody, etc) that
have negligible real-world support.
(Yes, the need for everyone to _speculate_ about why the heck the
W3C suddenly did a volte-face on this issue is galling. You'd
really expect that such a change would have a visible rationale.)
> If the validation layer sees two things as distinct then any layer built
> on top of the validation layer should continue to see them as distinct.
> To me that is a no-brainer. In this case the validation layer sees
> htmlloose:P and htmlstrict:P as completely different documents. To the
> validator they are as distinct as Docbook:article and xsl:stylesheet!
I think other people disagreed with this too. The XML 1.0 spec says
that validation only causes reporting of "validity" errors, and does
not (to my reading) imply any "seeing as distinct". (That gets into
those discussions of identity again -- let's avoid learning how fine
it's possible to slice distinctions ... ;-)
So, to me it's a no-brainer that validation is only a pass/fail, and
is otherwise invisible. If I have some markup that follows the HTML
"strict" rules, it validates against all three DTDs ... it really
doesn't matter which one the application thinks was used. It passes
validation by all three rulesets, and that's all that matters.
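
A minimal sketch of that pass/fail view, assuming each ruleset is reduced
to a set of allowed element names. That is a deliberate simplification:
real DTDs also constrain nesting and attributes, and the sets below are
illustrative, not the actual DTD contents. The names RULESETS, validate
and report are hypothetical.

```python
# Toy rulesets: each one is just a set of allowed element names.
# Illustrative only; these are NOT the real XHTML DTD contents.
STRICT = {"html", "head", "title", "body", "p",
          "table", "thead", "tbody", "tr", "td"}
TRANSITIONAL = STRICT | {"font", "center"}
FRAMESET = TRANSITIONAL | {"frameset", "frame", "noframes"}

RULESETS = {
    "strict": STRICT,
    "transitional": TRANSITIONAL,
    "frameset": FRAMESET,
}


def validate(element_names, allowed):
    """Pass/fail only: True if every element used is in the ruleset."""
    return set(element_names) <= allowed


def report(element_names):
    """Run the same document past every ruleset and collect the verdicts."""
    return {name: validate(element_names, allowed)
            for name, allowed in RULESETS.items()}


strict_doc = ["html", "head", "title", "body", "p"]
loose_doc = ["html", "head", "title", "body", "center", "font"]

print(report(strict_doc))
print(report(loose_doc))
```

Under these toy sets the "strict" document passes all three checks, while
the one using center and font fails only the "strict" check. In both cases
the validator hands back nothing but a verdict.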
> > > In general, as a model, we should adhere to your model strictly: the
> > > browser should see only level 2. Level 2 validation should be driven by
> > > the unique names in level 1.
> > Now you're bringing validation into this.
> That's because validation is vital. It's central to the whole thing. An
> application author can only write code that assumes the data has been
> validated and has passed. If you presume that any mix of HTML tags is
> equally likely to appear in the input then you are back to the legacy
> HTML tag soup.
I didn't assume "tag soup" at all -- I was commenting on the way you were
bringing in the notion, namely assuming clients did it. Which appears
not to have been what you intended to imply.
To the extent that I assumed anything, it was what we know from the
(X)HTML specification: everything at least conforms to the "frameset"
rules. That's a clearly defined set of rules, and many applications
know how to deal with it.
> Therefore if the validator sees two elements as distinct then the
> application needs to know that so it can know what input structure it
> should expect.
But validating processors just report more errors than non-validating
ones, so I don't see a place for this "sees two elements as distinct"
notion you keep repeating.
> > I really don't see why the parse tree would need to expose content
> > models, particularly in the case of the render-only application you
> > describe.
> How can you write a stylesheet that makes no assumptions about the
> structure and hierarchy of the data? It's impossible.
Go back to the example I keep using: the stylesheet knows the "frameset"
rules. It has _that_ set of assumptions about structure and hierarchy.
The "frameset" rules match most folks' understanding of what HTML is,
in any case -- forcing a lesser set of rules would be confusing.
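
As a rough sketch of that idea, assume a renderer with one table of rules
written against the widest vocabulary. The RENDER table and the render
function are hypothetical, and the three rules stand in for a real
stylesheet. A document that uses only the "strict" subset simply never
triggers the extra rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical rendering rules keyed by element name, written once
# against the widest ("frameset") vocabulary.
RENDER = {
    "p": lambda text: text + "\n",
    "center": lambda text: text.center(20) + "\n",
    "frame": lambda text: "[frame]\n",
}


def render(xml_text):
    """Walk the tree in document order and apply whichever rules match."""
    root = ET.fromstring(xml_text)
    out = []
    for elem in root.iter():
        rule = RENDER.get(elem.tag)
        if rule is not None:
            out.append(rule((elem.text or "").strip()))
    return "".join(out)


# A document using only the "strict" subset needs no special handling.
print(render("<html><body><p>hello</p></body></html>"))
# A looser document goes through the very same table.
print(render("<html><body><center>hi</center><p>x</p></body></html>"))
```

The renderer never asks which DTD the document claimed; it only needs its
rules to cover every element that can legally appear.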
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1