"Empty" Text Nodes
cowan at locke.ccil.org
Wed Feb 24 17:46:17 GMT 1999
> 1. An HTML processor is a very specific case of an XML processor
As of now, HTML is not XML at all.
> 2. PRE, STYLE and SCRIPT are specific cases in HTML, unlike other
STYLE and SCRIPT are so-called "CDATA content elements" (for which
there is no XML equivalent; the term "CDATA" here is not synonymous
with "CDATA" as an attribute type, or with CDATA sections).
They are terminated by the sequence "</", which must be the
beginning of the matching end tag.
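
As a rough illustration of that termination rule, here is a minimal Python sketch. The function name `cdata_content_end` and the sample markup are invented for this example; it looks only for the bare "</" sequence described above and does not check that a matching end tag actually follows.

```python
def cdata_content_end(text, start):
    """Return the index of the first "</" at or after start, or len(text)."""
    end = text.find("</", start)
    return len(text) if end == -1 else end


# Hypothetical input: a script whose string literal contains an end tag.
html = '<script>document.write("<b>hi</b>")</script>'
start = len("<script>")
content = html[start:cdata_content_end(html, start)]
print(content)
```

Under this rule the content stops at the "</" inside the string literal, not at the real end tag, which is why such sequences have to be avoided or disguised inside SCRIPT and STYLE.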
PRE preserves whitespace but is an ordinary element; elements in its
content (like A) are processed, and < and & must be escaped with
&lt; and &amp;.
In SGML (including HTML, but excluding XML), up to one newline each is
removed from the beginning and the end of a run of character data.
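
A small Python sketch of that trimming, under simplifying assumptions: the helper name `strip_edge_newlines` is made up, a newline is modelled as a single "\n", and the real SGML record-start/record-end rules are more involved than this.

```python
def strip_edge_newlines(data):
    """Drop at most one leading and at most one trailing newline."""
    if data.startswith("\n"):
        data = data[1:]
    if data.endswith("\n"):
        data = data[:-1]
    return data


# Only one newline goes from each end; the inner ones survive.
trimmed = strip_edge_newlines("\n\ntext\n\n")
print(repr(trimmed))
```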
> 4. With a validating XML processor, XML elements should preserve
> whitespaces only if the 'xml:space' attribute has a value of 'preserve',
> otherwise they may lose whitespaces by ignoring the trailing and leading
> whitespaces and consolidating multiple whitespaces to a single space.
> Again, whitespace is assumed to be for human readability.
This behavior is performed by the application: a conforming processor
must not do it itself. In attribute values, OTOH, a conforming processor
must do it for attributes that are not CDATA.
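
To make the split concrete, here is a hedged Python sketch. The first part uses the standard-library `xml.etree.ElementTree` parser to show character data arriving untouched, with the collapsing done by the application. The second part is a hand-written model of the normalization applied to non-CDATA attribute values; `normalize_tokenized_attr` is a hypothetical name, and the model ignores character and entity references.

```python
import xml.etree.ElementTree as ET

# Character data: the parser passes the whitespace through as-is.
elem = ET.fromstring("<p>  two   words  </p>")
raw = elem.text

# Collapsing is the application's own choice, made after parsing.
# (str.split() treats any Unicode whitespace as a separator.)
collapsed = " ".join(raw.split())


def normalize_tokenized_attr(value):
    """Model of normalization for an attribute whose type is not CDATA."""
    # Literal tab, CR and LF become spaces (this step applies to all types).
    for ch in "\t\r\n":
        value = value.replace(ch, " ")
    # Non-CDATA only: drop leading/trailing spaces, collapse runs of spaces.
    return " ".join(part for part in value.split(" ") if part)


print(repr(raw), repr(collapsed))
print(repr(normalize_tokenized_attr("  a \t b\n")))
```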
> 5. With a validating XML processor, XML elements that have non-mixed
> content type (only elements, no text) should ignore all whitespaces and
> flag an error for any other text that appears in between elements.
XML processors cannot just ignore that whitespace: they must report
it to the application, which is then free to ignore it and typically
does.
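
A short Python sketch of that division of labour, using the standard-library `xml.etree.ElementTree` parser (a non-validating one) as the processor. The sample document and the helper name `drop_whitespace_only` are assumptions made for this example: the parser hands over the whitespace between elements, and the application decides to discard it.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<list>\n  <item/>\n  <item/>\n</list>")

# The whitespace between elements is reported, as text and tail strings.
reported = [root.text] + [child.tail for child in root]


def drop_whitespace_only(elem):
    """Application-level choice: discard text that is only whitespace."""
    if elem.text is not None and not elem.text.strip():
        elem.text = None
    for child in elem:
        drop_whitespace_only(child)
        if child.tail is not None and not child.tail.strip():
            child.tail = None


print(reported)
drop_whitespace_only(root)
```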
> 6. Without a validating XML processor, XML elements should attempt to
> ignore as much whitespace as possible, regarding it as human readable
That totally depends on the application.
John Cowan http://www.ccil.org/~cowan cowan at ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)