"Empty" Text Nodes

John Cowan cowan at locke.ccil.org
Wed Feb 24 17:46:17 GMT 1999

Arkin wrote:

> 1. An HTML processor is a very specific case of an XML processor

As of now, HTML is not XML at all.

> 2. PRE, STYLE and SCRIPT are specific cases in HTML, unlike other
> elements.

STYLE and SCRIPT are so-called "CDATA content elements" (for which
there is no XML equivalent; the term "CDATA" here is not synonymous
with "CDATA" as an attribute type, or with CDATA sections).
They are terminated by the sequence "</", which must be the
beginning of the matching end tag.
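
To make the termination rule concrete, here is a minimal Python sketch
of how a scanner might find the end of a CDATA content element. The
function name scan_cdata_content is hypothetical, and the sketch takes
the rule exactly as stated above (stop at the first "</"); it is not a
full SGML or HTML parser.

```python
def scan_cdata_content(text, start=0):
    """Split `text` at the first "</" found at or after `start`.

    Returns (content, rest), where `rest` begins with the "</" that
    ended the content. Inside the content, "<" and "&" are plain data.
    """
    end = text.find("</", start)
    if end == -1:
        raise ValueError("unterminated CDATA content element")
    return text[start:end], text[end:]


# "<" followed by a space and "&&" do not end the content.
ok = scan_cdata_content("if (a < b && c) go();</SCRIPT><P>after")

# A "</" inside the script ends the content early, which is why it
# must be the beginning of the matching end tag.
early = scan_cdata_content('document.write("<B>x</B>");</SCRIPT>')
```

In the second call the content stops just before "</B>", so a script
written that way would be cut short by the scanner.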

PRE preserves whitespace but is an ordinary element; elements in its
content (like A) are processed, and < and & must be escaped as &lt;
and &amp;.

In SGML (including HTML, but excluding XML), up to one newline each is
removed from the beginning and the end of a run of character data.
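
A rough Python sketch of that rule, assuming line ends have already
been normalized to "\n" and reading the rule just as stated above (the
function name is hypothetical, and real SGML record handling has more
cases than this):

```python
def strip_sgml_newlines(data):
    """Remove at most one newline from each end of a run of character data."""
    if data.startswith("\n"):
        data = data[1:]
    if data.endswith("\n"):
        data = data[:-1]
    return data


# Only one newline is dropped at each end; the rest is kept as data.
trimmed = strip_sgml_newlines("\n\nsome text\n\n")
```

An XML processor does no such trimming, so the same run would reach
the application with all four newlines intact.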

> 4. With a validating XML processor, XML elements should preserve
> whitespaces only if the 'xml:space' attribute has a value of 'preserve',
> otherwise they may lose whitespaces by ignoring the trailing and leading
> whitespaces and consolidating multiple whitespaces to a single space
> (&#x20;). Again, whitespace is assumed to be for human readability.

This behavior is performed by the application: a conforming processor
may not do it.  In attribute values, OTOH, a conforming processor
must do it for attributes that are not CDATA.
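
The attribute case can be sketched in Python as follows. This is a
simplified illustration, not a processor implementation: the function
name is hypothetical, entity and character references are ignored, and
line ends are assumed to be normalized to "\n" already.

```python
import re


def normalize_attribute(value, is_cdata):
    """Approximate XML attribute-value normalization for literal text."""
    # For every attribute type, each literal tab or line break becomes a space.
    value = re.sub(r"[\t\r\n]", " ", value)
    if not is_cdata:
        # For non-CDATA types only: drop leading and trailing spaces and
        # collapse each run of spaces to a single space.
        value = re.sub(r" +", " ", value.strip(" "))
    return value


raw = "  a \t b\n"
as_nmtokens = normalize_attribute(raw, is_cdata=False)
as_cdata = normalize_attribute(raw, is_cdata=True)
```

Here as_nmtokens has its spaces trimmed and collapsed, while as_cdata
keeps every space it started with.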

> 5. With a validating XML processor, XML elements that have non-mixed
> content type (only elements, no text) should ignore all whitespaces and
> flag an error for any other text that appears in between elements.

XML processors cannot just ignore that whitespace: they must report
it to the application, which is then free to ignore it and typically
does.
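
As an illustration of that division of labor, the following Python
sketch uses the standard xml.dom.minidom module. The parser hands over
the whitespace between child elements as text nodes, and a hypothetical
application-level helper, drop_blank_text, then discards them. The
sample document is made up for the example.

```python
from xml.dom import minidom

XML_WHITESPACE = " \t\r\n"

doc = minidom.parseString("<list>\n  <item>a</item>\n  <item>b</item>\n</list>")
root = doc.documentElement

# The processor reports the whitespace: it shows up as text nodes.
blank_before = [
    n for n in root.childNodes
    if n.nodeType == n.TEXT_NODE and not n.data.strip(XML_WHITESPACE)
]


def drop_blank_text(node):
    """Application-side choice: remove whitespace-only text nodes."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE:
            if not child.data.strip(XML_WHITESPACE):
                node.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            drop_blank_text(child)


drop_blank_text(root)
names_after = [n.nodeName for n in root.childNodes]
```

Whether dropping those nodes is right depends on the application; for
mixed content it usually is not.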

> 6. Without a validating XML processor, XML elements should attempt to
> ignore as much whitespace as possible, regarding it as human readable
> whitespace.

That totally depends on the application.

John Cowan	http://www.ccil.org/~cowan		cowan at ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)
