XML vs the Dreaded Whitespace

David Megginson ak117 at freenet.carleton.ca
Thu Dec 11 11:42:29 GMT 1997

Peter Murray-Rust writes:

 > As a corollary: Is anyone testing the ESIS output of the current crop of
 > XML parsers (4 Java + nsgmls, I think)? Regardless of the whitespace model
 > or the value of xml:space they should all produce identical ESIS (right?)
 > If not, then one or more is wrong. And all applications should (IMO) be
 > prepared to work with ESIS which I think is isomorphous with a WF XML
 > document.

There are quite a few more XML parsers out there, including at least
one in TCL -- see 


As for ESIS, there are some problems that we'd have to overcome first:

1) How should empty elements be represented?  Right now, Ælfred generates a
   startElement event immediately followed by an endElement event.

2) How should the XML declaration be represented?  Should it appear as
   a processing instruction, or should it be ignored?

3) How should space in element content be handled?  According to the
   spec, a DTD-aware parser should handle whitespace in element
   content differently from whitespace in mixed content (Ælfred just
   ignores whitespace in element content right now).

4) DTD-aware and non-DTD-aware parsers will handle whitespace in
   attribute values differently.  Non-DTD-aware parsers will treat all
   attributes as CDATA, but DTD-aware parsers will treat tokenised
   attributes specially, by stripping all leading and trailing
   whitespace, and normalising internal whitespace to single spaces.
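
To make point 4 concrete, here is a rough Python sketch of the two
behaviours, assuming the attribute-value normalisation rules as I read
them in the XML spec. The function names are invented for illustration
and do not come from any parser's API; entity and character references
are left out entirely.

```python
import re


def normalize_cdata(value: str) -> str:
    """Non-DTD-aware view: every attribute is CDATA.

    Each tab, newline, or carriage return becomes one space;
    nothing is trimmed or collapsed.
    """
    return re.sub(r"[\t\n\r]", " ", value)


def normalize_tokenized(value: str) -> str:
    """DTD-aware view for tokenised types (ID, IDREFS, NMTOKENS, ...).

    Starts from the CDATA result, then collapses runs of spaces to a
    single space and strips leading and trailing spaces.
    """
    collapsed = re.sub(r" +", " ", normalize_cdata(value))
    return collapsed.strip(" ")


if __name__ == "__main__":
    raw = "  foo \n bar  "
    # repr() makes the surviving spaces visible.
    print(repr(normalize_cdata(raw)))
    print(repr(normalize_tokenized(raw)))
```

The same attribute text therefore reaches the application as two
different strings depending on whether the parser read the DTD, which
is exactly why the ESIS output of the two kinds of parser would differ.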

All the best,


David Megginson                 ak117 at freenet.carleton.ca
Microstar Software Ltd.         dmeggins at microstar.com

