XML vs the Dreaded Whitespace

David Megginson ak117 at freenet.carleton.ca
Thu Dec 11 11:42:29 GMT 1997


Peter Murray-Rust writes:

 > As a corollary: Is anyone testing the ESIS output of the current crop of
 > XML parsers (4 Java + nsgmls, I think)? Regardless of the whitespace model
 > or the value of xml:space they should all produce identical ESIS (right?)
 > If not, then one or more is wrong. And all applications should (IMO) be
 > prepared to work with ESIS which I think is isomorphous with a WF XML
 > document.

There are quite a few more XML parsers out there, including at least
one in TCL -- see 

  http://www.sil.org/sgml/XML.html#xmlSoftware

As for ESIS, there are some problems that we'd have to overcome first:

1) How should empty elements be represented?  Right now, Ælfred generates a
   startElement event immediately followed by an endElement event.

2) How should the XML declaration be represented?  Should it appear as
   a processing instruction, or should it be ignored?

3) How should whitespace in element content be handled?  According to the
   spec, a DTD-aware parser should handle whitespace in element
   content differently from whitespace in mixed content (Ælfred just
   ignores whitespace in element content right now).

4) DTD-aware and non-DTD-aware parsers will handle whitespace in
   attribute values differently.  Non-DTD-aware parsers will treat all
   attributes as CDATA, but DTD-aware parsers will treat tokenised
   attributes specially, by stripping all leading and trailing
   whitespace, and normalising internal whitespace to single spaces.
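
To make point 4 concrete, here is a rough Python sketch of the two
normalisations.  The function names are my own invention for
illustration (they are not from Ælfred or any other parser), and the
rules are only my reading of the behaviour described above, so treat
this as an approximation rather than a reference implementation.

```python
import re

# XML whitespace characters: space, tab, carriage return, line feed.
XML_WS = r"[ \t\r\n]"


def normalize_cdata(value: str) -> str:
    """Hypothetical helper: CDATA-style normalisation.

    Each whitespace character becomes a single space; nothing is
    stripped or collapsed.  This is what a non-DTD-aware parser
    would apply to every attribute.
    """
    return re.sub(XML_WS, " ", value)


def normalize_tokenized(value: str) -> str:
    """Hypothetical helper: normalisation for tokenised attributes.

    Starts from the CDATA result, then collapses runs of spaces to
    one space and strips leading and trailing spaces, as a DTD-aware
    parser would for types such as IDREFS or NMTOKENS.
    """
    collapsed = re.sub(r" +", " ", normalize_cdata(value))
    return collapsed.strip(" ")


if __name__ == "__main__":
    raw = "  id1 \t id2\nid3  "
    # repr() makes the surviving spaces visible.
    print(repr(normalize_cdata(raw)))
    print(repr(normalize_tokenized(raw)))
```

Two parsers fed the same attribute value would report different
strings here, so any ESIS comparison would have to agree on which of
the two forms counts as canonical.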


All the best,


David

-- 
David Megginson                 ak117 at freenet.carleton.ca
Microstar Software Ltd.         dmeggins at microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



