XML vs the Dreaded Whitespace
David Megginson
ak117 at freenet.carleton.ca
Thu Dec 11 11:42:29 GMT 1997
Peter Murray-Rust writes:
> As a corollary: Is anyone testing the ESIS output of the current crop of
> XML parsers (4 Java + nsgmls, I think)? Regardless of the whitespace model
> or the value of xml:space they should all produce identical ESIS (right?)
> If not, then one or more is wrong. And all applications should (IMO) be
> prepared to work with ESIS which I think is isomorphous with a WF XML
> document.
There are quite a few more XML parsers out there, including at least
one in TCL -- see
http://www.sil.org/sgml/XML.html#xmlSoftware
As for ESIS, there are some problems that we'd have to overcome first:
1) How should empty elements be represented? Right now, Ælfred generates a
startElement event immediately followed by an endElement event.
2) How should the XML declaration be represented? Should it appear as
a processing instruction, or should it be ignored?
3) How should space in element content be handled? According to the
spec, a DTD-aware parser should handle whitespace in element
content differently from whitespace in mixed content (Ælfred just
ignores whitespace in element content right now).
4) DTD-aware and non-DTD-aware parsers will handle whitespace in
attribute values differently. Non-DTD-aware parsers will treat all
attributes as CDATA, but DTD-aware parsers will treat tokenised
attributes specially, by stripping all leading and trailing
whitespace, and normalising internal whitespace to single spaces.
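
To make points 1 and 4 concrete, here is a rough Python sketch, not a
description of how Ælfred or any other parser mentioned above actually works.
It uses the expat binding from the Python standard library purely as a
stand-in event source; the function names normalise_tokenised and
element_events are made up for illustration. The first function applies the
tokenised-attribute treatment described in point 4, and the second records
start/end events so that an empty-element tag can be compared with an
explicit start-tag/end-tag pair.

```python
import re
import xml.parsers.expat

# XML whitespace: space, tab, carriage return, line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalise_tokenised(value: str) -> str:
    """Hypothetical helper: treat a value the way point 4 says a
    DTD-aware parser treats a tokenised attribute, i.e. collapse
    internal whitespace runs to one space and strip both ends."""
    return _XML_WS.sub(" ", value).strip(" ")


def element_events(document: str) -> list:
    """Hypothetical helper: collect (kind, name) pairs for every
    start and end element event reported by expat."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.Parse(document, True)
    return events


if __name__ == "__main__":
    # A CDATA-style consumer would keep the raw value; a tokenised one would not.
    print(repr(normalise_tokenised("  a \n b\t")))
    # Both spellings of an empty element yield the same event stream here.
    print(element_events("<a><b/></a>"))
    print(element_events("<a><b></b></a>"))
```

With this event model, a consumer cannot tell an empty-element tag from a
start-tag immediately followed by an end-tag, which is exactly the
representation question raised in point 1.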
All the best,
David
--
David Megginson ak117 at freenet.carleton.ca
Microstar Software Ltd. dmeggins at microstar.com
http://home.sprynet.com/sprynet/dmeggins/