SAX: Whitespace Handling (question 5 of 10)
David Megginson
ak117 at freenet.carleton.ca
Sat Jan 3 18:07:01 GMT 1998
[SAX is a proposal for a simple, event-based XML API, using
callbacks. This is one in a series of ten design questions that we
need to answer to implement the API.]
Should SAX allow DTD-driven parsers to distinguish ignorable
whitespace from other character data?
  public void ignorableWhitespace (char ch[], int length);
(We have already had some discussion on this topic.)
CON
---
- this method would make SAX slightly larger;
- parsers that use the DTD will return different results than parsers
that do not (though it would be trivial to map the two);
- the concept of ignorable whitespace can be confusing for
non-specialists.
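
To show how small the mapping in the second point is, here is a minimal
sketch. The interface and class names are hypothetical and are not part
of any agreed SAX interface; only the two method signatures follow the
ones quoted in this message.

```java
// Hypothetical handler interface, assumed for illustration only. The method
// signatures mirror the callbacks proposed in this message.
interface WhitespaceAwareHandler {
    void charData(char[] ch, int length);
    void ignorableWhitespace(char[] ch, int length);
}

// Makes a DTD-driven parser look like a non-DTD parser to the application:
// ignorable whitespace is forwarded to charData unchanged.
class FlatteningHandler implements WhitespaceAwareHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void charData(char[] ch, int length) {
        // Collect the first `length` characters of the buffer.
        text.append(ch, 0, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int length) {
        // The whole mapping: treat it as ordinary character data.
        charData(ch, length);
    }

    String text() {
        return text.toString();
    }
}
```

An application that wanted the opposite behaviour would simply leave
the body of ignorableWhitespace empty.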
PRO
---
- the XML Proposed Recommendation (PR) requires "validating" parsers
to flag ignorable whitespace for the application;
- there would be no need to implement anything here for most
applications;
- whitespace in element content is almost never significant for
formatting or database applications (if it were significant, then
the element type would have mixed content).
MY RECOMMENDATION
-----------------
Qualified no.
As someone who has worked with SGML for many years, I would rather not
see the ignorable whitespace at all; however, the PR requires parsers
to report all whitespace.
Tim Bray's recent comments on this list imply that a validating parser
using SAX could report ignorable whitespace as regular character data
and still be conforming; if I have inferred correctly, then I am
willing to omit this callback.
OTHER CONSIDERATIONS
--------------------
It would also be possible to implement this in the charData callback
itself:
  public void charData (char ch[], int length, boolean isIgnorable);
However, given that charData will probably be the most
heavily-implemented handler, and that very few applications will care
about ignorable whitespace, I would prefer not to complicate things
unnecessarily. If we need to distinguish it to be conforming, then
ignorable whitespace should probably be shuffled off to its own
callback, to make it easier to ignore.
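
The difference between the two designs can be sketched side by side.
This is only an illustration under assumptions: the type names
(FlaggedHandler, HandlerBase, FlaggedCollector, SeparateCollector) are
hypothetical, and the do-nothing base class is one possible way to make
a separate callback easy to ignore, not a settled part of SAX.

```java
// Design 1 (hypothetical): a flag on charData. Every implementer has to
// remember to test it.
interface FlaggedHandler {
    void charData(char[] ch, int length, boolean isIgnorable);
}

// Design 2 (hypothetical): a separate callback, with a base class whose
// methods do nothing, so applications override only what they need.
class HandlerBase {
    public void charData(char[] ch, int length) {}
    public void ignorableWhitespace(char[] ch, int length) {}
}

// Collects significant text under design 1.
class FlaggedCollector implements FlaggedHandler {
    final StringBuilder text = new StringBuilder();

    @Override
    public void charData(char[] ch, int length, boolean isIgnorable) {
        if (isIgnorable) {
            return; // the extra check this design forces on every handler
        }
        text.append(ch, 0, length);
    }
}

// Collects significant text under design 2. Ignorable whitespace is
// dropped by the inherited empty method without any code here.
class SeparateCollector extends HandlerBase {
    final StringBuilder text = new StringBuilder();

    @Override
    public void charData(char[] ch, int length) {
        text.append(ch, 0, length);
    }
}
```

Both collectors end up with the same text when fed the same events; the
second one just never has to mention ignorable whitespace.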
All the best,
David
--
David Megginson ak117 at freenet.carleton.ca
Microstar Software Ltd. dmeggins at microstar.com
http://home.sprynet.com/sprynet/dmeggins/