SAX, non-XML Documents, and Legal Characters

David Megginson ak117 at
Mon Apr 13 16:20:12 BST 1998

While we're on the topic of character streams, here's another
question: should the SAXDocumentHandler.characters() method be
restricted to delivering only XML characters?

At first, the answer "yes" might seem self-evident, but what if someone
decides to build a LaTeX or RTF parser that implements the SAX
interface?  Should we require the parser to strip out non-XML
characters before delivering the SAX events, or should we allow SAX to
be a general structured-document interface, and require applications
to strip out non-XML characters when exporting an XML document?
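
As a rough sketch of the second option, here is how an application
might strip non-XML characters on export.  The class and method names
(XmlCharFilter, isXmlChar, stripNonXmlChars) are made up for
illustration and are not part of SAX; the ranges are those of the
Char production in the XML 1.0 specification.

```java
// Hypothetical helper, not part of SAX: filters text down to the
// characters permitted by the XML 1.0 Char production.
final class XmlCharFilter {

    /** Returns true if the code point matches the XML 1.0 Char production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of the text with every non-XML character dropped. */
    static String stripNonXmlChars(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            // Walk by code point so that surrogate pairs stay together;
            // an unpaired surrogate is not an XML character and is dropped.
            int cp = text.codePointAt(i);
            if (isXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

An application exporting XML could pass each chunk of character data
through stripNonXmlChars() before writing it out.  Whether to drop,
replace, or report such characters is a design choice; this sketch
simply drops them.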

The question is, of course, moot for XML parsers, since they will have
to report a fatal error anyway if they find non-XML characters.  It
would be interesting, though, to build an RTF parser with a SAX driver
and then hook it up to Don Park's SAXDOM.

Any thoughts?

All the best,


David Megginson                 ak117 at
Microstar Software Ltd.         dmeggins at

