SAX: Byte Stream Needed?

David Megginson ak117 at
Wed Apr 15 13:29:14 BST 1998

James Clark writes:


 > InputStreamReader, however, leaves something to be desired because
 > it doesn't allow users to supply their own character-to-byte
 > conversion routines. But if you have an InputStream you should be
 > using the interface to the parser that takes an InputStream.  In
 > any case it's not practical to use an InputStreamReader for XML
 > because that won't deal with XML's rules for detecting encodings.

I have actually been toying with omitting the byte-stream parse()
method altogether, so that there would be only two parse methods:

  public abstract void parse (String publicId, String systemId)
    throws java.lang.Exception;

  public abstract void parse (String publicId, String systemId,
                              SAXCharacterStream input)
    throws java.lang.Exception;

I've defined SAXCharacterStream as follows:

  public interface SAXCharacterStream {
    public abstract int read ()
      throws SAXException;
    public abstract int read (char ch[], int start, int count)
      throws SAXException;
  }

(Where SAXException is, in the Java version, a direct and unmodified
subclass of [...].)  The result of either method is -1 if there are no
characters left to read; otherwise, the first method returns a UTF-16
character value and the second returns the number of characters read.
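
To make the return convention concrete, here is a minimal sketch of one
possible implementation backed by an in-memory String.  The class
StringCharacterStream is hypothetical, and SAXException is modelled as a
plain checked exception purely so that the example compiles; neither
detail comes from the proposal itself.

```java
// Stand-in for the SAXException mentioned above. Modelling it as a plain
// checked exception is an assumption made so the sketch compiles.
class SAXException extends Exception {
}

// The interface as proposed in the message.
interface SAXCharacterStream {
    int read() throws SAXException;
    int read(char[] ch, int start, int count) throws SAXException;
}

// Hypothetical implementation that serves characters from a String.
class StringCharacterStream implements SAXCharacterStream {
    private final String text;
    private int pos = 0;

    StringCharacterStream(String text) {
        this.text = text;
    }

    // Returns the next UTF-16 code unit, or -1 when nothing is left.
    public int read() {
        return pos < text.length() ? text.charAt(pos++) : -1;
    }

    // Copies up to count characters into ch starting at index start and
    // returns how many were copied, or -1 when nothing is left.
    public int read(char[] ch, int start, int count) {
        if (pos >= text.length()) {
            return -1;
        }
        int n = Math.min(count, text.length() - pos);
        text.getChars(pos, pos + n, ch, start);
        pos += n;
        return n;
    }
}
```

A parser would keep calling either method until it sees -1, in the same
way that it would with java.io.Reader.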

The advantage of using SAXCharacterStream is that behaviour over CORBA
(or, I suppose, DCOM) is now well-defined.  The disadvantage is
another bloody interface.

I had also written a SAXByteStream, but then I started wondering why
we really need it -- information coming from a database, for example,
or from a buffer should already be in characters, not in raw bytes
(and in Java, at least, it is simple to wrap a Reader around any
InputStream when necessary -- I expect that other languages will have
good internationalisation support soon).
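
As a rough illustration of that wrapping, the sketch below decodes an
InputStream with an InputStreamReader and adapts the resulting Reader to
the proposed SAXCharacterStream.  The names ReaderCharacterStream and
CharacterStreams, the SAXException constructor, and the choice to turn
an IOException into a SAXException are all assumptions made for the
example.  The caller must already know the encoding; nothing here
applies XML's own encoding-detection rules.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Stand-in for SAXException, repeated so the sketch is self-contained.
// The message constructor is an assumption.
class SAXException extends Exception {
    SAXException(String message) {
        super(message);
    }
}

// The interface as proposed in the message.
interface SAXCharacterStream {
    int read() throws SAXException;
    int read(char[] ch, int start, int count) throws SAXException;
}

// Hypothetical adapter: any java.io.Reader becomes a SAXCharacterStream.
// Reader already returns -1 at end of input, so results pass straight through.
class ReaderCharacterStream implements SAXCharacterStream {
    private final Reader reader;

    ReaderCharacterStream(Reader reader) {
        this.reader = reader;
    }

    public int read() throws SAXException {
        try {
            return reader.read();
        } catch (IOException e) {
            throw new SAXException(e.getMessage());
        }
    }

    public int read(char[] ch, int start, int count) throws SAXException {
        try {
            return reader.read(ch, start, count);
        } catch (IOException e) {
            throw new SAXException(e.getMessage());
        }
    }
}

// Hypothetical helpers showing the byte-to-character wrapping step.
class CharacterStreams {
    // Decodes raw bytes using an encoding the caller supplies.
    static SAXCharacterStream fromBytes(InputStream in, Charset charset) {
        return new ReaderCharacterStream(new InputStreamReader(in, charset));
    }

    // Drains a stream into a String, stopping at the -1 end marker.
    static String readAll(SAXCharacterStream stream) throws SAXException {
        StringBuilder sb = new StringBuilder();
        for (int c = stream.read(); c != -1; c = stream.read()) {
            sb.append((char) c);
        }
        return sb.toString();
    }
}
```

With helpers like these, code holding raw bytes can still reach a
character-only parse method, provided it knows the encoding in advance.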

Can anyone put forward a convincing case for having a standard SAX
method parsing from a raw byte stream (remembering that
implementations can always extend the SAXParser interface themselves
for special requirements)?

Thanks, and all the best,


David Megginson                 ak117 at
Microstar Software Ltd.         dmeggins at
