SAX: Byte Stream Needed?
David Megginson
ak117 at freenet.carleton.ca
Wed Apr 15 13:29:14 BST 1998
James Clark writes:
[...]
> InputStreamReader, however, leaves something to be desired because
> it doesn't allow users to supply their own character-to-byte
> conversion routines. But if you have an InputStream you should be
> using the interface to the parser that takes an InputStream. In
> any case it's not practical to use an InputStreamReader for XML
> because that won't deal with XML's rules for detecting encodings.
I have actually been toying with omitting the byte-stream parse()
method altogether, so that there would be only two parse methods:
  public abstract void parse (String publicId, String systemId)
    throws java.lang.Exception;
  public abstract void parse (String publicId, String systemId,
                              SAXCharacterStream input)
    throws java.lang.Exception;
I've defined SAXCharacterStream as follows:
  public interface SAXCharacterStream {
    public abstract int read ()
      throws SAXException;
    public abstract int read (char ch[], int start, int count)
      throws SAXException;
  }
(Here SAXException is, in the Java version, a direct and unmodified
subclass of java.io.IOException.) Either method returns -1 if there
are no characters left to read; otherwise, the first returns a UTF-16
character value and the second returns the number of characters
read.
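As a rough illustration of how little glue the interface would need in Java, here is a minimal sketch of an adapter that exposes any java.io.Reader as a SAXCharacterStream. The names ReaderCharacterStream and ReaderCharacterStreamDemo are hypothetical and not part of the proposal, and the SAXException shown is only a stand-in for the IOException subclass described above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Stand-in for the SAXException described in the message:
// a plain subclass of java.io.IOException (assumed constructor).
class SAXException extends IOException {
    SAXException(String message, Throwable cause) {
        super(message, cause);
    }
}

// The interface as proposed in the message.
interface SAXCharacterStream {
    int read() throws SAXException;
    int read(char[] ch, int start, int count) throws SAXException;
}

// Hypothetical adapter: delegates both methods to a java.io.Reader,
// which already uses the same "-1 means end of input" convention.
class ReaderCharacterStream implements SAXCharacterStream {
    private final Reader reader;

    ReaderCharacterStream(Reader reader) {
        this.reader = reader;
    }

    @Override
    public int read() throws SAXException {
        try {
            return reader.read();
        } catch (IOException e) {
            throw new SAXException("read failed", e);
        }
    }

    @Override
    public int read(char[] ch, int start, int count) throws SAXException {
        try {
            return reader.read(ch, start, count);
        } catch (IOException e) {
            throw new SAXException("read failed", e);
        }
    }
}

class ReaderCharacterStreamDemo {
    // Reads everything remaining in the stream using the buffered method.
    static String drain(SAXCharacterStream in) throws SAXException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4];
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        SAXCharacterStream in =
            new ReaderCharacterStream(new StringReader("<doc>hi</doc>"));
        System.out.println((char) in.read()); // single-character read
        System.out.println(drain(in));        // buffered reads until -1
    }
}
```

Because java.io.Reader already follows the same return-value conventions, the adapter is pure delegation plus exception wrapping; a non-Java binding would have to spell those conventions out itself.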
The advantage of using SAXCharacterStream is that behaviour over CORBA
(or, I suppose, DCOM) is now well-defined. The disadvantage is
another bloody interface.
I had also written a SAXByteStream, but then I started wondering why
we really need it -- information coming from a database, for example,
or from a buffer should already be in characters, not in raw bytes
(and in Java, at least, it is simple to wrap a Reader around any
InputStream when necessary -- I expect that other languages will have
good internationalisation support soon).
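For reference, the wrapping mentioned here might look like the following minimal sketch. It assumes the caller already knows the encoding (UTF-8 in this example); as James Clark points out in the quoted text, InputStreamReader does not apply XML's own rules for detecting encodings. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class WrapInputStreamDemo {
    // Decodes a raw byte stream into characters using a caller-supplied
    // encoding. No XML encoding detection is attempted here.
    static String readAll(InputStream bytes, Charset encoding) throws IOException {
        try (Reader chars = new InputStreamReader(bytes, encoding)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = chars.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // "\u00e9" (e with acute accent) occupies two bytes in UTF-8.
        byte[] raw = "<p>\u00e9</p>".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(raw), StandardCharsets.UTF_8);
        System.out.println(raw.length + " bytes decoded to " + text.length() + " characters");
    }
}
```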
Can anyone put forward a convincing case for having a standard SAX
method parsing from a raw byte stream (remembering that
implementations can always extend the SAXParser interface themselves
for special requirements)?
Thanks, and all the best,
David
--
David Megginson ak117 at freenet.carleton.ca
Microstar Software Ltd. dmeggins at microstar.com
http://home.sprynet.com/sprynet/dmeggins/