SAX: Parser Interface -- Summary of Change Requests
David Megginson
ak117 at freenet.carleton.ca
Sun Feb 1 20:43:19 GMT 1998
As promised, I will now begin to summarise the requested changes to
SAX before we put out a stable 1.0 version: over the next few days, I
will send out one message summarising the requested changes to each
interface or class. For more information on SAX, see
http://www.microstar.com/XML/SAX/
There have been only two changes proposed to the Parser interface,
both of which would be backwards-compatible with existing
implementations:
1) Allow SAX to work with an input stream as well as a URI.
2) Simplify handler chaining by adding get* methods for existing
handlers.
Here are the change requests in detail, with my initial response at
the end of each one:
1) Allow SAX to work with an input stream as well as a URI.
- Paul Pazandak <pazandak at OBJS.com>
- Peter Murray-Rust <peter at ursus.demon.co.uk>
- Don Park <donpark at quake.net>
Currently, the Parser interface provides only the following method
to initiate a parse:
void parse (String publicId, String systemId)
throws java.lang.Exception;
Following this suggestion, there would be an additional method:
void parse (String publicId, String systemId, InputStream input)
throws java.lang.Exception;
(It is still necessary to provide a system identifier for resolving
relative URIs within the stream.) Note that the stream would be a
byte stream, not a character stream -- characters might require
more than one octet, depending on the encoding in use.
I can see the convenience of this method, and I plan to add
something like this to AElfred when I have a chance. For SAX,
however -- which is meant to end up as a language- and
system-independent API -- I am reluctant to hardcode assumptions
about storage (and I don't know enough about IDL to know if there
is a general representation for streams). Paul Pazandak has also
suggested allowing strings and buffers -- in this case, they would
already be decoded into characters.
Personally, I'm undecided, and would be interested in hearing the
theoretical arguments for and against this suggestion.
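
The byte-versus-character point in this request can be illustrated with a
small Java sketch. It is only an illustration under stated assumptions: the
nested Parser interface copies the two parse signatures quoted in this
request, while CountingParser, countChars, the example system identifier,
and the fixed UTF-8 decoding are hypothetical and not part of SAX. A real
parser would detect the encoding from the document itself.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class StreamParseSketch {

    /** The existing parse method plus the proposed stream variant. */
    interface Parser {
        void parse(String publicId, String systemId) throws Exception;

        void parse(String publicId, String systemId, InputStream input)
            throws Exception;
    }

    /** Hypothetical toy "parser": it only counts decoded characters. */
    static class CountingParser implements Parser {
        int characters;

        public void parse(String publicId, String systemId) throws Exception {
            // Fetching the document from its URI is left out of this sketch.
            throw new UnsupportedOperationException("URI loading not sketched");
        }

        public void parse(String publicId, String systemId, InputStream input)
            throws Exception {
            // Assumption: the octets are UTF-8. The systemId would still be
            // needed to resolve relative URIs found inside the stream.
            Reader reader = new InputStreamReader(input, StandardCharsets.UTF_8);
            characters = 0;
            while (reader.read() != -1) {
                characters++;
            }
        }
    }

    /** Feeds the octets to the toy parser and returns the character count. */
    static int countChars(byte[] document) {
        CountingParser parser = new CountingParser();
        try {
            parser.parse(null, "http://example.org/doc.xml",
                         new ByteArrayInputStream(document));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return parser.characters;
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) takes two octets in UTF-8.
        byte[] document = "<a>\u00e9</a>".getBytes(StandardCharsets.UTF_8);
        System.out.println("octets:     " + document.length);
        System.out.println("characters: " + countChars(document));
    }
}
```

The octet count and the character count differ for this input, which is why
the proposed method takes a byte stream and leaves decoding to the parser.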
2) Simplify handler chaining by adding get* methods for existing
handlers.
- Don Park <donpark at quake.net>
Currently the Parser interface provides only setters for the
various handlers:
public void setEntityHandler (EntityHandler handler);
public void setDocumentHandler (DocumentHandler handler);
public void setErrorHandler (ErrorHandler handler);
Following this suggestion, there would also be accessors:
public EntityHandler getEntityHandler ();
public DocumentHandler getDocumentHandler ();
public ErrorHandler getErrorHandler ();
An application could then retrieve the existing handler and
implement a new one which invokes the old one under certain
circumstances.
This seems like a generally good idea (as well as a simple and
backwards-compatible change), and I am willing to implement it.
The only complication is that we'll have to define the default
state -- is the parser always required to return a default handler
if the user has not explicitly set one, or should it return null?
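
The chaining pattern described in this request might look roughly like the
following Java sketch. Everything in it is an assumption made for
illustration: DocumentHandler is cut down to a single hypothetical callback,
SketchParser and fireStartElement are invented stand-ins for a real parser,
and the getter is assumed to return null when no handler has been set, which
is only one of the two possible answers to the open question about the
default state.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for the SAX DocumentHandler (one callback only). */
interface DocumentHandler {
    void startElement(String name);
}

/** Hypothetical parser exposing the existing setter and proposed getter. */
class SketchParser {
    private DocumentHandler documentHandler; // null until the user sets one

    public void setDocumentHandler(DocumentHandler handler) {
        documentHandler = handler;
    }

    /** Assumption: returns null, not a default handler, if none was set. */
    public DocumentHandler getDocumentHandler() {
        return documentHandler;
    }

    /** Invented helper that simulates the parser reporting an element. */
    void fireStartElement(String name) {
        if (documentHandler != null) {
            documentHandler.startElement(name);
        }
    }
}

class HandlerChainSketch {

    /** Installs a wrapper around the existing handler and records calls. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        SketchParser parser = new SketchParser();
        parser.setDocumentHandler(name -> log.add("original:" + name));

        // Retrieve the existing handler, then install one that delegates.
        DocumentHandler previous = parser.getDocumentHandler();
        parser.setDocumentHandler(name -> {
            log.add("wrapper:" + name);
            if (previous != null) { // needed only if the getter may return null
                previous.startElement(name);
            }
        });

        parser.fireStartElement("doc");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the parser were instead required to return a default handler, the null
check in the wrapper could be dropped, at the cost of every implementation
having to supply such a default.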
I look forward to your comments and suggestions.
All the best,
David
--
David Megginson ak117 at freenet.carleton.ca
Microstar Software Ltd. dmeggins at microstar.com
http://home.sprynet.com/sprynet/dmeggins/