SAX: finalising org.xml.sax.Parser
David Megginson
ak117 at freenet.carleton.ca
Mon Feb 23 03:14:26 GMT 1998
It's time to finalise SAX before there is such a big code base that we
can no longer make changes. (Thanks, by the way, to James Clark,
DataChannel, and IBM for including native SAX support in their XML
parsers). During this phase, I'd like to make the _minimum_ changes
to SAX necessary to define a consistent and simple common functionality
for XML parsers.
Let's start with the Parser interface. I'll use Java syntax because,
while I can read IDL, I don't trust myself to write it:
[current interface]
------------------------------------------------------------------------
package org.xml.sax;
public interface Parser {
  public void setEntityHandler (EntityHandler handler);
  public void setDocumentHandler (DocumentHandler handler);
  public void setErrorHandler (ErrorHandler handler);
  public void parse (String publicID, String systemID)
    throws java.lang.Exception;
}
------------------------------------------------------------------------
After considering the various discussions over the past few weeks, I
propose that we make the following changes:
1) Add a parse() method that accepts a stream.
2) Add a parse() method that accepts a character buffer.
3) Remove the public ID argument from the current parse() method (I
don't think public IDs are going anywhere fast in XML).
With these changes, the interface would look like this in Java:
[proposed changes]
------------------------------------------------------------------------
package org.xml.sax;

import java.io.InputStream;

public interface Parser {
  public void setEntityHandler (EntityHandler handler);
  public void setDocumentHandler (DocumentHandler handler);
  public void setErrorHandler (ErrorHandler handler);
  public void parse (String uri)
    throws java.lang.Exception;
  public void parse (InputStream is, String baseURI)
    throws java.lang.Exception;
  public void parse (char ch[], int start, int length, String baseURI)
    throws java.lang.Exception;
}
------------------------------------------------------------------------
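
[illustrative sketch, not part of the proposal]

To make the three proposed entry points concrete, here is a minimal
sketch of how a caller might invoke each one. Everything except the
three parse() signatures is an assumption made for illustration:
SketchParser is a trimmed copy of the interface (handler setters
omitted so the example compiles on its own), RecordingParser is a
hypothetical stub that only records which method was called, and the
example.com URIs are placeholders. It is written in present-day Java
(generics, InputStream.readAllBytes) rather than the Java of the
original proposal.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Trimmed copy of the proposed interface; the handler setters are left out
// so that this sketch is self-contained.
interface SketchParser {
    void parse(String uri) throws Exception;
    void parse(InputStream is, String baseURI) throws Exception;
    void parse(char ch[], int start, int length, String baseURI) throws Exception;
}

// Hypothetical stub: records which entry point was used instead of parsing XML.
class RecordingParser implements SketchParser {
    final List<String> calls = new ArrayList<>();

    public void parse(String uri) {
        calls.add("uri:" + uri);
    }

    public void parse(InputStream is, String baseURI) throws Exception {
        // Drain the stream so the stub at least touches its input.
        int size = is.readAllBytes().length;
        calls.add("stream:" + size + " bytes, base=" + baseURI);
    }

    public void parse(char ch[], int start, int length, String baseURI) {
        // Only the slice [start, start + length) belongs to the document.
        calls.add("chars:" + new String(ch, start, length) + ", base=" + baseURI);
    }
}

class ParseOverloadsDemo {
    static List<String> run() {
        RecordingParser parser = new RecordingParser();
        String xml = "<doc/>";
        try {
            // 1) Parse from a URI.
            parser.parse("http://www.example.com/doc.xml");

            // 2) Parse from a stream, supplying a base URI for relative references.
            InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
            parser.parse(in, "http://www.example.com/");

            // 3) Parse from a character buffer that holds more than the document;
            //    null base URI means the document has no relative URIs.
            char[] buffer = ("junk" + xml + "junk").toCharArray();
            parser.parse(buffer, 4, xml.length(), null);
        } catch (Exception e) {
            // parse() may throw java.lang.Exception, so callers must catch
            // or propagate it.
            throw new RuntimeException(e);
        }
        return parser.calls;
    }

    public static void main(String[] args) {
        for (String call : run()) {
            System.out.println(call);
        }
    }
}
```

A real parser would report events to the registered handlers instead
of recording strings; the stub is only meant to show the calling
convention of each overload.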
NOTES:
a. The baseURI argument is necessary for streams and character buffers
in case either contains a relative URI. You can supply a null
value if the document entity will not contain relative URIs.
b. All programming languages initially targeted by SAX (Java, C++, C,
Perl) have some concept of input streams; if we come up against one
that doesn't, it can simply omit the relevant method.
c. The start and length arguments are necessary with the character
buffer in case the XML document is part of a larger array.
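
[illustrative sketch, not part of the proposal]

Notes a and c can be sketched as follows. This is an assumption about
how a parser might use the arguments, not something the proposal
specifies: the helper names resolve() and slice() are hypothetical, the
example.com URIs are placeholders, and java.net.URI is simply one
convenient way to do relative-URI resolution in present-day Java.

```java
import java.net.URI;

class BaseUriAndBufferDemo {
    // Note a: a relative system identifier found inside the document can only
    // be turned into an absolute URI if the caller supplied a base URI.
    static String resolve(String baseURI, String systemId) {
        if (baseURI == null) {
            // The caller passed null, promising that no relative URIs occur.
            return systemId;
        }
        return URI.create(baseURI).resolve(systemId).toString();
    }

    // Note c: start and length select the document from a larger array.
    static String slice(char ch[], int start, int length) {
        return new String(ch, start, length);
    }

    public static void main(String[] args) {
        System.out.println(resolve("http://www.example.com/docs/book.xml", "chapters/ch1.xml"));
        System.out.println(slice("HEADER<doc/>TRAILER".toCharArray(), 6, 6));
    }
}
```

Without the base URI, a reference such as "chapters/ch1.xml" read from
a stream or buffer has nothing to be resolved against, which is why
the two new methods take the extra argument.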
Does this give reasonable functionality without limiting the
architectural approaches of parser writers? Remember that individual
implementations can extend this interface, but the interface
represents the minimum common functionality that every SAX-conformant
parser (eventually) provides.
Thanks, and all the best,
David
--
David Megginson ak117 at freenet.carleton.ca
Microstar Software Ltd. dmeggins at microstar.com
http://home.sprynet.com/sprynet/dmeggins/