SAX: Parser Interface -- Summary of Change Requests
tyler at infinet.com
Mon Feb 2 21:52:23 GMT 1998
David Megginson wrote:
> Tyler Baker writes:
> [on reading XML from a stream rather than a URI]
> > Well, what if the XML data is streamed from a database where a URL
> > does not matter so much. If you look at what Oracle, Sybase, and
> > Microsoft among others are planning on doing with XML, then
> > supporting this with SAX in the most ubiquitous way will be very
> > much necessary. I think that if you want to make SAX have any
> > CORBA support or other language support down the line, it would be
> > best to avoid any overloading in the API because in CORBA, for
> > example, you cannot redefine operations in IDL (methods in Java).
> This is a good point, but there are complications. Do these vendors
> plan to use character streams or byte streams?
In CORBA IDL there is a string and a wstring type. The wstring type maps to
Unicode in the IDL -> Java mapping. You could define everything as wstring if
you wish as far as IDL is concerned.
> > Another idea (as far as implementation goes) is to have the parser
> > simply be an extension of java.io.FilterInputStream which takes
> > one or more Handler interfaces as arguments (to delegate to), so
> > that you can handle very large streams of data.
> This sounds like an interesting idea for a parser implementation, but
> since SAX is meant to work with many parsers in many languages, it is
> probably too constraining as a general common interface.
Yeah, I only meant it as an implementation idea. On another note, I think the
Handler interfaces are by far the most important ones. It would really help if
Aelfred had an XMLInputStream that could be derived from Parser, either by
having the parser itself be an implementation of XMLInputStream, or else by
assigning a parser stub to an XMLInputStream that could be retrieved by calling
Parser.getXMLInputStream(). Parser.parse() would just parse everything with no
control over IO, but with XMLInputStream you could have control at the IO level.
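To make the XMLInputStream idea a bit more concrete, here is a rough sketch of a java.io.FilterInputStream subclass that passes bytes through and reports start tags to a handler as they go by. The names TagHandler and XMLInputStreamSketch are made up for illustration and are not part of SAX or Aelfred, and the tag scanning is a toy (ASCII only, no comments, CDATA, or entities), nothing like a real parser.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface; a real SAX handler carries many more events.
interface TagHandler {
    void startTag(String name);
}

// Toy stream filter: bytes pass through unchanged, and each start tag seen
// on the way is reported to the handler. Assumption: ASCII input only.
class XMLInputStreamSketch extends FilterInputStream {
    private final TagHandler handler;
    private StringBuilder tag; // non-null while between '<' and '>'

    XMLInputStreamSketch(InputStream in, TagHandler handler) {
        super(in);
        this.handler = handler;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) scan((char) b);
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = 0; i < n; i++) scan((char) (buf[off + i] & 0xFF));
        return n;
    }

    // Tracks the text between '<' and '>' and reports names that start with a letter,
    // which skips end tags ("/item") and processing instructions ("?xml").
    private void scan(char c) {
        if (c == '<') {
            tag = new StringBuilder();
        } else if (tag != null && c == '>') {
            String name = tag.toString().split("[\\s/]", 2)[0];
            if (!name.isEmpty() && Character.isLetter(name.charAt(0))) {
                handler.startTag(name);
            }
            tag = null;
        } else if (tag != null) {
            tag.append(c);
        }
    }
}

class XMLInputStreamDemo {
    public static void main(String[] args) throws IOException {
        byte[] xml = "<doc><item>hi</item></doc>".getBytes(StandardCharsets.US_ASCII);
        List<String> seen = new ArrayList<>();
        try (InputStream in = new XMLInputStreamSketch(new ByteArrayInputStream(xml), seen::add)) {
            byte[] buf = new byte[8];
            while (in.read(buf) != -1) {
                // The caller decides how much to read and when.
            }
        }
        System.out.println(seen);
    }
}
```

The point of the sketch is only that the caller drives the IO: events arrive as a side effect of read(), so a very large stream never has to be handed over to a single parse() call.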
Furthermore, having a registry of SAX Handler implementations (or just
pointers to where the class implementations live) would be invaluable to the
particular application I am working on now. I suggested having a static
registerHandler method in XMLInputStream, but you could add this to Parser
instead. This way you could simply pass in XML data and the parser would look up
the appropriate handler implementation for that doctype and load it dynamically.
Otherwise, the lookup has to be done by hand, which can really bloat your code at
the application level, since you essentially end up with a long chain of
if/else statements that register the appropriate handlers manually. If this were
implemented in Aelfred or any other parser, it would take a huge burden off
application developers using XML, IMHO.
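As a rough illustration of the registry idea, here is a minimal sketch that maps a doctype name to a factory for its handler. Everything in it (DocHandler, HandlerRegistry, handlerFor, the two sample handlers) is a hypothetical name of mine, not part of SAX or Aelfred. I used an instance method with a factory rather than a static method to keep the sketch easy to test; a variant that stores class names and loads them with Class.forName would be closer to "load it dynamically".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a SAX-style document handler.
interface DocHandler {
    String describe();
}

// Hypothetical registry: doctype name -> factory that creates the handler.
final class HandlerRegistry {
    private final Map<String, Supplier<? extends DocHandler>> factories = new HashMap<>();

    void registerHandler(String doctype, Supplier<? extends DocHandler> factory) {
        factories.put(doctype, factory);
    }

    // Creates a fresh handler for the doctype, or throws if none is registered.
    DocHandler handlerFor(String doctype) {
        Supplier<? extends DocHandler> factory = factories.get(doctype);
        if (factory == null) {
            throw new IllegalArgumentException("no handler registered for doctype: " + doctype);
        }
        return factory.get();
    }
}

// Two sample handlers, purely for illustration.
class InvoiceHandler implements DocHandler {
    public String describe() { return "invoice handler"; }
}

class MemoHandler implements DocHandler {
    public String describe() { return "memo handler"; }
}

class HandlerRegistryDemo {
    public static void main(String[] args) {
        HandlerRegistry registry = new HandlerRegistry();
        registry.registerHandler("invoice", InvoiceHandler::new);
        registry.registerHandler("memo", MemoHandler::new);

        // A parser would read the doctype from the document; here it is hard-coded.
        DocHandler handler = registry.handlerFor("memo");
        System.out.println(handler.describe());
    }
}
```

With something like this inside the parser, the application registers its handlers once and the if/else chain on the doctype disappears from application code.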
> [on get* methods for handlers]
> > Not sure exactly what the use of these get methods is, because all
> > the handlers are useful for is delegation anyway. The only reason the
> > get methods would be useful is for casting the returned object to
> > some other form. Why anyone would need to do this is beyond me as
> > recasting this object back to something would be sloppy
> > implementation in the first place.
> Delegation itself might be enough justification, though -- we'll have
> to wait and see what others suggest.
I think it would be better to have an addDocumentHandler() instead of
setDocumentHandler() if you wish to do delegation. This is an
Observer/Observable pattern that would work quite nicely. You could have
multiple objects register interest in the parsing of the XML data and have the
events delivered to them appropriately. You might even make all of this beans
compliant if you really want to.
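Here is a small sketch of what I mean by add instead of set, assuming a cut-down listener interface with a single event. ElementListener and MulticastParserSketch are names I made up for the example, and parse() here just fires events for the names it is given rather than parsing anything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical, cut-down listener; a real document handler has many more callbacks.
interface ElementListener {
    void startElement(String name);
}

// Observer-style event source: every registered listener receives every event.
final class MulticastParserSketch {
    // CopyOnWriteArrayList lets listeners add or remove themselves during delivery.
    private final List<ElementListener> listeners = new CopyOnWriteArrayList<>();

    void addDocumentHandler(ElementListener listener) {
        listeners.add(listener);
    }

    void removeDocumentHandler(ElementListener listener) {
        listeners.remove(listener);
    }

    // Stand-in for real parsing: fires one startElement event per given name.
    void parse(String... elementNames) {
        for (String name : elementNames) {
            for (ElementListener listener : listeners) {
                listener.startElement(name);
            }
        }
    }
}

class MulticastParserDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        MulticastParserSketch parser = new MulticastParserSketch();
        parser.addDocumentHandler(name -> log.add("indexer saw " + name));
        parser.addDocumentHandler(name -> log.add("validator saw " + name));
        parser.parse("doc", "item");
        log.forEach(System.out::println);
    }
}
```

Listeners are called in registration order, and none of them needs to know about the others, which is what makes this nicer than having one handler delegate to the next by hand.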
> > The default handler could just be something which spits stuff out
> > to stdout or some other OutputStream in a manner similar to how
> > Aelfred's EventDemo does.
> It would probably be best for the default handler to produce no output
> at all, so that other handlers delegating to it would not end up
> creating bloated log files.
Yah, I kinda overlooked this. I just thought it would be nice for debugging. My