SAX2: Namespace proposal

David Megginson david at megginson.com
Mon Dec 20 14:06:35 GMT 1999


Stefan Haustein <stefan.haustein at trantor.de> writes:

> > For example, I think there are good arguments for moving to a
> > 
> > interface DocumentHandler {
> >   void startElement(StartElementEvent event)
> >   void endElement(EndElementEvent event)
> >   ...
> > }
> 
> I also would prefer this kind of interface. Further advantages besides
> the improved extensibility might be that
> 
> - building a new object seems like some overhead at first sight,
>   but in Java a new String is also a new object...

And that is why most parsers intern strings rather than creating
new ones, and that's why the SAX characters() and
ignorableWhitespace() methods use character arrays rather than
strings.  XML parsing exposes a lot of problems that Java programmers
aren't used to, because it generates so many events (often tens of
thousands) in only a few seconds.
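
To illustrate the character-array point, here is a minimal sketch of a
handler that works directly on the parser's buffer and never builds a
String per callback.  It assumes the SAX2 helper classes that later
shipped with the JDK (DefaultHandler, SAXParserFactory); the class name
CharCountingHandler and its count() helper are made up for this
example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: counts non-whitespace characters straight
// from the parser's buffer, without allocating a String per callback.
class CharCountingHandler extends DefaultHandler {
    long nonWhitespace = 0;

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only ch[start .. start + length - 1] belongs to this event,
        // and the array should not be kept after the callback returns.
        for (int i = start; i < start + length; i++) {
            if (!Character.isWhitespace(ch[i])) {
                nonWhitespace++;
            }
        }
    }

    // Convenience helper (an assumption of this sketch, not a SAX API).
    static long count(String xml) {
        try {
            CharCountingHandler handler = new CharCountingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.nonWhitespace;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

A parser may deliver one run of text in several characters() calls, so
the handler accumulates a total rather than assuming one call per text
node.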

In this case, however, the real solution is not interning but reusing
-- the SAX driver would keep only one copy of each kind of event
object, and would simply change its values each time it makes a
callback.  The problem with this approach (it showed up before in C++
with a simple SP API that James Clark made) is that programmers will
try -- despite documentation warning against it -- to keep the event
objects around and reference them outside the scope of the callback,
where strange things will happen.
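
The reuse hazard can be sketched as follows.  Every name here
(StartElementEvent, EventHandler, ReusingDriver, ReusedEventDemo) is
hypothetical and not part of any SAX release; the driver simply reuses
one event object for every callback, as described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical event class; not part of SAX.
class StartElementEvent {
    private String name;
    void setName(String name) { this.name = name; }
    String getName() { return name; }
}

// Hypothetical event-object style handler interface.
interface EventHandler {
    void startElement(StartElementEvent event);
}

// Hypothetical driver that reuses a single event object for all callbacks.
class ReusingDriver {
    private final StartElementEvent event = new StartElementEvent();

    void fire(List<String> names, EventHandler handler) {
        for (String n : names) {
            event.setName(n);            // overwrite the shared object
            handler.startElement(event); // valid only during this call
        }
    }
}

class ReusedEventDemo {
    // Wrong: stores references to the shared event object.
    static List<String> namesViaKeptReferences(List<String> input) {
        List<StartElementEvent> kept = new ArrayList<>();
        new ReusingDriver().fire(input, kept::add);
        List<String> out = new ArrayList<>();
        for (StartElementEvent e : kept) {
            out.add(e.getName());
        }
        return out;
    }

    // Right: copies the value out while the callback is still running.
    static List<String> namesViaCopies(List<String> input) {
        List<String> out = new ArrayList<>();
        new ReusingDriver().fire(input, e -> out.add(e.getName()));
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("doc", "para", "em");
        System.out.println(namesViaKeptReferences(input)); // [em, em, em]
        System.out.println(namesViaCopies(input));         // [doc, para, em]
    }
}
```

The handler that keeps references ends up with three pointers to the
same object, so every entry shows the last element name.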

> - some computation could be performed on demand only(?) 

This might be an advantage, but probably not -- the parser will
usually have done all of the work already, because of basic
constraints for checking well-formedness, etc.

> - I think it is less difficult to remember the access method names 
>   than a more or less unmotivated order of a lot of parameters

Perhaps, but you have to remember a lot of class and method names.
I've programmed to interfaces that use both approaches, and I did not
find either harder or easier -- on balance, I prefer to avoid bloating
a low-level interface like SAX with a lot of extra classes, even if
the performance would be the same.

> - the ElementEvent access methods could be a subset of the DOM access
> methods (!)

No, I don't think we should go there.  SAX has its warts, and the DOM
has its warts, but any combination might give us the product rather
than the sum of their warts, and the world doesn't need that much
ugliness.

> please do not forget to include the "old" SAX 1 methods in HandlerBase
> and call them from the new methods as default behaviour preserving
> compatibility at least with applications extending
> HandlerBase instead of implementing DocumentHandler. 

Actually, if we create a new package, there will be no compatibility
at all, except by using adapters (aka filters).  There will certainly
be a SAX1Filter class to wrap around SAX 1.0 parsers, and there may
also be a SAX2Filter class to make SAX2 parsers act like SAX 1.0
parsers.
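
As a rough sketch of the adapter idea, the following wraps a
new-style handler so that an old-style event source can drive it.  The
interfaces OldStyleHandler and NewStyleHandler are simplified,
hypothetical stand-ins rather than the real SAX 1.0 or SAX2
interfaces, and the adapter does no prefix resolution: it assumes an
empty Namespace URI and passes the raw name through as the local name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a SAX 1.0-style handler: raw names only.
interface OldStyleHandler {
    void startElement(String rawName);
    void endElement(String rawName);
}

// Hypothetical stand-in for a Namespace-aware handler in a new package.
interface NewStyleHandler {
    void startElement(String namespaceUri, String localName);
    void endElement(String namespaceUri, String localName);
}

// Adapter: receives old-style events and forwards them in the new style.
// Simplification for this sketch: no prefix handling, empty Namespace URI.
class OldToNewAdapter implements OldStyleHandler {
    private final NewStyleHandler target;

    OldToNewAdapter(NewStyleHandler target) {
        this.target = target;
    }

    @Override
    public void startElement(String rawName) {
        target.startElement("", rawName);
    }

    @Override
    public void endElement(String rawName) {
        target.endElement("", rawName);
    }
}

// Records the events it receives so the forwarding can be inspected.
class RecordingHandler implements NewStyleHandler {
    final List<String> log = new ArrayList<>();

    @Override
    public void startElement(String namespaceUri, String localName) {
        log.add("start {" + namespaceUri + "}" + localName);
    }

    @Override
    public void endElement(String namespaceUri, String localName) {
        log.add("end {" + namespaceUri + "}" + localName);
    }
}

class AdapterDemo {
    static List<String> run() {
        RecordingHandler recorder = new RecordingHandler();
        OldStyleHandler oldView = new OldToNewAdapter(recorder);
        oldView.startElement("doc"); // what an old-style parser would call
        oldView.endElement("doc");
        return recorder.log;
    }
}
```

An adapter in the other direction would implement NewStyleHandler and
forward to an OldStyleHandler in the same way.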


All the best,


David

-- 
David Megginson                 david at megginson.com
           http://www.megginson.com/
