recursion in XML parser
marcelo at mds.rmit.edu.au
Sat Apr 17 04:14:30 BST 1999
On Fri, Apr 16, 1999 at 12:51:45AM -0700, David Brownell wrote:
> Marcelo Cantos wrote:
> > I can't speak for the JVM, but it is far from safe to generalise and
> > state that a function call is as fast as a stack push, particularly
> > when the programmer knows exactly what needs to be pushed.
> There's no reason a nonvirtual function call shouldn't compile to be
> just the stack operation.
> If you're using a virtual function call, the same reasoning applies
> as for a C++/Obj-C/... virtual function call. Namely, that one can't
> just say "it's not free"; a comparison needs to include the cost of
> an alternative with the same functionality. And curiously enough,
I heartily agree. In fact, that was my whole point. Don't just
assume that method X is better than method Y in any and all
circumstances (which is what the original statement effectively said).
> when you do those comparisons, the functionality seems to be
> relatively cheaper when packaged as a "virtual function call" than
> when packaged as an if/then/else/... set of data operations, or
> other alternatives.
> This discussion seems pretty odd to me. Exactly what alternative is
> being advocated?
None. I wasn't even advocating that XML parsers be implemented
non-recursively (too much hard work, frankly). I was merely pointing
out that it is dangerous to generalise (I don't see why such a warning
would be perceived as odd). We have often encountered situations where
manual recursion came out significantly faster than anything the
compiler could produce under any optimisation level.
Maybe in Java function calls are intrinsically as fast as (or faster
than) manual recursion under all conceivable scenarios. I am not a
Java expert, so I can't say. I would be surprised to find that this
was so, but who knows.
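
To make "manual recursion" concrete, here is a minimal sketch in Python (chosen only for brevity; it says nothing about relative speed in Java or C++). It computes the nesting depth of a tree once with ordinary function-call recursion and once with an explicit stack. The (name, children) tuple layout and both function names are assumptions made for illustration, not taken from any real parser.

```python
# Hypothetical element representation: a (name, children) tuple.

def depth_recursive(node):
    """Nesting depth using the language's own call stack."""
    _name, children = node
    return 1 + max((depth_recursive(c) for c in children), default=0)


def depth_manual(root):
    """Same result, but the per-element state lives in an explicit stack."""
    deepest = 0
    stack = [(root, 1)]  # each entry: (node, depth of that node)
    while stack:
        (_name, children), depth = stack.pop()
        deepest = max(deepest, depth)
        for child in children:
            # Only the state we actually need is pushed: node and depth.
            stack.append((child, depth + 1))
    return deepest


doc = ("a", [("b", [("c", [])]), ("d", [])])
```

Both functions agree on any tree; which one is faster depends on the language, compiler and architecture, which is the point being made above. The manual version also does not depend on how deep the call stack is allowed to grow.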
> Remember that per-element state _must_ be maintained
> when parsing XML, and the model is a stack. Whether that stack gets
> maintained using the CPU stack or explicit emulation in some other
> memory data structure, it'll be there. Function calls use the CPU
> stack, and clean up very efficiently. Explicit emulation uses a
> different memory segment; and needs more work to GC correctly.
I can see how a GC environment would tilt the scales somewhat (I
assume that the GC system knows not to look past the stack pointer,
whereas an array implementation would need to null unneeded values).
However, I explicitly stated that I was not talking specifically about
the JVM, hence it is a little premature to "remind" me of the cost of
manual GC management.
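
As an illustration of the "null unneeded values" point, the following is a rough sketch of an array-backed stack whose pop clears the vacated slot, so that a garbage collector scanning the whole array does not keep the popped object alive. The class name ArrayStack and its fields are hypothetical, and Python is used only to keep the sketch short; the same pattern applies to an object array in any GC environment.

```python
class ArrayStack:
    """Fixed-array stack that clears slots above the top on pop."""

    def __init__(self, capacity=16):
        self._slots = [None] * capacity
        self._top = 0  # index of the next free slot

    def push(self, item):
        if self._top == len(self._slots):
            # Grow the backing array (doubling; at least one extra slot).
            self._slots.extend([None] * max(1, len(self._slots)))
        self._slots[self._top] = item
        self._top += 1

    def pop(self):
        if self._top == 0:
            raise IndexError("pop from empty stack")
        self._top -= 1
        item = self._slots[self._top]
        # Without this line the array would still reference the popped
        # object, and the collector could not reclaim it.
        self._slots[self._top] = None
        return item

    def __len__(self):
        return self._top
```

The extra store on every pop is the bookkeeping cost referred to above; a call stack gets the equivalent for free, provided the collector ignores everything past the stack pointer.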
> > Moreover, modern architectures often penalise you heavily for deep
> > recursion. For instance, the SPARC architecture uses register
> > windowing. ...
> Which can be bypassed by modern compilers for those applications
> where it matters. For example, graphics algorithms tend to need
> lots of registers (e.g. VIS code) and device drivers need to have
> predictable latencies (that is, they can't afford to flush windows
> in a time-critical interrupt handler).
I don't see the relevance of this. Graphics algorithms shouldn't use
register windowing because they use lots of registers; this has
nothing to do with recursion. Device drivers shouldn't use register
windowing because they need real-time performance; hence register
windowing simply isn't an option.
There will still be cases, however, where register windowing provides
a significant amortised performance gain, but only if the code is
refactored to remove recursive calls. In fact, parsing XML may be
just such a case.
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1