recursion in XML parser

roddey at us.ibm.com
Tue Apr 20 20:11:50 BST 1999




>Are most XML parsers recursive in nature?
>My parser is non-recursing while processing the tags from an XML file
>and only recurses once to go back and load an XSL file, when applicable.
>

There are some places where mine is recursive, e.g. parsing and otherwise
manipulating the tree-like content models in the DTD. But no, there is not
really much reason to recurse while parsing XML. We do also recurse while
handling INCLUDE/IGNORE sections, because it simplifies some things (and
hopefully no one is going to have 100 nested levels of INCLUDE/IGNORE, and
if they do I'm not too worried about impressing them :-)
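
To illustrate why recursion fits content models, here is a minimal sketch
in Python of a recursive-descent parser for a simplified content model
such as (a, (b | c)*, d?). It is not taken from any of the parsers
discussed; the names tokenize, parse_cp and parse_content_model and the
tuple-based tree are assumptions made for the example, and #PCDATA, EMPTY
and ANY are deliberately left out.

```python
import re

# One token: a punctuation character or an element name (simplified name rule).
TOKEN = re.compile(r"\s*([(),|?*+]|[A-Za-z_][\w.\-]*)")
PUNCT = {"(", ")", ",", "|", "?", "*", "+"}


def tokenize(text):
    """Split a content model string into punctuation and name tokens."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_cp(tokens, i):
    """Parse one content particle at tokens[i]; return (node, next_index)."""
    if i >= len(tokens):
        raise ValueError("unexpected end of content model")
    if tokens[i] == "(":
        # A group: recurse for each child, so nesting depth == call depth.
        sep = None
        node, i = parse_cp(tokens, i + 1)
        children = [node]
        while i < len(tokens) and tokens[i] in (",", "|"):
            if sep is None:
                sep = tokens[i]
            elif tokens[i] != sep:
                raise ValueError("cannot mix ',' and '|' in one group")
            node, i = parse_cp(tokens, i + 1)
            children.append(node)
        if i >= len(tokens) or tokens[i] != ")":
            raise ValueError("expected ')'")
        i += 1
        node = ("choice" if sep == "|" else "seq", children)
    elif tokens[i] in PUNCT:
        raise ValueError(f"unexpected token {tokens[i]!r}")
    else:
        node = ("name", tokens[i])
        i += 1
    # Optional occurrence indicator applies to the particle just parsed.
    if i < len(tokens) and tokens[i] in ("?", "*", "+"):
        node = ("repeat", tokens[i], node)
        i += 1
    return node, i


def parse_content_model(text):
    """Parse a whole content model and reject trailing tokens."""
    tokens = tokenize(text)
    node, i = parse_cp(tokens, 0)
    if i != len(tokens):
        raise ValueError("trailing tokens after content model")
    return node
```

Content models are rarely nested more than a few levels deep, so the
call depth stays small here, which is the same argument as for the
INCLUDE/IGNORE case.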

>My reasoning for not using recursion was performance (function call/stack
>framing considerations) and that it made the code easier to understand.
>

It's also more than that. Once you get into certain things, you might have
to search back down the element nesting tree in a very efficient way, and
having it on the call stack would make that pretty difficult. Having your
own element stack (as probably all parsers do) makes it straightforward.
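
As a rough sketch of what that looks like, here is a non-recursive tag
scanner in Python that keeps its own element stack and lets a callback
search back down it. This is an illustration only, not code from any
parser mentioned here: scan, nearest_ancestor and the regex are
hypothetical, and the regex ignores comments, CDATA, entities and '>'
inside attribute values.

```python
import re

# Matches <name ...>, </name> and <name .../> in a very simplified way.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.\-]*)[^<>]*?(/?)>")


def nearest_ancestor(stack, name):
    """Return the depth of the innermost open element called name, or None."""
    for depth in range(len(stack) - 1, -1, -1):
        if stack[depth] == name:
            return depth
    return None


def scan(text, on_start):
    """Walk the tags in text with a loop and an explicit element stack.

    on_start(name, stack) is called for every start tag; stack lists the
    currently open ancestors, outermost first.
    """
    stack = []
    for m in TAG.finditer(text):
        closing, name, empty = m.group(1), m.group(2), m.group(3)
        if closing:
            if not stack or stack[-1] != name:
                raise ValueError(f"unexpected end tag </{name}>")
            stack.pop()
        else:
            on_start(name, stack)
            if not empty:
                # Only non-empty elements stay open and become ancestors.
                stack.append(name)
    if stack:
        raise ValueError(f"unclosed element <{stack[-1]}>")
```

Because the ancestors live in an ordinary list, a lookup such as
nearest_ancestor(stack, "section") is a simple loop, and nesting depth is
limited by memory rather than by the call stack.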

>It would be interesting to do some benchmarks on various parsers out there
>to measure performance.  The Java parsers I've tested (Sun, IBM) are _dog_
>slow compared to expat, etc.  For server-side I don't think that matters,
>since in the corporate scene people tend to just add more
>servers/infrastructure and not worry about performance.
>

In our own defense, our second generation parsers are not dog slow compared
to anyone, though they are still slower than Expat. But there are other
reasons than pure sloth for this. Large companies like us have to consider
extensibility and the ability to serve many masters. So we cannot write a
highly compact parser that can fly but not be easily extensible. We have to
support gazillions of encodings, and many different configurations. When
the blessed schema arrives, we have the architecture to handle it without
a rewrite and without affecting our customers. That's extremely
important and it has a certain amount of associated cost.

If you can write a parser that's as fast as Expat and as flexible as ours,
I'm sure you'll be a hero to your people, and I'll buy you a six pack of
your favorite import.



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



