half-baked parsers vs binary XML

David Megginson david at megginson.com
Mon Mar 29 13:27:33 BST 1999

Gabe Beged-Dov writes:

 > Another reason (other than the binary XML thread) that I brought
 > this up was discussion on the perl-xml mailing list of whether
 > XML::Parser was usable for soft real-time server side
 > processing. The consensus there seems to be no.

The speed bottleneck, however, is Perl, not Expat: if you were working
from a different kind of input, it would still take just as long to
execute the Perl handlers for the start and end of each element, etc.

In other words, it's not the XML *input* that you need to optimize,
but the *output* -- for example, if you have a Perl script that
renders XML in HTML, the best speed optimization is to cache the
result and serve it again for any request with the same parameters.

The XML/SGML processing model is generally to walk through a document
(as a collection of events or as a tree) and fire off handlers for
different types of things.  Even a short to medium-length XML document 
can cause the handlers to be fired off many thousands of times, and if 
you're trying to handle hundreds of requests per second, that's going
to cause problems with or without XML.
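
To get a feel for how often handlers fire, here is a small sketch using
Python's binding to Expat (xml.parsers.expat) instead of XML::Parser. The
function count_handler_calls and the 1000-item sample document are made up
for illustration; the handlers do nothing except count their own
invocations.

```python
import xml.parsers.expat


def count_handler_calls(xml_text: str) -> dict:
    """Parse xml_text with Expat and count how often each handler fires."""
    counts = {"start": 0, "end": 0, "chars": 0}

    def on_start(name, attrs):
        counts["start"] += 1

    def on_end(name):
        counts["end"] += 1

    def on_chars(data):
        # Expat may deliver one text run in several pieces, so this
        # count is a lower bound on calls per text node, not exact.
        counts["chars"] += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_chars
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return counts


# A fairly small document: one root element holding 1000 short items.
doc = "<list>" + "<item>x</item>" * 1000 + "</list>"
counts = count_handler_calls(doc)
total_calls = sum(counts.values())
```

Even this small document triggers more than three thousand handler calls
per parse, and each one is a trip through the scripting language's
function-call machinery regardless of where the events came from.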

In some cases, the query processing model might help things,
especially if the query code is moved into C or C++.

All the best,


David Megginson                 david at megginson.com

