XML tools and big documents

Don Park donpark at quake.net
Thu Sep 3 02:48:13 BST 1998

>For my implementation, for ot.xml (a 4 meg document) only about 1-2 megs of
>memory is used to store the 4 meg file in RAM due to all Names being cached at the
>parser level.  It also takes only 10-12 seconds with a P-120 running

My test results were from running on an Atari 800 (just kidding <g>).  My test
machine is a Pentium-133 with JDK 1.1.6 and the JIT enabled.  Building a DOM is a
slow process, but I am investigating intermediate forms that cut down DOM
loading time drastically.

>JIT for JDK 1.2 b4 to build the entire DOM tree.  For spitting out the DOM
>(and normalizing all the Text nodes) it takes about 15-20 seconds of which
>seconds is spent normalizing text nodes and most of the rest of this time
>is actually spent in a brute force search and replace method that scans all
>character data and attribute values and replaces any occurrences of entity
>values with entity names.  This can be very expensive but I know no other way
>around it.

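Why are you normalizing text nodes before writing them out?  Also, blindly
replacing entity values with entity names is error-prone: any text that
happens to match an entity's value gets replaced, whether or not it came
from a reference to that entity.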
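
To illustrate the failure mode, here is a minimal sketch in modern Java.  It
is not code from either implementation discussed in this thread: the class
name, both method names, and the "co" entity are hypothetical.  blindReplace
mimics a brute-force search and replace over entity values, and escapeMarkup
shows one possible alternative that escapes only the characters that are
markup-significant in character data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EntityReplaceSketch {

    // Hypothetical "blind" approach: every occurrence of an entity's
    // replacement text is rewritten as a reference to that entity,
    // regardless of where the text originally came from.
    public static String blindReplace(String text, Map<String, String> entities) {
        String out = text;
        for (Map.Entry<String, String> e : entities.entrySet()) {
            out = out.replace(e.getValue(), "&" + e.getKey() + ";");
        }
        return out;
    }

    // Assumed alternative: escape only '&', '<' and '>' when writing
    // character data, and leave everything else untouched.
    public static String escapeMarkup(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> entities = new LinkedHashMap<>();
        entities.put("co", "Inc"); // hypothetical <!ENTITY co "Inc">

        // "Income" never came from &co;, yet it is rewritten too.
        System.out.println(blindReplace("Income from Acme Inc", entities));
        // prints: &co;ome from Acme &co;

        System.out.println(escapeMarkup("AT&T <rocks>"));
        // prints: AT&amp;T &lt;rocks&gt;
    }
}
```

The first call shows the word "Income" being corrupted because it happens to
contain the entity's value.  Preserving real entity references on output would
presumably require remembering where each reference occurred at parse time,
which a plain search and replace cannot recover.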

Don Park
