XML tools and big documents

Thu Sep 3 02:48:13 BST 1998

>For my implementation, for ot.xml (a 4 meg document) only about 1-2 megs of
>RAM is used to store the 4 meg file in RAM due to all Names being cached at
>the parser level.  It also takes only 10-12 seconds with a P-120 running
>Symantec's

My test results were from running on an Atari 800 (just kidding <g>).  My test
machine is a Pentium-133 with JDK 1.1.6 with the JIT enabled.  Building a DOM
is a slow process, but I am investigating intermediate forms that cut DOM
loading time drastically.
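
To illustrate the name caching mentioned in the quoted text, here is a minimal
sketch of one way a parser could share a single String instance per distinct
element or attribute name.  The class NameCache and its methods are
hypothetical names chosen for this example; this is not the code of either
implementation discussed here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, assumed for illustration only.
final class NameCache {
    // Maps each distinct name to the one instance handed out for it.
    private final Map<String, String> names = new HashMap<>();

    // Returns the canonical instance for this name, storing it on first sight.
    String intern(String name) {
        String cached = names.get(name);
        if (cached == null) {
            names.put(name, name);
            cached = name;
        }
        return cached;
    }

    // Number of distinct names seen so far.
    int size() {
        return names.size();
    }
}
```

A document with many elements but few distinct names then holds only one
String per name, however many nodes refer to it.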

>JIT for JDK 1.2 b4 to build the entire DOM tree.  For spitting out the DOM
>tree (and normalizing all the Text nodes) it takes about 15-20 seconds of
>which 5 seconds is spent normalizing text nodes and most of the rest of this
>time is actually spent in a brute force search and replace method that scans
>all character data and attribute values and replaces any occurrences of
>entity values with entity names.  This can be very expensive but I know no
>other way around it.

Why are you normalizing text nodes before writing them out?  Also, blindly
replacing entity values with entity names is error-prone.
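
A small sketch of the kind of error I mean, under assumed names: the class
EntityWriteSketch, the entity "co" and its value "Inc" are all made up for
this example.  blindReplace mimics the brute-force approach described in the
quoted text; escapeText is one possible alternative that only escapes the
characters that are markup-significant in character data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class, assumed for illustration only.
final class EntityWriteSketch {
    // Replaces every occurrence of each entity's value with a reference to it.
    // The map goes from entity name to entity value.
    static String blindReplace(String text, Map<String, String> entities) {
        String out = text;
        for (Map.Entry<String, String> e : entities.entrySet()) {
            out = out.replace(e.getValue(), "&" + e.getKey() + ";");
        }
        return out;
    }

    // Escapes only '&', '<' and '>' and leaves all other text untouched.
    static String escapeText(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds the made-up entity table used in the usage note.
    static Map<String, String> sampleEntities() {
        Map<String, String> entities = new LinkedHashMap<>();
        entities.put("co", "Inc");
        return entities;
    }
}
```

With the sample table, blindReplace("Include Acme Inc", ...) rewrites the
word "Include" as well as the intended "Inc", because the scan cannot tell
where an entity reference originally stood.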

Don Park
Docuverse

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/