Parsing XML->DOM and XSL querying optimising

Fri Mar 19 06:46:38 GMT 1999

Hi all,

I have an application that consists of 140+ XML documents, roughly 100k
bytes each, that I want to be able to query (using XSL pattern matching at
present) and output in XML/HTML and RTF formats. This should happen in real
time (if at all possible).

Additionally, I'd like to be able to search/query the entire repository of
documents and return a composite XML/HTML or RTF document from these.
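
A rough sketch of what building such a composite document could look like,
assuming Python's standard xml.dom.minidom and a plain tag-name lookup as a
stand-in for a real XSL pattern. The build_composite function, the "results"
and "hit" element names, and the "source" attribute are all hypothetical
names made up for illustration.

```python
from xml.dom import minidom


def build_composite(doms, tag, root_name="results"):
    """Collect every <tag> element from the given DOMs into one new document.

    `doms` maps a document name to an already parsed minidom Document.
    The tag-name match is only a stand-in for an XSL pattern.
    """
    impl = minidom.getDOMImplementation()
    composite = impl.createDocument(None, root_name, None)
    root = composite.documentElement
    for name, dom in doms.items():
        for element in dom.getElementsByTagName(tag):
            # Wrap each hit so the composite records which document it came from.
            wrapper = composite.createElement("hit")
            wrapper.setAttribute("source", name)
            # importNode makes a deep copy owned by the composite document.
            wrapper.appendChild(composite.importNode(element, True))
            root.appendChild(wrapper)
    return composite


# Two tiny in-memory documents standing in for the real repository.
sample_doms = {
    "a.xml": minidom.parseString("<doc><title>One</title></doc>"),
    "b.xml": minidom.parseString("<doc><title>Two</title><title>Three</title></doc>"),
}
composite = build_composite(sample_doms, "title")
print(composite.toxml())
```

The composite is itself a DOM, so it could be serialised as XML directly or
handed on to whatever produces the HTML or RTF output.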

At the moment, I'm experimenting with the DOM parser in Python and finding
that a DOM parse takes about 4 seconds, whilst an XSL query takes about 1.8
seconds.
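
For anyone wanting to reproduce this kind of measurement, here is a minimal
timing sketch. It assumes Python's standard xml.dom.minidom and uses
getElementsByTagName as a stand-in for the XSL query, so the numbers it
prints are not comparable to the figures above. The time_parse_and_query
name and the synthetic document are made up for illustration.

```python
import time
from xml.dom import minidom


def time_parse_and_query(xml_text, tag):
    """Return (parse seconds, query seconds, hit count) for one document."""
    start = time.perf_counter()
    dom = minidom.parseString(xml_text)
    parsed = time.perf_counter()
    hits = dom.getElementsByTagName(tag)  # stand-in for an XSL pattern match
    queried = time.perf_counter()
    return parsed - start, queried - parsed, len(hits)


# Synthetic document; a real test would read one of the actual files.
sample = "<doc>" + "<item>some text</item>" * 5000 + "</doc>"
parse_s, query_s, count = time_parse_and_query(sample, "item")
print("parse: %.3fs  query: %.3fs  hits: %d" % (parse_s, query_s, count))
```

Timing the two steps separately shows how much of each request is spent on
parsing, which is the part that caching the DOM would avoid repeating.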

I reckon that a user could wait the 1.8 seconds for a query, but might start
to get fidgety after almost 6 seconds (how transient we are!).

What strategies have people got for limiting the DOM parsing time?

My own thought is to load all 140 documents at server start-up, parse them
into DOM[0]...DOM[139], and keep them in memory. A simple query would then
hit a single DOM object, and a full query across all XML documents would
run over all the DOM objects in turn.
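
A minimal sketch of this parse-once idea, assuming Python's standard
xml.dom.minidom, a hypothetical "docs" directory holding the files, and a
tag-name lookup standing in for the XSL query. The function names
load_repository, query_one and query_all are invented for illustration.

```python
import glob
import os
from xml.dom import minidom


def load_repository(directory):
    """Parse every .xml file in `directory` once and keep the DOMs in memory."""
    doms = {}
    for path in sorted(glob.glob(os.path.join(directory, "*.xml"))):
        doms[os.path.basename(path)] = minidom.parse(path)
    return doms


def query_one(doms, name, tag):
    """Simple query: elements named `tag` in a single cached document."""
    return doms[name].getElementsByTagName(tag)


def query_all(doms, tag):
    """Full query: (document name, element) pairs across every cached DOM."""
    return [
        (name, element)
        for name, dom in doms.items()
        for element in dom.getElementsByTagName(tag)
    ]


if __name__ == "__main__":
    # "docs" is a hypothetical location; at server start-up this runs once.
    repository = load_repository("docs")
    print("cached %d documents" % len(repository))
```

The parse cost is paid once at start-up, so each request only pays for the
query. The trade-off is memory: a DOM tree is usually several times larger
than the source text, so holding 140 of them is worth measuring first.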

Is this sensible? practical? stupid?

any thoughts on this would be appreciated,
cheers,
tone.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)