Distributed DOM implementations

Jerome McDonough jmcdonou at library.berkeley.edu
Thu Jan 28 18:00:28 GMT 1999


At 04:20 PM 1/27/1999 -0800, Pavel Velikhov wrote:
> I have built a prototype of an XML database system, but I have found
> out that the XML parser is the chief performance bottleneck (DOM is my
> "storage model"). I.e., to query a document I have to parse all of it,
> even though the query may concern only a tiny fragment of it. Instead,
> if the document is pre-parsed and stored in some efficient
> representation (either a main-memory tree or some hierarchical
> representation on disk) and is available via DOM commands, I would
> save a lot of time.
>

This is very similar to the approach we're taking to distributing archival
documents in the Making of America II project at Berkeley; we pre-parse
the documents and extract information from them using DOM into Java objects
which are then serialized and written to disk.  The objects provide a somewhat
higher-level interface for delivering information to clients.  It does
indeed make things a good deal
faster.
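
The general pattern can be sketched roughly as follows.  This is only a
minimal illustration and not the project's actual code: the class names
(PreParsedStore, DocSummary) and the element names "title" and "page" are
hypothetical, and it assumes a current JDK with the standard javax.xml.parsers
DOM parser.  The document is parsed once, the values of interest are copied
into a plain Serializable object, and later requests read that object back
from disk without touching the XML parser.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class PreParsedStore {

    // Hypothetical summary object: holds only the values clients ask for.
    static class DocSummary implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title;
        final List<String> pageLabels;

        DocSummary(String title, List<String> pageLabels) {
            this.title = title;
            this.pageLabels = pageLabels;
        }
    }

    // Parse the XML once with DOM and copy the interesting values out.
    // The element names "title" and "page" are assumptions for this sketch.
    static DocSummary extract(String xml) throws Exception {
        Document dom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        NodeList titles = dom.getElementsByTagName("title");
        String title = titles.getLength() > 0 ? titles.item(0).getTextContent() : "";

        List<String> pages = new ArrayList<>();
        NodeList pageNodes = dom.getElementsByTagName("page");
        for (int i = 0; i < pageNodes.getLength(); i++) {
            pages.add(pageNodes.item(i).getTextContent());
        }
        return new DocSummary(title, pages);
    }

    // Write the extracted object to disk using standard Java serialization.
    static void save(DocSummary summary, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(summary);
        }
    }

    // Read the object back; no XML parsing happens on this path.
    static DocSummary load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (DocSummary) in.readObject();
        }
    }
}
```

With something like this, extract() and save() run once when a document is
loaded, and each later query calls only load().  The saving comes from
skipping the parse; whether it is worth it depends on how often a document
is read relative to how often it changes.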





Jerome McDonough -- jmcdonou at library.Berkeley.EDU  |  (......)
Library Systems Office, 386 Doe, U.C. Berkeley     |  \ *  * /
Berkeley, CA 94720-6000    (510) 642-5168          |  \  <>  /
"Well, it looks easy enough...."                   |   \ -- /  SGNORMPF!!!
         -- From the Famous Last Words file        |    ||||

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
