XML parsing memory overhead concerns

Richard Tobin richard at cogsci.ed.ac.uk
Fri Dec 17 15:50:52 GMT 1999


Though James Clark has explained how you can do what you want with
expat, you might be interested in LT XML.  LT XML was originally
written to handle large natural-language corpora, some of which are
several gigabytes.  It allows you to read "bits" sequentially, and
when you find a start tag that you are interested in, easily fill in
the tree rooted there.  It can also validate if desired (validation
requires memory to store a table of IDs and IDREFs, but still works
without building a tree).
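
For readers who want to see the general pattern, here is a rough sketch of "read sequentially, build a tree only under the start tags you care about". It is not LT XML's API; it uses Python's standard xml.etree.ElementTree.iterparse as a stand-in, and the function name iter_subtrees and the sample element names (corpus, entry, w) are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_subtrees(source, tag):
    """Yield each complete element named `tag`; discard content outside them.

    Hypothetical helper for illustration; not part of LT XML or expat.
    """
    depth = 0  # > 0 while we are inside a subtree we want to keep
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            # Begin tracking at an interesting start tag, and keep counting
            # nested elements so we know when that subtree is finished.
            if depth > 0 or elem.tag == tag:
                depth += 1
        elif depth > 0:
            depth -= 1
            if depth == 0:
                # The subtree rooted at the interesting tag is now complete.
                yield elem
                # The caller must be done with it before asking for the next.
                elem.clear()
        else:
            # Element outside any wanted subtree: drop its text and children.
            # Note: the emptied element itself stays attached to its parent
            # until the parent ends, so memory use is reduced, not constant.
            elem.clear()


# Small in-memory document standing in for a large corpus file.
sample = io.BytesIO(
    b"<corpus><meta>x</meta>"
    b"<entry id='1'><w>a</w><w>b</w></entry>"
    b"<skip><w>z</w></skip>"
    b"<entry id='2'><w>c</w></entry></corpus>"
)

for entry in iter_subtrees(sample, "entry"):
    print(entry.get("id"), [w.text for w in entry.iter("w")])
```

Only one "entry" subtree is fully populated at a time, which is the point of the approach for multi-gigabyte input. Validation is not covered by this sketch; ElementTree does not validate.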

-- Richard





