Well-formedness checker available
James Clark
jjc at jclark.com
Mon Dec 1 01:01:07 GMT 1997
I've enhanced my XML tokenizer to support multiple encodings and to
provide enough functionality that it can be used as the basis of
high-performance full XML processors. As proof of this, I've written a
well-formedness checker (xmlwf) on top of the tokenizer.
The main design goal was performance. On my portable (a 133MHz Pentium
running Windows NT), it can check Jon's 3.7MB ot.xml file in about
0.5 seconds (this compares to about 8 seconds for nsgmlsu and about
2 seconds for RXP on the same system). It seems to be about 15% slower
than the original tokenizer. On the other hand, the size of the source
and object code has increased a lot. The source has also got a lot
hairier.
The source code (in ANSI C) and Win32 binaries are available at:
ftp://ftp.jclark.com/pub/test/xmltok.zip
This is an alpha release. The only documentation is what you're reading
now.
To use the well-formedness checker, just give xmlwf one or more
filenames, and it will check that each one is a well-formed XML document
entity. There's a -g option which tells it to check instead that each
file is a well-formed XML external general text entity.
James