Basic Question

James Tauber jtauber at
Fri Mar 12 16:45:04 GMT 1999

-----Original Message-----
From: Dan Rudman <rudman at>
>With the wealth of XML libraries available, I am more and more inclined to
>make use of these libraries to help me create, parse, and utilize my own
>markup language to be embedded within an HTML document.  My understanding of
>XML at this point is that it must be well-formed or a fatal error occurs.

Yes, this is correct.
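
As a rough illustration of what "fatal error" means in practice, here is a
minimal sketch using Python's standard xml.etree.ElementTree parser. The
helper name is_well_formed and the sample strings are made up for this
example; any conforming XML parser should reject the first string in a
similar way.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the XML parser accepts text, False on a fatal error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # A well-formedness violation stops the parse; no tree is returned.
        return False
    return True


html_style = "<p>one<br>two</p>"   # typical HTML: <br> is never closed
xml_style = "<p>one<br/>two</p>"   # same content written as well-formed XML

print(is_well_formed(html_style))
print(is_well_formed(xml_style))
```

The parser does not try to recover from the unclosed <br>; it simply refuses
the whole document, which is why ordinary HTML cannot be fed to it directly.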

>If this is the case, how can I deal with the fact that most HTML documents
>are NOT well-formed and that most HTML design tools do not enforce,
>or even sometimes support, well-formedness in a document?

You might try Tidy as the initial step. Tidy can take bad HTML and spit out
XML that could then be parsed by any XML parser.
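
A sketch of that two-step pipeline, assuming a tidy executable is installed
and on the PATH. The function names tidy_to_xml and parse_html are
hypothetical, and the command-line options shown (-asxml, -numeric, -utf8,
-quiet, --show-warnings) are the ones I would expect HTML Tidy to accept;
check them against the version you have.

```python
import shutil
import subprocess
import xml.etree.ElementTree as ET


def tidy_to_xml(html, tidy_cmd="tidy"):
    """Pipe html through HTML Tidy and return its well-formed XML output."""
    result = subprocess.run(
        # -asxml: emit well-formed XML; -numeric: numeric character
        # references instead of named entities the XML parser may not know.
        [tidy_cmd, "-asxml", "-numeric", "-utf8", "-quiet",
         "--show-warnings", "no"],
        input=html,
        capture_output=True,
        text=True,
        encoding="utf-8",
    )
    # Tidy is expected to exit with 0 (clean), 1 (warnings) or 2 (errors).
    if result.returncode > 1:
        raise RuntimeError("tidy failed: " + result.stderr)
    return result.stdout


def parse_html(html):
    """Clean up html with Tidy, then hand the result to an XML parser."""
    return ET.fromstring(tidy_to_xml(html))


# Only run the demo when Tidy is actually available on this machine.
if shutil.which("tidy"):
    root = parse_html("<title>t</title><p>one<br>two")
    print(root.tag)
```

Your own markup would still need to survive the cleanup step, so it is worth
checking how Tidy treats elements it does not recognise before relying on
this.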


Hope this helps.

James Tauber / jtauber at /
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia

Full-day XML Tutorial @ WWW8

