HTML to XML

Peter Murray-Rust peter at ursus.demon.co.uk
Thu Jul 16 17:13:30 BST 1998


At 15:32 16/07/98 +0100, Michael Kay wrote:
>>PMR> The SwingSet (from com.sun.java) has HTML functionality. I'm not sure
>>exactly what, but it can read in HTML and render it.
>
>Good thinking. I've had a look at the Swing source. It
>includes a parser (html32.java) generated using the Java
>compiler-compiler JavaCC. This calls a callback interface
>HTMLParserCallback.java, similar in concept to SAX, though
>it seems to include both generic (start/end element) and
>element-specific (e.g. startUL) callbacks. Of course the
>main difference from a SAX application will be that the
>elements are not properly nested.
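
[A minimal sketch of the callback style described above. It assumes the parser classes that later shipped in javax.swing.text.html (HTMLEditorKit.ParserCallback and ParserDelegator), not the html32.java / com.sun.java source examined here, which may well differ. The class name TagLogger and its parse helper are hypothetical.]

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical SAX-like listener: records each generic callback as a string.
class TagLogger extends HTMLEditorKit.ParserCallback {
    final List<String> events = new ArrayList<>();

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        events.add("start " + tag);
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        events.add("end " + tag);
    }

    @Override
    public void handleText(char[] data, int pos) {
        events.add("text " + new String(data));
    }

    // Parse an HTML string and return the events in the order they were reported.
    static List<String> parse(String html) {
        TagLogger logger = new TagLogger();
        try {
            // Third argument: ignore any charset directive found in the input.
            new ParserDelegator().parse(new StringReader(html), logger, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return logger.events;
    }

    public static void main(String[] args) {
        for (String event : parse("<P><EM>Hello</EM> world</P>")) {
            System.out.println(event);
        }
    }
}
```

[Printing the events is a quick way to check whether start and end callbacks arrive balanced for a given input, which is the nesting question raised above.]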

I assume that *after* it has parsed the HTML there is something isomorphic
to a DOM inside and this must have properly nested elements. [I appreciate
that really bad HTML may give erroneous results, but that is not our
problem.] It should then be possible to extract this as xHTML and:
(a) mix it with other XML
(b) render it in the Swing Component

The other thing I'd like to do is generate HTML as a string, e.g.
String s = "<P><EM>Hello</EM> world</P>";
and pass s to the renderer. Any idea how?
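
[One possible answer, sketched under the assumption that the later javax.swing API is available: JEditorPane accepts a content type and a string directly. Whether the 1998 SwingSet offers the same constructor is not something this sketch can confirm. HtmlStringDemo and htmlPane are hypothetical names.]

```java
import javax.swing.JEditorPane;
import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.SwingUtilities;

// Hypothetical demo: hand an HTML string straight to a Swing component for rendering.
class HtmlStringDemo {
    // Build a read-only pane whose content is the given HTML string.
    static JEditorPane htmlPane(String html) {
        JEditorPane pane = new JEditorPane("text/html", html);
        pane.setEditable(false);
        return pane;
    }

    public static void main(String[] args) {
        String s = "<P><EM>Hello</EM> world</P>";
        // Swing components should be created and shown on the event dispatch thread.
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("HTML string demo");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.add(new JScrollPane(htmlPane(s)));
            frame.setSize(300, 150);
            frame.setVisible(true);
        });
    }
}
```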

This would then give us part of an XML2HTML renderer in Swing.

	P.

BTW I am making progress with JUMBO2beta. I have had a moderate amount of
feedback, all of which was very useful. I am really keen that we use Swing
to enhance our functionality here. Also, someone told me yesterday that the
next versions of Swing were promised to be much faster and better...RSN

	P.

>
>Mike Kay
>
>
>xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
>Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
>To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
>(un)subscribe xml-dev
>To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
>subscribe xml-dev-digest
>List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
>
>
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg
