Peter Murray-Rust Peter at ursus.demon.co.uk
Sat Jun 14 11:05:44 BST 1997

In message <9706140601.AA10887 at MIT.MIT.EDU> Hyung-Jin Kim writes:
> I'm new to this list so I apologize if this question has been answered already:

We have not had enough discussion about HTML on this list - and I, for one,
would like to see XMLised version(s) of the HTML DTDs and documents.

> I was wondering if anyone knew of a parser that made well-formed XML files
> from HTML files.  I know of a few tools that can DETECT mal-formed tags in
> HTML (e.g. weblint) but is there a tool that will do the conversion?
> Thanks!  Please reply directly to me.

Mal-formed HTML (i.e. non-conforming SGML) is outside the scope of this list
:-).  However, converting legal HTML (i.e. conforming SGML) to XML is a valid
activity and it could be useful to get feedback.  It normally requires a
DTD (for example, <!Element body o o (%body.content;)> means that the <BODY>
start and end tags may both be omitted).  There is also the question of what
to do with EMPTY
tags such as <HR>.  Does it matter if they are rendered as
<HR/>
or, say
<HR>
</HR>
I convinced myself that it did, in that the first has no child, while the
second could have a PCDATA child of value "\n" - at least in WF documents?
What is the child's value in <HR></HR>?
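
As a rough illustration of the child-node question above, here is a minimal
Python sketch using the standard library's xml.dom.minidom (a much later DOM
parser, not a tool from this thread).  The helper name text_children is made
up for the example, and what any other XML processor passes to an application
may differ.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def text_children(xml_source):
    """Return the character data of each text node directly under the root.

    text_children is a hypothetical helper for this illustration only.
    """
    root = parseString(xml_source).documentElement
    return [
        child.data
        for child in root.childNodes
        if child.nodeType == Node.TEXT_NODE
    ]


# Three renderings of an EMPTY element such as HR.
empty_element_tag = text_children("<HR/>")
adjacent_tags = text_children("<HR></HR>")
tags_with_newline = text_children("<HR>\n</HR>")

print(repr(empty_element_tag))
print(repr(adjacent_tags))
print(repr(tags_with_newline))
```

With this particular parser, <HR/> and <HR></HR> both yield no text child,
while the form with a line break between the tags yields a single text child
of value "\n".  That suggests a converter should emit one of the first two
forms for EMPTY elements if the result is later checked against a DTD.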

Could someone more authoritative give an overview of the XML-isation of HTML?
I need HT(X)ML to provide the text sections for CML...


Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)
