Transition to XML (was Re: Opportunities for XML-DEV)

Lisa Rein lisarein at
Fri Sep 18 19:14:16 BST 1998

Simon St. Laurent wrote:

> Cleaning up the top pages of
> my site took about 10 minutes.  Other pages will require more work - yes, I
> plan to go back and fix everything except perhaps the 100+ XSchema fragments.

It took me WEEKS just to clean up the (pretty horrid) HTML on my little
site -- but now it is at least valid HTML 4.0.  I feel that an
important first step is to get everyone to at least well-formed
HTML, and THEN start pushing for the XML transition.

The XML transition itself will be more easily accomplished from a web of
well-formed HTML anyway.  In fact, if I have my druthers, much of it will
be automatable.

For now, remember that web documents that are not well-formed are
near-useless to machines.  So it just depends on whether you want the
information on your site -- all of it -- to be machine-accessible -- and
that's not even getting into real accessibility issues (WAI, ICADD,
etc.) -- many of which also depend on such well-formedness.

Many sites that I have been trying to access with XML-enabled apps are
dead-ends due to non-well-formed pages.  To build an XML directory, for
example, I would like to be able to index at the very least the websites
of the members of this list -- but when I tried, many were
inaccessible ;-(
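
As a rough sketch of the kind of check an indexer could run before
accepting a page (Python here; the function name and the sample pages are
hypothetical, and this is not the tool referred to above), a page can
simply be handed to an XML parser and rejected if parsing fails:

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """Return True if markup parses as well-formed XML, else False."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True

# Hypothetical pages keyed by made-up names, standing in for fetched sites.
pages = {
    "site-a": "<html><body><p>Hello<br/></p></body></html>",
    "site-b": "<html><body><p>Hello<br></body></html>",  # unclosed <br> and <p>
}

# Keep only the pages an XML-based indexer could actually read.
indexable = sorted(name for name, text in pages.items() if is_well_formed(text))
print(indexable)
```

Note that a check like this is strict: a page using HTML entities such as
&nbsp; without a DTD that declares them would also be rejected.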

So then I am forced to either "scrape" the data off of your sites, and
regenerate versions of it that can be indexed -- which is another pain
(shame on you all ;-) -- or keep looking for other well-formed sites --
a frustrating search which usually just leads me back to my own site.
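
For illustration only, the "scrape and regenerate" route might look
something like the following Python sketch.  It assumes the standard
library's html.parser is forgiving enough to read the page; the class
name, the helper function, and the sample input are all hypothetical.  It
only aims at well-formed output: unclosed elements are closed where their
parent closes, so the result may nest elements in ways HTML 4.0 would not
accept as valid, and comments and DOCTYPE declarations are dropped.

```python
from html import escape
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never have content in HTML 4.0; written out as <tag/>.
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}

class Regenerator(HTMLParser):
    """Hypothetical helper: re-emit sloppy HTML with every element closed."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []     # pieces of regenerated markup
        self.stack = []   # names of elements currently open

    def handle_starttag(self, tag, attrs):
        # Quote every attribute; a bare attribute gets its own name as value.
        attr_text = "".join(
            ' {}="{}"'.format(name, escape(name if value is None else value, quote=True))
            for name, value in attrs
        )
        if tag in VOID:
            self.out.append("<{}{}/>".format(tag, attr_text))
        else:
            self.out.append("<{}{}>".format(tag, attr_text))
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID or tag not in self.stack:
            return  # stray close tag: drop it
        # Close anything left open inside the element being closed.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</{}>".format(open_tag))
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def result(self):
        self.close()  # flush any buffered text
        while self.stack:
            self.out.append("</{}>".format(self.stack.pop()))
        return "".join(self.out)

def regenerate(html_text):
    parser = Regenerator()
    parser.feed(html_text)
    return parser.result()

sloppy = "<HTML><BODY><P>Fish &amp; chips<BR><IMG SRC=a.gif><P>Second</BODY></HTML>"
clean = regenerate(sloppy)
root = ET.fromstring(clean)  # raises ParseError if the output is not well-formed
print(root.tag)
```

A sketch like this does not cope with everything (text outside the root
element or duplicate attribute names would still break the output), which
is part of why regenerating other people's pages is a pain.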

Simon's other point (about the need to start migrating ourselves) hits a
sore spot with me because I've been getting called on this lately --
it's pretty embarrassing when someone calls you on not practicing what
you preach -- especially with what some still view (unfortunately) as
the "religion" of XML.

Most HTML programmers can understand the sacredness of well-formed docs
(although they might not get the value of syntax preservation in
general).  I'm not sure why the well-formedness thing gets across, but
it does.  We can use this to everyone's advantage.

So soon I will be releasing my own little indexing system for my own
little site (well-formed HTML pages at first, not XML -- and then
mirrored XML versions of those same pages).  I would love to
integrate the content of others -- I've just grown tired of looking for
well-formed pages.

So if your site has well-formed pages -- and you'd like to be included
in a little experiment -- let me know privately.  


