why distinctions within XHTML?

Mark Birbeck Mark.Birbeck at iedigital.net
Mon Aug 30 23:25:06 BST 1999


Simon St.Laurent wrote:
> Mark was suggesting earlier that those namespaces would 
> provide information connected to schemas.  This does not
> appear to be a good assumption, which is most of what I was
> saying.

Simon - like you, I am "outside looking in", and you're right that it
can be frustrating. However, my reason for making this assumption stems
from the following line of reasoning:

 1. The biggest barrier to the uptake of XML will not be the
    popularity or competence of W3C members (just kidding)
    but the ability to convert legacy data.
 2. The easiest way to handle legacy data is to convert it to
    some simple XML, and then take advantage of XML techniques
    to make it 'nicer'.
 3. Most legacy data either exists in, or can be easily converted
    to, HTML. (It sits behind web servers.)
 4. The quickest initial transformation then, is to get HTML into
    an XML format that looks pretty much like HTML.
 5. This can then be tidied up into more meaningful tagged data
    (one of the goals of XML lest we forget!).
 6. There are three variants of HTML 4.0 so we need three variants
    of 'HTML 4.0 as XML' (let's call it XHTML).
 7. XHTML will be around for a little while in these variants as
    browsers catch up with this evolution.
 8. Eventually these variants will give way to a number of modules
    that handle the different features of HTML, such as a code
    module, a table module, an image module, and so on.
 9. Once 'pure' XML documents are being sent to browsers then we
    will want to mix and match other XML data with the display
    information that is XHTML. This may mean putting XHTML inside
    other documents, or other documents inside XHTML.
10. Current DTDs cannot handle this, but XML Schema-type solutions
    will be able to.
11. In the short term we therefore need a schema and a DTD for each
    variant of XHTML.
12. But in the long run we will have schemas for each module.

Note that if you are not mixing documents you do not need XHTML at all,
unless you plan to store the document in a system that requires it to be
validated. You can *produce* XHTML from your XSL transformations, but
who is going to check it? No current browser does.

VALIDITY TODAY
So for me, the validity issue *for today* comes when I want to take
legacy data and make it more meaningful (as per point 5, above). Say I
have a web page from a client's intranet that has a list of all their
offices, and I want to convert that to XML. For example, I want to go
from:

<TABLE>
    <TR><TD>Office 1</TD><TD>London</TD></TR>
    <TR><TD>Office 2</TD><TD>Birmingham</TD></TR>
    <TR><TD>Office 3</TD><TD>Glasgow</TD></TR>
</TABLE>

to:

<Offices>
    <Office>
        <Name>Office 1</Name>
        <City>London</City>
    </Office>
    <Office>
        <Name>Office 2</Name>
        <City>Birmingham</City>
    </Office>
    <Office>
        <Name>Office 3</Name>
        <City>Glasgow</City>
    </Office>
</Offices>

without using Notepad. The HTML file has to be converted to XHTML,
validated and then transformed. Each of the following will display in IE
and Netscape, but would break my final transformation:

<TABLE>
    <TR><TD>Office 1<TD>London
    <TR><TD>Office 2<TD>Birmingham
    <TR><TD>Office 3<TD>Glasgow
</TABLE>

<TABLE>
    <TR><TD ALIGN=LEFT>Office 1</TD><TD>London</TD></TR>
    <TR><TD ALIGN="LEFT">Office 2</TD><TD>Birmingham</TD></TR>
    <TR><TD>Office 3</TD><TD>Glasgow</TD></TR>
</TABLE>

but if you automate the conversion to XHTML, each of these variants will
map to a single known 'standard' form that the transformation can rely on.
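
To make the point concrete, here is a minimal sketch in Python using the
standard xml.etree.ElementTree module. The language, the function names
(table_to_offices, is_well_formed) and the two-cells-per-row assumption
are mine, chosen for illustration only; an XSL transformation would do
the same job. It shows that the tidy table converts cleanly, while the two
browser-tolerated variants are rejected by an XML parser before any
transformation can start. Note that it only checks well-formedness, not
validity against a DTD.

```python
import xml.etree.ElementTree as ET

# The tidy table from the message: every element closed, no bare attributes.
CLEAN = """<TABLE>
    <TR><TD>Office 1</TD><TD>London</TD></TR>
    <TR><TD>Office 2</TD><TD>Birmingham</TD></TR>
    <TR><TD>Office 3</TD><TD>Glasgow</TD></TR>
</TABLE>"""

# Browser-tolerated variant 1: omitted end tags.
UNCLOSED = """<TABLE>
    <TR><TD>Office 1<TD>London
    <TR><TD>Office 2<TD>Birmingham
    <TR><TD>Office 3<TD>Glasgow
</TABLE>"""

# Browser-tolerated variant 2: an unquoted attribute value.
UNQUOTED = """<TABLE>
    <TR><TD ALIGN=LEFT>Office 1</TD><TD>London</TD></TR>
    <TR><TD ALIGN="LEFT">Office 2</TD><TD>Birmingham</TD></TR>
    <TR><TD>Office 3</TD><TD>Glasgow</TD></TR>
</TABLE>"""


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def table_to_offices(markup):
    """Turn a two-column TABLE (name, city) into an Offices element.

    Assumes each TR holds exactly two TD cells; raises ValueError if not,
    and ET.ParseError if the input is not well-formed XML.
    """
    table = ET.fromstring(markup)
    offices = ET.Element("Offices")
    for row in table.findall("TR"):
        cells = row.findall("TD")
        if len(cells) != 2:
            raise ValueError("expected exactly two TD cells per TR")
        office = ET.SubElement(offices, "Office")
        ET.SubElement(office, "Name").text = (cells[0].text or "").strip()
        ET.SubElement(office, "City").text = (cells[1].text or "").strip()
    return offices


if __name__ == "__main__":
    print(ET.tostring(table_to_offices(CLEAN), encoding="unicode"))
    for label, sample in (("unclosed", UNCLOSED), ("unquoted", UNQUOTED)):
        print(label, "well-formed?", is_well_formed(sample))
```

The design choice worth noting is that the transformation refuses bad
input outright instead of guessing, which is exactly why an automated
HTML-to-XHTML clean-up step has to come first.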

VALIDITY TOMORROW
The validity issue *for tomorrow* is the mixing of mark-up from
different XML-based languages within one document, and still being able
to check that all is OK. 
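
As a rough illustration of what "mixing" looks like in practice, the
following Python sketch parses one document that combines XHTML display
mark-up with the office data from earlier, and groups its elements by
namespace. Several things here are assumptions of mine and not part of
the discussion above: the office namespace URI is hypothetical, the XHTML
namespace URI is the one from the XHTML 1.0 Recommendation (which
postdates this message), and the helper names are invented. It only
reports which namespaces are present; it does not validate anything
against a schema, which is the part that still needs solving.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XHTML_NS = "http://www.w3.org/1999/xhtml"    # from the XHTML 1.0 Recommendation
OFFICE_NS = "http://example.com/ns/offices"  # hypothetical, for illustration

# One document carrying both display mark-up and office data.
MIXED = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:o="http://example.com/ns/offices">
  <body>
    <p>Our offices:</p>
    <o:Offices>
      <o:Office><o:Name>Office 1</o:Name><o:City>London</o:City></o:Office>
    </o:Offices>
  </body>
</html>"""


def namespace_of(tag):
    """Extract the namespace URI from an ElementTree '{uri}local' tag."""
    return tag[1:].split("}")[0] if tag.startswith("{") else ""


def elements_by_namespace(markup):
    """Count the elements in the document per namespace URI."""
    counts = Counter()
    for element in ET.fromstring(markup).iter():
        counts[namespace_of(element.tag)] += 1
    return counts


def unknown_namespaces(markup, known):
    """Return the namespace URIs used in the document but not in 'known'."""
    return set(elements_by_namespace(markup)) - set(known)


if __name__ == "__main__":
    for uri, count in elements_by_namespace(MIXED).items():
        print(uri, count)
    print("unknown:", unknown_namespaces(MIXED, [XHTML_NS]))
```

A checker for mixed documents would presumably start from a split like
this and hand each group of elements to the schema for its own language.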

Best regards,

Mark

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1