Internal subset equivalent in new schema proposals?

Michael Kay M.H.Kay at eng.icl.co.uk
Thu Nov 26 14:17:47 GMT 1998


>> > Argh. Documents are data. The dichotomy is in your head. Doesn't XML
>> > itself make this abundantly clear?...
>>
>> Documents might be data, but the dichotomy is not just in our heads.  XML
>> has a clear bias towards linear, prose-oriented verbiage.  How else to
>> explain mixed content, the significance of order, ...
>
>I find it interesting that you use the fact that XML supports things that
>other languages do not [as] evidence of XML's bias...

A document is information organised for human communication; data is
information organised for machine processing. XML can do both, but I stick
with my original claim that it is optimised for the former.

I wasn't complaining that it contains many redundant features which would
not be there in a data-oriented syntax. I have learned not to use those
features, and I try to explain patiently when people ask yet again whether
they should be using elements or attributes...

Rather, my complaint was about things that I'd like to do in the data
interchange world but can't. As Ron says, I can't do data typing in XML 1.0,
and Paul's explanation doesn't alter the fact.

The complaint in my original post was my recent discovery that the internal
DTD subset destroys many of the assumptions I have made in my applications
about the conformance of the incoming document to a schema.  Specifically:

- if I'm quite clever with my EntityResolver I can ensure that the input XML
uses a particular external DTD
- if I choose, I can ensure that my application uses a validating parser
- but I can't stop the input XML using an internal DTD subset which
overrides declarations in my external DTD, nor can I (using SAX) detect that
it has done so.
[ - so my application must be prepared for anything
  - so my application has to do full validation itself
  - so there's not much point having a DTD in the first place ... ]
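
The third point above is about what SAX exposes. As a rough sketch outside SAX, and assuming Python's xml.parsers.expat binding is acceptable, the StartDoctypeDeclHandler callback is passed a has_internal_subset flag, so an application could at least notice (and refuse) a document that carries an internal subset. The function name and the sample documents are hypothetical, for illustration only.

```python
import xml.parsers.expat as expat

# Hypothetical sample inputs: one with an internal subset, one without.
WITH_SUBSET = (
    '<!DOCTYPE price SYSTEM "price.dtd" [\n'
    '  <!ATTLIST price currency CDATA "USD">\n'
    ']>\n'
    '<price>10</price>'
)
WITHOUT_SUBSET = '<!DOCTYPE price SYSTEM "price.dtd">\n<price>10</price>'


def has_internal_subset(xml_text):
    """Return True if the document's DOCTYPE carries an internal subset."""
    flags = []
    parser = expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal):
        # expat reports a true value when an internal subset is present.
        flags.append(bool(has_internal))

    parser.StartDoctypeDeclHandler = on_doctype
    # The external DTD is not fetched here; only the DOCTYPE is inspected.
    parser.Parse(xml_text, True)
    return any(flags)
```

This only detects the presence of an internal subset; it does not say which declarations it contains, and it does nothing for an application that is tied to the SAX interface.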

Actually, the problem is not quite as bad as this: the internal DTD subset
can override constraints in my attribute declarations but not in my element
declarations. Let us be thankful for small mercies. This seems to be another
reason for using elements rather than attributes, which I will add to my
standard answer on the question: the very limited data typing available for
attributes can be overridden at the whim of the user!
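
To make the attribute case concrete, here is a minimal sketch, assuming Python's xml.parsers.expat (a non-validating parser) and a hypothetical price.dtd supplied from a string in place of a real EntityResolver. The external DTD fixes the currency attribute to "GBP", while the incoming document redeclares it in the internal subset. Because the internal subset is read first and the first declaration of an attribute is the binding one, the document's own default should win. All names and values below are invented for illustration.

```python
import xml.parsers.expat as expat

# Hypothetical external DTD the application insists on.
EXTERNAL_DTD = (
    '<!ELEMENT price (#PCDATA)>\n'
    '<!ATTLIST price currency CDATA #FIXED "GBP">\n'
)

# Incoming document that redeclares the attribute in its internal subset.
OVERRIDING_DOC = (
    '<!DOCTYPE price SYSTEM "price.dtd" [\n'
    '  <!ATTLIST price currency CDATA "USD">\n'
    ']>\n'
    '<price>10</price>'
)

# Same document without an internal subset.
PLAIN_DOC = '<!DOCTYPE price SYSTEM "price.dtd">\n<price>10</price>'


def parse_attributes(xml_text, dtd_text=EXTERNAL_DTD):
    """Parse xml_text, always resolving the external subset to dtd_text."""
    seen = {}
    parser = expat.ParserCreate()
    # Ask expat to read the external DTD subset.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)

    def on_external_entity(context, base, system_id, public_id):
        # Stand-in for an EntityResolver: ignore system_id, use our DTD.
        sub = parser.ExternalEntityParserCreate(context)
        sub.Parse(dtd_text, True)
        return 1

    def on_start(name, attrs):
        # attrs includes defaulted attributes taken from the DTD.
        seen[name] = dict(attrs)

    parser.ExternalEntityRefHandler = on_external_entity
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return seen
```

Calling parse_attributes(PLAIN_DOC) and parse_attributes(OVERRIDING_DOC) and comparing the currency value reported for price shows whether the internal subset has taken precedence over the external declaration.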

Mike Kay

