Internal subset equivalent in new schema proposals?

Thu Nov 26 17:12:50 GMT 1998

 From: Michael Kay <M.H.Kay at eng.icl.co.uk>

>A document is information organised for human communication; data is
>information organised for machine processing. XML can do both, but I stick
>with my original claim that it is optimised for the former.

PDF is a better example of information organized for optimal human
communication. A "publication" is information organized and optimised for a
particular use. A document can be published as PDF for humans, or in binary
for some computer, or in XML for both (or neither, which seems to be what
you are saying).

>Rather my complaint was about things that I'd like to do in the data
>interchange world but can't. As Ron says, I can't do data typing in XML 1.0,
>and Paul's explanation doesn't alter the fact.

XML is not a "data exchange language" but a "data markup language". More
than ASCII, less than serialization.

Attribute types are IMHO fundamentally not geared to data typing but to
marking up structures:
    1) link anchors to elements or entities or notations
            (ID, IDREF, IDREFS, ENTITY, NOTATION)
    2) link anchors that can be used for other XML names, like attribute
       values
            (NMTOKEN, NMTOKENS)
    3) err, everything else
            (CDATA)

Thus the attribute types are really only concerned with marking up other
structures within XML. Apart from CDATA, you can say that in fact all XML
attribute typing is to aid the markup of links internal to a document.
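
As a rough sketch of that grouping, a DTD fragment using each attribute type
might look like this (the element, attribute, and notation names, and the
system identifiers, are hypothetical, made up only for illustration):

    <!NOTATION tex   SYSTEM "tex">
    <!NOTATION troff SYSTEM "troff">
    <!ELEMENT section (#PCDATA)>
    <!ATTLIST section
            id      ID                     #REQUIRED
            see     IDREF                  #IMPLIED
            related IDREFS                 #IMPLIED
            figure  ENTITY                 #IMPLIED
            format  NOTATION (tex | troff) #IMPLIED
            keys    NMTOKENS               #IMPLIED
            note    CDATA                  #IMPLIED >

Only "note" takes arbitrary text; every other declaration says what kind of
name the value is, or what it points at.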

NMTOKEN can be explained both as a way to catch references to XML names and
as a way to allow tokens which do not conform to full ID syntax to be used
as ids (i.e. when people mark up a document using simple numbers as the
convention for forming the unique identifiers for elements). IMHO the fact
that NMTOKENS provides a nice way to stick other tokens in is purely a happy
accident, and not a serious attempt to provide any kind of better data
typing.
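
To illustrate the numbering case (a sketch with hypothetical element and
attribute names): an ID value must be an XML Name, which cannot begin with a
digit, whereas a name token can.

    <!ATTLIST para
            num NMTOKEN #REQUIRED >

    <para num="42">...</para>

Here "42" is a valid NMTOKEN value, but it would not be valid if num were
declared as ID. Note that the parser does not check NMTOKEN values for
uniqueness, so uniqueness remains a convention.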

So rather than saying that XML provides only limited built-in data types for
attributes, I think it is fairer to say that XML provides *ABSOLUTELY NO*
data types. The typing that is present is there for marking up structures
(not data values per se) and making connections between nodes: for internal
links, not for data typing.

Should XML provide better data types? I would say no: not in the XML 1.0
spec.

Should XML provide better support for other data typing layers (schemas)? I
would say yes: the notation idea should be clarified (I really think the SIG
and some of the WG did not have much idea what it was for when the spec was
made: in particular the relation between MIME content types and notations
should have some policy made: is a notation a name which some script uses to
key code or is the address of code which should be downloaded and run?) and
the WebSGML "DATA notation" attribute should be introduced: it lets you add
any notation name to any attribute:
    <!NOTATION iso-8601-date SYSTEM "blah blah">
    <!ATTLIST x
            z DATA iso-8601-date #REQUIRED >
I think this is needed so that attributes and elements are more
interconvertible.

>I wasn't complaining that it contains many redundant features which would
>not be there in a data-oriented syntax. I have learned not to use those
>features, and I try to explain patiently when people ask yet again whether
>they should be using elements or attributes...

If you do not use them, it is presumably because you do not have the
particular kinds of convoluted links in your data. If you find you only use
CDATA, it is a sign that your data is nicely tree-structured, with no need
for graph structures (ID/IDREF), multiple links to external resources
(ENTITY), data typing on elements (NOTATION), or references to element type
names, attribute names, or enumerations in attribute values (NMTOKEN).

These are not redundant; in fact, they are orthogonal and rational.

Rick Jelliffe

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/