Why XML data typing is hard (was Re: Internal subset equivalent in new schema proposals?)

Ketil Z Malde ketil at ii.uib.no
Fri Nov 27 14:21:30 GMT 1998

<david at megginson.com> writes:

> The real question, though, is how constraints could be enforced.
> Let's start with an extremely simple example:

>   <value xml:type="float"></value>

Now you're adding type information to the content; what I suggested
was to constrain *form*.  For one thing, I would not specify this in
the document (this is just a gut feeling, but why would you?).  I would
specify it in the DTD, e.g. like so:

	<!element value #REGEXP:"-?[0-9]*\.[0-9][0-9]">

(or some such, you get the point).  

> What are the allowed contents?

Then the document could contain

	<value>-0.01</value> or

but not

	<value>1.0</value> or

>   <value xml:type="float">1,5</value>
>   <value xml:type="float">1.5</value>

This won't be a problem if the DTD specifies what the processing
software should expect.  You could even validate processing software
to some extent. 
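
To make the idea concrete, here is a minimal sketch in Python of the
kind of check I have in mind.  It is only an illustration: the #REGEXP
declaration is made-up syntax, so the pattern lives in a hypothetical
table (CONTENT_PATTERNS) keyed by element name, and the function name
content_is_valid is likewise my own invention.  The pattern is the one
from the declaration, with the dot escaped so that it means a literal
decimal point.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the DTD declaration: element name -> allowed form.
# The dot is escaped so that it matches only a literal decimal point.
CONTENT_PATTERNS = {
    "value": re.compile(r"-?[0-9]*\.[0-9][0-9]"),
}


def content_is_valid(xml_text):
    """Return True if the element's text matches the pattern for its name."""
    element = ET.fromstring(xml_text)
    pattern = CONTENT_PATTERNS.get(element.tag)
    if pattern is None:
        return True  # no constraint declared for this element
    # fullmatch requires the whole content to fit the pattern, not a prefix.
    return pattern.fullmatch(element.text or "") is not None


if __name__ == "__main__":
    samples = (
        "<value>-0.01</value>",
        "<value>1.0</value>",
        "<value>1,5</value>",
        "<value>1.5</value>",
    )
    for sample in samples:
        print(sample, content_is_valid(sample))
```

With this pattern only the first sample passes; the other three lack
the two digits after the decimal point, so the software downstream
never has to guess what "1,5" was supposed to mean.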

> This is a very simple example; after you've worked this out, you can
> start worrying about how to count combining characters with
> field-length restrictions, etc.

I think trying to define some set of types to be used in *all* XML
documents is taking the wrong approach.  I don't really see this as
either workable or desirable.  What would the point of using xml:type
be?  As I said, I haven't given this a lot of thought, but to me, it
seems like having elements which take multiple types that need to be
identified in attributes would be an indication of an ill-designed
document type.  (What would, in the example above, the semantics be if 
you supplied an xml:type="string" in a value field?)

Specifying subsets of #PCDATA as allowable content, however, should be
relatively simple and occasionally useful.  But hey, I'm in
telecommunications these days, not document processing :-)

If I haven't seen further, it is by standing in the footprints of giants

