Why XML data typing is hard (was Re: Internal subset equivalent in new schema proposals?)
david at megginson.com
Fri Nov 27 13:49:23 GMT 1998
Ketil Z Malde writes:
> Catching illegal values early on - in validation of the document -
> instead of relying on some obscure run-time error in some program,
> is a *feature*.
Agreed -- this is a very good choice, especially if you have human
authors.
The real question, though, is how constraints could be enforced.
Let's start with an extremely simple example:
<value xml:type="float"></value>
What are the allowed contents? Certainly, +, -, and the digits 0-9
should be allowed, as well as the letter 'e', but which of the
following should throw an error?
<value xml:type="float">1,5</value>
<value xml:type="float">1.5</value>
There are three obvious answers:
1. Both are accepted.
2. Only one is accepted, and everyone learns to use that format.
3. Only the correct one for the current locale is accepted.
Option #2 is politically unworkable (either France or the U.S. would
take up arms), and option #1 seriously weakens validation (what if an
English author had mistakenly intended to use the comma to specify a
range?). Option #3 looks OK on the surface, but it is actually the
worst of the three because it destroys interoperability: the same XML
document may be considered correct by some parsers and erroneous by
others, depending on what locale the user happened to choose.
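To make the three options concrete, here is a rough sketch in Python.
Nothing in it comes from any schema proposal: the function names, the
regular expression, and the idea of passing the decimal separator in
as a parameter are all assumptions made for illustration.

```python
import re

def _float_pattern(separators):
    # Hypothetical lexical form: optional sign, digits, optional
    # fraction introduced by one of `separators`, optional exponent.
    sep = "[" + re.escape(separators) + "]"
    return re.compile(r"[+-]?\d+(?:" + sep + r"\d+)?(?:[eE][+-]?\d+)?")

def accepts_both(text):
    # Option 1: either separator is allowed.
    return _float_pattern(".,").fullmatch(text) is not None

def accepts_fixed(text):
    # Option 2: one separator for everyone (the period, chosen arbitrarily).
    return _float_pattern(".").fullmatch(text) is not None

def accepts_locale(text, decimal_separator):
    # Option 3: the separator depends on the validating user's locale,
    # modelled here as an explicit argument rather than a system setting.
    return _float_pattern(decimal_separator).fullmatch(text) is not None

for value in ("1,5", "1.5"):
    print(value,
          accepts_both(value),
          accepts_fixed(value),
          accepts_locale(value, ","),
          accepts_locale(value, "."))
```

The last two columns are the interoperability problem in miniature:
the same string passes or fails depending only on the separator the
validator was configured with.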
This is a very simple example; after you've worked this out, you can
start worrying about how to count combining characters with
field-length restrictions, etc.
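The combining-character problem can be sketched the same way. The
helper names below are hypothetical, and counting non-combining code
points is only a rough stand-in for counting what a reader would call
characters; it is not a full text-segmentation algorithm.

```python
import unicodedata

composed = "\u00e9"      # e-acute as a single precomposed code point
decomposed = "e\u0301"   # letter e followed by a combining acute accent

def code_points(text):
    # Length as most programs see it: one unit per code point.
    return len(text)

def nfc_code_points(text):
    # Length after canonical composition (Unicode normalization form C).
    return len(unicodedata.normalize("NFC", text))

def base_characters(text):
    # Count only code points whose canonical combining class is 0,
    # i.e. skip combining marks such as U+0301.
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)

for label, text in (("composed", composed), ("decomposed", decomposed)):
    print(label, code_points(text), nfc_code_points(text), base_characters(text))
```

Both strings display as the same letter, yet a naive length check sees
one code point in the first and two in the second, so a field limited
to one character would accept one spelling and reject the other.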
All the best,
David
--
David Megginson david at megginson.com
http://www.megginson.com/