Why XML data typing is hard
Anders W. Tell
anderst at toolsmiths.se
Mon Nov 30 15:34:12 GMT 1998
david at megginson.com wrote:
> Michael Kay writes:
>
> > "4,50" is a localized rendition of a float value. But in XML we
> > should encourage a rendition-independent encoding of information.
>
> That is one of the biggest problems with applying concepts from data
> storage to syntax. XML *is* pure external representation -- in a
> database, I can take any of the following appropriate to my locale and
> store it internally as the same bunch of bits:
>
> 4,5
> 4,50
> 4.5
> 4.50
> 004.500
>
> When I want to render that bunch of bits, I can pick any appropriate
> rendition based on the user's locale and formatting requests (for
> example, the user might have typed "4.50" into a field in a form, but
> the report for a French user might show "04,5").
>
> With XML, though, it is the representation itself that I'm exchanging,
> not the abstract data (though perhaps in the future people might want
> to pass around compiled DOM trees -- who knows?). That means that if
> I put
>
> <balance>4.50</balance>
>
> and send the document to a French user, the French user will still see
> the strange, foreign
>
> <balance>4.50</balance>
>
> There's not a general-purpose, locale-independent way of storing it.
Why not? The example above is your preferred interpretation of XML, and
not the only way XML *may* be used. I personally accept many other views of XML.
It's up to the creator and the user of XML information to agree on the
interpretation.
One design pattern that I strongly support is the separation of data and its
presentation. In the above example it's the presentation that is sent, which is
OK if the user is a person, but if the user is another computer it's much better
to send the actual data (with formatting instructions if needed).
This is especially important in business-to-business communication, where the
interpretation of information should not be unnecessarily complex.
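To make the separation concrete, here is a minimal sketch in Python. It assumes the document carries a single locale-independent form ("4.50") and that rendering happens only at the receiving end. The separator table and the function names are hypothetical, invented for illustration; a real system would take its formatting rules from proper locale data.

```python
from decimal import Decimal

# Hypothetical per-language decimal separators (illustration only).
DECIMAL_SEPARATOR = {"en": ".", "fr": ",", "sv": ","}

def parse_balance(text: str) -> Decimal:
    """Data: read the locale-independent form carried in the document."""
    return Decimal(text)

def render_balance(value: Decimal, lang: str) -> str:
    """Presentation: format the same value for a particular reader."""
    return f"{value:.2f}".replace(".", DECIMAL_SEPARATOR[lang])

balance = parse_balance("4.50")       # content of <balance>4.50</balance>
print(render_balance(balance, "en"))  # 4.50
print(render_balance(balance, "fr"))  # 4,50
```

The receiving computer only ever deals with the value; what a French or English user sees is decided separately, at rendering time.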
> You could define a locale-independent text representation of a
> floating-point number, but then you're just adding yet another
> representation to the list.
Yes, but having a *standard* data representation makes life easier for
everybody:
=> Less complexity
=> Smaller application logic
=> Faster parsing and validation
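As a rough illustration of the "smaller application logic" point, the sketch below assumes one agreed lexical form for decimals: optional minus sign, ASCII digits, and "." as the only separator. That grammar is my assumption for the example and is not taken from any XML specification; the names are hypothetical as well.

```python
import re
from decimal import Decimal

# Assumed canonical form: optional sign, digits, optional "." plus digits.
CANONICAL_DECIMAL = re.compile(r"-?[0-9]+(\.[0-9]+)?")

def parse_canonical(text: str) -> Decimal:
    """Accept exactly one representation; reject everything else."""
    if not CANONICAL_DECIMAL.fullmatch(text):
        raise ValueError(f"not a canonical decimal: {text!r}")
    return Decimal(text)

print(parse_canonical("4.50"))  # 4.50

try:
    parse_canonical("4,50")     # localized rendition, not data
except ValueError as err:
    print(err)
```

With a single form there is one rule to validate against. If localized forms were allowed in the data, the receiver would have to guess whether "1,234" means one thousand two hundred thirty-four or a little over one.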
>
> Please note that I'm *not* opposed to data typing: I think that it's
> necessary and that we will see something like it sooner or later. I
> know that the XML Schema WG plans to work hard on data typing and that
> since there are many good, talented people in that WG, they are very
> likely to surprise us; in the mean time, however, I just want to
> emphasise that [1] the problem is not easy when you move beyond a
> specific locale and/or application domain (regular expressions won't
> cut it for usability), and that
Yes, but this applies only to Presentation and not Data.
In most instances a number is a number is a number...
> [2] the solution will likely provide less functionality than many people
> expect.
Same as the above comment.
I think Mike Kay has brought up many good points and I second the motion
of a *Data* standard on top of XML.
Best
Anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/ Financial Toolsmiths AB /
/ Anders W. Tell /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)