Why XML data typing is hard

Ketil Z Malde ketil at ii.uib.no
Tue Dec 1 11:49:44 GMT 1998


"Anders W. Tell" <anderst at toolsmiths.se> writes:

>> There's not a general-purpose, locale-independent way of storing it.

> Why not? 

Actually, there is.  I believe IEEE 754 describes one.  That doesn't
mean it's a good idea to use it in XML documents.

> It's up to the creator and the user of XML information to agree on the
> interpretation.

To my ears, that sounds as if the designer of an XML system should be
able to define what elements are allowed to contain, as well as how
applications deal with that content.

> One Design Pattern that I strongly support is the separation of Data
> and its Presentation 

I'm afraid I'm not quite sure what that means.  Could I ask you to
elaborate?   

> and in the above example it's the presentation that is sent,

How can you *not* send presentation?  And again, how do you know that
e.g. 4.2, a number that binary representations of reals can only
approximate, isn't the actual value?
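
A minimal sketch of the 4.2 problem, in Python (my choice of language,
purely for illustration): the decimal text "4.2" names a value that
neither a 32-bit nor a 64-bit IEEE 754 float can hold exactly, so once
the text has been turned into binary the value originally written is
gone.

```python
import struct
from decimal import Decimal

def as_float32(x: float) -> float:
    """Round-trip x through a 32-bit IEEE 754 encoding."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

text = "4.2"                                   # what the document author wrote
exact = Decimal(text)                          # the value the text denotes
as_double = Decimal(float(text))               # exact value of the nearest 64-bit float
as_single = Decimal(as_float32(float(text)))   # exact value of the nearest 32-bit float

print(exact)
print(as_double)
print(as_single)
print(exact == as_double, exact == as_single)  # neither binary form equals 4.2
```

A value such as 4.5 happens to survive the trip unchanged, which is
exactly why the receiver cannot tell from the bits alone what was meant.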

> which is OK if the user is a person, but if the user is
> another computer it's much better to send the actual data (with
> formatting instructions if needed)

So you *do* want to embed IEEE floats in what used to be nice,
readable and simple-to-use text documents?

<value xml:type="IEEE 754 32 bits" xml:printf="%1.2f">æðßð</value>

I have a feeling I must be severely misunderstanding something, since
this is obviously less desirable than

<value>4.5</value>
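
To make the comparison concrete, here is a rough Python sketch (the
element name `value` comes from the examples above; everything else is
my own illustration, not anybody's proposal).  It packs 4.5 as a 32-bit
IEEE 754 float and drops the raw bytes into an element.  Those bytes
include NUL, which XML 1.0 does not allow anywhere in a document, so
the parser rejects it, while the plain-text version goes straight
through.

```python
import struct
import xml.etree.ElementTree as ET

raw = struct.pack(">f", 4.5)                 # big-endian IEEE 754 single precision
binary_doc = b"<value>" + raw + b"</value>"  # raw float bytes as element content
text_doc = b"<value>4.5</value>"             # the same number as plain text

def parses(doc: bytes) -> bool:
    """Return True if the XML parser accepts doc as well-formed."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

print(raw.hex())            # the four bytes, shown as hex digits
print(parses(binary_doc))   # raw bytes are not well-formed XML
print(parses(text_doc))     # plain text is
print(float(ET.fromstring(text_doc).text))
```

Getting the binary form through would take a further encoding step
such as hex or base64, at which point it is text again, only less
readable.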

> This is especially important in business to business communication
> that the interpretation of information is not unnecessarily complex.

While I agree with this, I don't see why you think "4.5" is complex
compared to a binary format.  It is readable, and it is trivial and
unambiguous to read for humans *and* for computers, once the
conventions - which would be embedded in the DTD - are agreed upon.

> Yes but having a *standard* data representation is making life easier for
> everybody.

I fail to see how:

> => Less complexity

Parsers would need to know about all the types that are thrown around,
they would need to know how to read them in any language and any
character set, and they would need to know how to represent them in
any programming language.  They would need to know how to deal with
byte ordering, arbitrary hardware limitations and quaint binary
formats like float representation.
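
The byte-ordering point alone can be sketched in a few lines of Python
(again, just an illustration under my own assumptions): the same 32-bit
float has two byte layouts, and a reader that guesses the wrong one
gets a different number without any error being raised.

```python
import struct

value = 4.5
big = struct.pack(">f", value)      # big-endian ("network order") layout
little = struct.pack("<f", value)   # little-endian layout

# Read the big-endian bytes as if they were little-endian.
misread = struct.unpack("<f", big)[0]

print(big.hex(), little.hex())      # same four bytes, opposite order
print(misread == value)             # the misread value is silently wrong
```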

As far as I can see, nobody has suggested how this is to be worked
around or overcome.

> => smaller application logic

You require the application to understand all kinds of types, instead
of just the ones relevant for the document being parsed.
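
A sketch of what I mean, in Python, with a made-up document (the
element names `order`, `item`, `quantity` and `price` are hypothetical,
as is the `converters` table): the parser hands over plain strings, and
the application converts only the handful of elements it actually
cares about.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical document; the element names are invented for this example.
doc = "<order><item>widget</item><quantity>3</quantity><price>4.5</price></order>"

# The application lists only the conversions relevant to this document type.
converters = {"quantity": int, "price": Decimal}

root = ET.fromstring(doc)
# Anything without a listed converter is simply kept as a string.
record = {
    child.tag: converters.get(child.tag, str)(child.text)
    for child in root
}
print(record)
```

The parser itself stays ignorant of types; the knowledge lives with the
one application that needs it.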

> => Faster parsing, validation

Why would it be faster with a lot of extra baggage?  I got the
impression XML parsing is fast right now because parsers generally
don't have to keep track of all kinds of crud inside elements and
attributes; they just accept whatever's there and breeze along.

I get the feeling you're looking for CORBA, not XML.

> Yes, but this applies only to Presentation and not  Data.
> In most instances a number is a number is a number...

Not on a binary level, and not across every kind of hardware or every
kind of programming language.  Most definitely not.

I realize I don't understand all the issues involved, which I suppose
is why I find the need for XML data types so hard to grasp.  But I'm
very worried about seeing what used to be a very simple (to understand
and to program for!) standard evolve into terminal featuritis that
never gets completely implemented.

~kzm
-- 
If I haven't seen further, it is by standing in the footprints of giants
