Why XML data typing is hard

Joel Bender joel at spooky.emcs.cornell.edu
Tue Dec 1 21:01:02 GMT 1998


Ketil Z Malde wrote:

>  Alternatively, you could force people to use "YYYY-MM-DD" by
>  forcing conformance to a regular expression, and have your
>  applications only have to deal with that.

It may not be politically correct, and depending on the context it might
even come across as ethnocentric, but IMHO that's not a bad thing.

>  And, I think it's pretty obvious that there are a lot of very
>  complex data types out there.  What's the format for version
>  numbers, for instance?  Or license plates?  Are you ready to
>  come up with an xml:type that covers all cases?

A standards process doesn't need to cover all the cases; all it has to do
is reach a consensus on how the basics should be covered and, assuming the
membership keeps its collective head above water, on how that coverage
should be extended.

>  Some may want to build all of this into a type system that
>  XML parsers need to handle...

I don't think a type 'system' is necessary, but a way of mapping patterns
and types together is useful to enough applications that it should be
standardized.  Let's say you give me a bunch of XML files which are
marked-up email messages, and I would like to find out which ones are at
least a week old.  It sure would be nice to know that the <received>Tue, 1
Dec 1998 02:00:09 +0000</received> contents you provided me have some
standard form.
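
As a rough illustration of why a known form helps, here is a minimal Python
sketch of the "at least a week old" check. It assumes the <received>
contents follow the RFC 822 style date shown above; the <message> wrapper
element and the function name is_at_least_a_week_old are made up for the
example and are not part of any proposal.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
import xml.etree.ElementTree as ET


def is_at_least_a_week_old(message_xml, now):
    """Return True if the <received> date is 7 or more days before `now`.

    `now` must be a timezone-aware datetime (hypothetical helper).
    """
    root = ET.fromstring(message_xml)
    received = root.findtext(".//received")
    if received is None:
        raise ValueError("no <received> element found")
    # Collapse line breaks and runs of spaces inside the element content
    # before handing it to the RFC 822 style date parser.
    received_at = parsedate_to_datetime(" ".join(received.split()))
    return now - received_at >= timedelta(days=7)


# Assumed markup: a <message> element wrapping the <received> element.
sample = (
    "<message><received>Tue, 1 Dec 1998 02:00:09 +0000</received></message>"
)
print(is_at_least_a_week_old(sample, datetime.now(timezone.utc)))
```

This only works because the date format is known in advance; if the
contents could be in any form, the parsing step would turn into guesswork.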

>  ...with mappings to the various programming languages and
>  machine architectures that may or may not support that type
>  natively.

No, nothing specific to a language mapping; that belongs in some API or
SAX reference, not in XML.  Supporting grep-style content pattern matching
doesn't seem like it would be any more difficult than namespaces, kinda
like...

	dataPattern ::= 'xml:grep' Eq Pattern	[ VC : Matches Pattern ]
	Pattern ::= (to be defined)

	Validity Constraint : Matches Pattern

	If the 'xml:grep' attribute has been provided then the
	element contents must match the pattern.
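
A minimal sketch of how an application might check this constraint, in
Python. Since Pattern is still to be defined, the sketch assumes Python's
`re` syntax as a stand-in and requires the whole element content to match.
The function name grep_violations and the sample <order> document are
hypothetical, and only the text before an element's first child is checked.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree expands the reserved "xml:" prefix to this namespace URI.
XML_GREP = "{http://www.w3.org/XML/1998/namespace}grep"


def grep_violations(document):
    """Return (tag, text) pairs for elements whose content fails xml:grep."""
    violations = []
    for element in ET.fromstring(document).iter():
        pattern = element.get(XML_GREP)
        if pattern is None:
            continue  # the constraint applies only when xml:grep is given
        text = element.text or ""
        # Assumption: the pattern uses Python `re` syntax and must match
        # the entire content, not just a substring.
        if re.fullmatch(pattern, text) is None:
            violations.append((element.tag, text))
    return violations


# Hypothetical document using the "YYYY-MM-DD" pattern discussed earlier.
doc = """<order>
  <shipped xml:grep="[0-9]{4}-[0-9]{2}-[0-9]{2}">1998-12-01</shipped>
  <billed xml:grep="[0-9]{4}-[0-9]{2}-[0-9]{2}">1 Dec 1998</billed>
</order>"""

for tag, text in grep_violations(doc):
    print(f"<{tag}> does not match its xml:grep pattern: {text!r}")
```

Here the <shipped> element passes and the <billed> element is reported,
which is all the validity constraint asks for; what the application then
does with the date is left to the application.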

Besides, I don't know of any machine architecture that 'knows' about
anything 'natively' other than a bunch of ones and zeros :-).


Joel




