Integrity in the Hands of the Client

Peter Murray-Rust peter at
Sat Nov 22 09:50:25 GMT 1997

At 17:26 22/11/97 +1100, Rick Jelliffe wrote:
>It is just false that SGML (the family of technologies: ISO 8879, ISO 10744,
>ISO 9070, etc) does not provide a way to use regular expressions (or any
>other syntax you choose) to provide models for data.  The lexical typing
>facilities have been on the books for 5(?) years now, and have just been
>overhauled in HyTime '97 standard. However, because SGML systems do not
>have to provide it to be conforming, few have, as part of their standard
>configuration, so far.  XML has taken exactly the same road as SGML
>and left more useful data validation to the application to take care of.

We are at a very exciting, but critical, time in the development of XML and
I am very heartened by the quality and amount of debate on this list. I
sense that there is a steady influx of people who have had little or no
exposure to 'traditional' SGML and are discovering its power and
limitations in an empirical manner :-) [If so, I have particular empathy,
as I come from outside the SGML community and have never created an SGML
document for 'production' purposes.]

XML will be used by vastly more people than currently practise SGML. That is
both liberating and a cause for concern. It's certainly likely that useful
methods already developed in SGML will often not be used simply because
people don't know about them. Similarly there are often standards in other
disciplines which map directly onto XML problems. Where possible they
should be used.

In many cases the XML specs (including XLL and XSL) deliberately do not say
how something should be done - only what syntax should be used. The WG has
(often rightly) taken the view that it should not prescribe ways of doing
things. But we are now at - or very near to - the time when people will
start doing things and there is a danger that we shall end up with serious
inconsistencies. For example, when Britain first invented and developed
railways there were two gauges (4' 8.5" and 7' 0.25") and Baker Street station
in London had both. Australia had (?5) and I gather is only now
rationalising them (Rick?). As an example, if we use DATEs in XML I think
we need a good reason not to use ISO 8601. 
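
As a minimal sketch of what relying on ISO 8601 might look like in
practice (Python is used purely for illustration, and the helper name
is_iso_date is hypothetical), a tool could accept a DATE value only if it
parses as an ISO 8601 calendar date:

```python
from datetime import date


def is_iso_date(text):
    """Return True if text parses as an ISO 8601 calendar date."""
    try:
        # date.fromisoformat is in the standard library (Python 3.7+).
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


print(is_iso_date("1997-11-22"))  # ISO 8601 form: accepted
print(is_iso_date("22/11/97"))    # local day/month/year form: rejected
```

The point of the design is that the accepted syntax comes from an existing
standard rather than from rules private to one application.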

It is clear that there is overwhelming demand for some datatyping in XML.
For example, I am now extending JUMBO as an authoring tool and I want to be
able to control the type and validity of both attribute values and PCDATA
content. Obviously I can invent my own rules, but I'd prefer to use
something that other people have already agreed on.  I can't do this in a
DTD, but I think I *can* do it consistently with (and in the spirit of)
SGML. [Very simply - I'll expand later - I am developing a per-element
'schema' in XML syntax which encapsulates the DTD approach and enhances it.
As is my spirit, I'm keeping it simple - not adding the complexities of
inheritance as in the XML-data approach.] At present my datatypes are:
	FLOAT (or synonym)
and I'd value comments. [Any new items need code to be written, so they
don't come free :-)]
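
To make the per-element idea concrete, here is a rough sketch, not the
JUMBO implementation. It assumes a hypothetical schema vocabulary
(<schema>, <element name= content=>, <attribute name= type=>) and an
invented <weight> element; only the FLOAT datatype comes from the list
above. Python is used purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-element schema written in XML syntax.
SCHEMA = """
<schema>
  <element name="weight" content="FLOAT">
    <attribute name="error" type="FLOAT"/>
  </element>
</schema>
"""


def is_float(text):
    """Return True if text (ignoring surrounding whitespace) is a float."""
    try:
        float(text.strip())
        return True
    except ValueError:
        return False


# Each new datatype needs a checker to be written and registered here.
CHECKERS = {"FLOAT": is_float}


def load_schema(schema_xml):
    """Map element name -> (content datatype, {attribute name: datatype})."""
    rules = {}
    for el in ET.fromstring(schema_xml).findall("element"):
        attrs = {a.get("name"): a.get("type") for a in el.findall("attribute")}
        rules[el.get("name")] = (el.get("content"), attrs)
    return rules


def validate(doc_xml, rules):
    """Return a list of error messages for PCDATA and attribute values."""
    errors = []
    for node in ET.fromstring(doc_xml).iter():
        if node.tag not in rules:
            continue  # elements without a rule are left unchecked
        content_type, attrs = rules[node.tag]
        if content_type and not CHECKERS[content_type](node.text or ""):
            errors.append(f"<{node.tag}> content {node.text!r} is not {content_type}")
        for name, type_name in attrs.items():
            value = node.get(name)
            if value is not None and not CHECKERS[type_name](value):
                errors.append(f"<{node.tag}> {name}={value!r} is not {type_name}")
    return errors


rules = load_schema(SCHEMA)
print(validate('<data><weight error="0.01">12.011</weight></data>', rules))
print(validate('<data><weight error="small">heavy</weight></data>', rules))
```

The first document yields an empty error list; the second yields one
message for the PCDATA content and one for the attribute value.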

This almost inevitably leads on to data validation and I'd like to know
what syntax people already have for expressing this. Obviously it would be
nice for it to be XML-compatible.


I have had some positive feedback on the idea of XDEV and I shall try to
reformulate my ideas. It's very clear that we need a way of discussing the
'land beyond syntax'.  I liked the phrase 'when ontologies collide' which I
saw recently (I think from a pointer from Robin Cover's page) and this
seems to me an area where XML-DEV can play an important role. At least we
may be able to identify the ontologies :-)


Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS, Virtual Hyperglossary

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at
Archived as:
To (un)subscribe, mailto:majordomo at the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at
