XML-DEV (was Re: Open Standards Processes) [longish]

Peter Murray-Rust peter at ursus.demon.co.uk
Sun Apr 26 17:15:13 BST 1998

Since XML-DEV is a bit over a year old and because of thought-provoking and
constructive discussion at present and because of the success of SAX, here
are some ramblings. A year is a long time and many XML-DEV members were not
subscribed when we started, so my rosy-coloured recollections are given. [A
lot of the specifics can be generalised to the issues we are addressing now.]

We all owe a great deal to Henry Rzepa. [BTW Henry *does* read your mail,
and does unsubscribe people, etc. He is just very busy. While I can get a
thrill out of writing stuff like this, Henry is getting less of a thrill
trying to sort out what mail address someone actually thinks they are
subscribed to; why sending 'unsubscribe' to xml-dev doesn't work, etc.]

Henry and I are molecular scientists - chemist and crystallographer
respectively. He had the vision - perhaps about 7-10 years ago - of the
*power of the electronic medium to communicate molecular ideas*. [The
phrase is mine - he may disagree :-)]. We formed a symbiotic relationship -
we undertake parts of an activity which are complementary. The fundamental
aspect of most of what we do is to use the power of the information
revolution to create a new way of communicating molecular science.  XML-DEV
is part of that.

Molecular science (like many other disciplines that XML-DEVers will be
familiar with) is well-established, with a large information industry and a
fragmented approach. We have few formal standards for the communication of
chemistry and syntactic mismatch is extremely common. Moreover the
semantics of chemistry are very wide - Linus Pauling was probably the last
person ever to be able to formulate them.

Henry pioneered the use of the WWW for chemistry in many ways - development of
interactive tools for molecular and spectral display - and has run 3 major
e-conferences. Some years ago he had the idea of using MIME to convey
chemical information. The two of us spent $10 in a Greenford pub and - with
Ben Whittaker - drafted a submission to the IETF for chemical/x-*.

This was immediately successful in that the molecular community adopted
this almost overnight (ca. 2 months). This - I think - is the ideal that
some of us are searching for in current discussions - can we develop
something for $10 in an open process that does exactly what the community
wants. I use the word 'meme' (from Richard Dawkins - The Selfish Gene) to
describe this; it's a self-reproducing idea.  Since the WWW is a marvellous
incubator for memes, they are very attractive to develop and I believe
XML-DEV has and will continue to create them.

A meme must have the properties:
	- it must be rapidly (i.e. minutes) obvious what its point is.
	- a sufficient percentage of the population must be infectable
	- the energy required to transmit it must be low compared with its value.
A good example of a meme on the WWW is the 'Home Page'. No committee decided
that there should be home pages - but they are self-evidently valuable to a
large percentage of us. FAQs are another. 

I have been searching for many years for the means to convey my ideas (in
molecular software). It became clear that with SGML(sic) and the WWW these
had arrived and I developed costwish to this end (i.e. a general SGML tool
with the specific intention of promoting Chemical Markup Language). But it
was very clear that *ML ideas were not going to spread rapidly without
portable software, and SGML was not an effective meme (expensive and
cumbersome). So when XML arrived I knew that the 'right solution' was there
- it was a question of how to make sure it got to a stage where the
molecular community saw the value and the need to adopt it. 

I did not expect the molecular community to be in the vanguard of those
developing XML solutions and on the whole this is true. The exceptions come
from those areas which are already involved with large-scale *document*
management such as (a few) publishers, patents and regulatory. A key
problem in many sciences is the separation of 'documents' from 'data'. The
data industry is not used to using *ML approaches, while the document
industry usually regards 'data' as no different from any other pixel-based
rendering (i.e. semantic content is discarded). XML offers the enormous
excitement of creating unified documents and data - and could revolutionise
the process of scientific publishing if the communities have the courage.
[My own - crystallography - has; it has developed e-publications in which
documents and data are combined, but this is very rare.]

So the motivation for XML-DEV was to help the generic XML effort succeed.
This was by no means certain when XML started. If XML were to have any
meme-like qualities it would need simple, believable software, besides a
convincing spec. In my experience it is far easier to write specs than to
implement them and I have seen many which have never been effectively
implemented. I didn't want this to happen with XML and so H and I offered
this list as a means of helping the communal software development process.
We have, of course, been very gratified by the large number of people who
have announced freely or easily available software here. There are times
when the software has had an important effect in highlighting aspects of
the spec - for example the difficulty of implementing some of the original
PE syntax.

Another concern that we had was the likelihood of incompatible
implementations. This could have been through misinterpretation of the spec
or simply the lack of suitable test apparatus. [BTW I hope that JUMBO2 can
act as a simple test apparatus as it can run any SAX-compliant parser]. I
have always suggested that XML-DEV should take a lead in aiming for
re-usability, compatibility, etc. I am delighted, of course, with DavidM's
achievement with SAX - certainly much larger than it seemed at the start.

It is clear that XML is now a believable approach, and it's also clear that
- possibly for the first time - difficult aspects of semantics, namespaces,
etc. are having to be addressed in public. If XML-DEV can help in this -
splendid. If some of these require different organisations, fine. I am now
much less concerned than I was a year ago about XML being all talk and no
implementation, but I think we always have to remember implementation in
most of what we post. Seemingly obvious things can be very difficult to
implement - PEs, namespaces have shown that. I suspect that the
parser-application relationship will still need a lot of work.

Finally. Henry and I heard TimBL talk at WWW1 (CERN) and the formation of
(what is now) the W3C was first floated there. It seemed very idealistic,
free, new-frontier-like, with talk of 'an electronic bill of rights', etc.
Remember that at that stage very little commercial material had hit the WWW -
most pages were from scientists, a few orgs and IT companies. We didn't know
what the
W3C would look like. Things then went quiet for some while until we have
the W3C as we know it.

I sympathise with all the views expressed. I'm an idealist, and when it has
been possible to create a $10 meme it's marvellous. Henry and I are
planning a follow-up. I've had the same experience with the Globewide
Network Academy - a group of enthusiast volunteers (many are graduate
students) who see the power of the electronic age with a clarity and
confidence some of us miss. They have done extraordinary things - just with
a few server-side resources and boundless energy. However, these successes
are rare and most real-world creations require time, money and a lot of
paid dedication. 

Compared to most 'standards' processes I have found the W3C process for XML
and related matters extremely impressive and refreshing. OK, I am partially
involved as an XML-SIG member but I understand the frustration of those who
do not know what is being discussed. But I do value the way that the
XML-SIG has chosen people for their expertise rather than their formal
institutional standing [I was unemployed]. I also think that - in XML -
there has been relatively little so far that has been a major shock when
published in draft. My understanding is that drafts are available
internally 1 month before being made publicly available, though obviously
discussions can be confidential for somewhat longer. IMO the speed and
rigour of the process are very impressive and if this is offset against
delay in pre-publication I think it's worth it. If as an individual you
want to become involved in shaping the W3C discussions, it probably helps
if you have shown expertise publicly, and created tools of benefit to the
community. FWIW Henry and I spent a lot of time presenting our case on the
MIME discussion list, without formal success :-). 

I appreciate the problem that XML-writers find themselves in (I may, or may
not, be writing an XML book). It's actually a commentary on the outdated
nature of the paper-publication process that people have to plan for
publication before they know what they are writing about (in detail).
E-publication need not suffer from the same constraints - thus (if you
regard it as an e-publication) the JUMBO2 CDROM will have the latest, final
version of SAX 'hot off the press'. 


Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
