SGML/XML 98 Paris, another take

Peter Murray-Rust peter at ursus.demon.co.uk
Wed May 27 22:36:13 BST 1998


At 09:29 26/05/98 +0001, Simon North wrote:
>Umm, was I at the same conference as Betty and Peter? Here's my take 
>on a couple of the items.

I don't want to start an extended discussion but I chose my words and
subjects carefully. If I think something isn't good I either say so or say
nothing. I also try to help us be as vendor-neutral as possible on this
list.  

>Microsoft: what I saw was an animated presentation of XML being used 
>as a *wrapper* to carry the inter-Office package 'meta' data. The actual 
>data was unchanged, but when importing a chart plot into the Excel 
>spreadsheet, the data describing its size and other properties was in 
>XML. This is rather different to saving the whole thing in XML ... 
>however, on the other hand it opens up extremely interesting 
>possibilities for third-party add-on producers who could use the XML 
>data as part of a rudimentary API.

The point I was making - and it's an important one - was that the key
structural information inside the package had been cast into XML. The thing
that excited me - and still does - is that a multivariate table had been
marked up with multivariate structure. Although we had only a minute or so,
I am sure that given the output I could extract the table and transform it
- within an hour or so - into the input for cluster analysis, non-linear
least squares, etc.

People who *read* tables, rather than compute with them, often do not
realise how little semantic information the common 'table' formats carry.
HTML tables are often columns of trees, DAGs, and goodness knows what else.
Tabular formatted data in RTF is worse - somewhere between Linear B and
Enigma to decipher. 

Clearly the whole output of Office is not - overnight - going to be
effortlessly syntactically, semantically and ontologically interoperable
with any piece of software. I imagine that the first step will be that I
can filter out the nuggets I want - tables, maths, molecules (ever seen a
molecule in RTF or PDF?) and process them with namespace-specific software.
Syntactic interoperability is a great first step.
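
A minimal sketch of that filtering step, written with Python's standard xml.etree.ElementTree. Everything specific here is an assumption for illustration: the sample document, the two example.org namespace URIs and the helper name nuggets() are hypothetical, and the xmlns attribute syntax is that of the later Namespaces in XML recommendation, not necessarily what was shown in Paris. It walks a mixed document and hands back only the outermost elements in one namespace, so each could then be passed to software that understands it.

```python
import xml.etree.ElementTree as ET

# Hypothetical mixed document; the namespace URIs are invented for illustration.
DOC = """<report xmlns:mol="http://example.org/molecule"
        xmlns:tab="http://example.org/table">
  <para>Some prose that we do not need.</para>
  <mol:molecule id="m1"><mol:atom type="C"/><mol:atom type="O"/></mol:molecule>
  <tab:table><tab:row><tab:cell>1.0</tab:cell><tab:cell>2.5</tab:cell></tab:row></tab:table>
  <mol:molecule id="m2"><mol:atom type="N"/></mol:molecule>
</report>"""


def nuggets(xml_text, namespace_uri):
    """Return the outermost elements that belong to namespace_uri."""
    root = ET.fromstring(xml_text)
    # ElementTree expands a qualified name to "{uri}localname".
    marker = "{" + namespace_uri + "}"
    found = []

    def walk(elem):
        if elem.tag.startswith(marker):
            found.append(elem)  # keep the nugget whole, do not descend
            return
        for child in elem:
            walk(child)

    walk(root)
    return found


molecules = nuggets(DOC, "http://example.org/molecule")
print([m.get("id") for m in molecules])
```

The rest of the document is simply ignored, which is the point: syntactic interoperability is enough to get the nuggets out, whatever happens to them afterwards.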
>
>Netscape: [I am a Netscape 'fan'], I thought the presentation was a 
>little sad. Yes, we saw a version of 5 in action, incorporating a 
>browsable view of the local filesystem in a vertical left frame and 
>some neat tricks with bookmarks using RDF. However, much of the 
>presentation was a pep talk about Netscape's new open code policy - 
>with all sorts of name drops in the direction of Linux. Maybe I'm too 
>cynical, but the whole thing came across as more of a plea for help 
>than a ground-breaking demo. Oh, how the mighty are fallen.

I remember a few years ago when not only was source code secret, so were
*data formats*.  Perhaps Paris was not the first revelation of NS's open
code - but for me it was an appropriate celebration of it. 5 million lines
of code to help anyone on this list build their own XML browser sounds like
good news to me.
>
[...]
>
>Namespaces: no-one is happy about namespaces. Maybe I should leave it 

Tim has made the point succinctly, but I feel I need to amplify. I
deliberately chose not to report Jon's presentation for fear of corrupting
it, but I think it wrong to take it in a totally negative light.  I was
unaware of this debate until Jon's presentation and I am glad he felt able
to present it - remembering that not all WG/W3C matters are public.

>to Jon himself to speak his piece, but in the panel session on the 
>current status of the standards (SGML + XML) he said that XML was 
>almost stopped in its tracks by the W3C because other working groups 
>claimed that the XML group was not giving them what they needed. 
>Namespaces was more or less forced on them (he didn't actually say 
>the words 'ad hoc solution' but that was the flavour) and there are 
>a lot of problems with it.  

My take is this:
	We are entering a completely new area - semantic interoperability without
central control. It's exciting, highly worthwhile - and very challenging.
None of us knows the 'right' way to proceed because we have to mix humans
and technology. On the one hand we have the vision of documents which
never break, always reach their destination and on the other the lone
author with the freedom to publish their own ideas regardless of what the
world thinks. 

	SGML is clearly unable to provide semantic interoperability between
strangers (maybe the technology is there but it's too complicated). XML 1.0
had no support for semantics. The world wants semantics now and needs some
guidance. I think the XML-WG has made a good decision - it's the maximum
they could do without compromising the future. Lots of grand schemes have
had to be canned because they weren't universally acceptable.  

	Like many others, I am very happy about namespaces. I have been using them
for about 9 months in JUMBO1 and without them I couldn't have separated
molecular information from the rest. OK, I built a prototype system which I
have to knock down and rebuild, but it will now be built in a standard
fashion. I think the next 6 months will be times of semantic experiments
and I hope to see a number of them on this list. XSD may provide a small
step forward - who knows. What is clear is that simple namespace processing
tools will be very valuable - and they clearly have to be very closely
integrated with the parser.

	I travel on the railway a lot and often compare what we are doing to what
our ancestors did 150 years ago. [In Paris we were next to the oldest
railway station which was built in a circle so the trains could turn
round.] XML is 4' 8.5" - no more - a standard gauge on which all our
rolling stock must run. Whether we run steam, diesel, electric, atmospheric
is not defined. Namespaces are nitroglycerine. Without its power no one
could have blasted through the mountain ranges. Initially it was highly
dangerous, but essential. Later Nobel tamed it with kieselguhr. It is much
less dangerous - but care is still needed. If anyone thinks they can today
cut-and-paste XML subsets between namespaces with abandon, their documents
will explode with regularity. In a year's time we shall have tamed them, so
long as we use the right tools and think about what we are doing.
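
To make the explosion concrete, here is a small sketch using Python's standard xml.etree.ElementTree. The documents, the prefix "m", the example.org URIs and the helper first_child_tag() are all hypothetical, and the xmlns syntax is that of the later Namespaces in XML recommendation. A fragment is copied as raw text out of a document where "m" means molecules and pasted into two others: one that never declares "m", and one that already uses "m" for something else.

```python
import xml.etree.ElementTree as ET

# Source document: the prefix "m" is bound to a (hypothetical) molecule namespace.
SOURCE = '<doc xmlns:m="http://example.org/molecule"><m:atom type="C"/></doc>'

# The fragment as a careless cut-and-paste would carry it: text only,
# with the prefix but without the declaration that gives it meaning.
FRAGMENT = '<m:atom type="C"/>'

# Target A never declares "m"; target B declares it for something else.
TARGET_A = '<notes>' + FRAGMENT + '</notes>'
TARGET_B = '<notes xmlns:m="http://example.org/maths">' + FRAGMENT + '</notes>'


def first_child_tag(xml_text):
    """Expanded name of the first child element, or 'ParseError'."""
    try:
        return ET.fromstring(xml_text)[0].tag
    except ET.ParseError:
        return "ParseError"


for label, text in [("source", SOURCE), ("A", TARGET_A), ("B", TARGET_B)]:
    print(label, first_child_tag(text))
```

Target A fails loudly, because the prefix is unbound. Target B is the more dangerous case: it parses without complaint, but the atom has quietly become a maths element.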

	An important aspect of XML is its philosophy and discipline. There has
been a lot of discussion about namespaces - I have been privileged to be
part of it. It was clearly *exceedingly* challenging. I think that people
thought there could be a more extended acceptable solution and found it
really wasn't that easy. There are a lot of difficult issues like - does a
namespace carry locatability, ownership, semantic processibility, etc.?
Many of these were discussed for the first time in a Web context.

	As a result of such discussions a protocol emerges. It is never perfect -
XML 1.0 is not perfect - but we are committed to making it work. This list
tries to be part of that process, by:
	- clarifying
	- prototyping and reporting
	- modularising
	- distributing implementations, etc.
I hope that over the next few months we can help make the namespace
proposal work - if not, the communal task will be that much harder.
>
>As soon as I get the time, I too will write up a full report.

Please do - seriously. I have had a number of people who have said they
appreciate the report - and I wasn't there for the whole mtg. And don't be
afraid to be critical. I only went to three sessions and found them all
worthwhile. 
>
	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



