XML and standards (was Re: Integrity in the Hands of the Client)

Paul Prescod papresco at technologist.com
Thu Nov 27 14:52:30 GMT 1997

Mark Baker wrote:
> On Mon, 24 Nov 1997, Paul Prescod wrote:
> > > What if that troff document contained a link to an implementation of a
> > > troff formatter?  What if that implementation described its interface using
> > > XML?
> >
> > What if it didn't? What if it described its interface using CORBA or
> > some proprietary language that is more powerful than CORBA? You don't
> > lose any flexibility or expressive power, you just have to write another
> > parser for CORBA or your proprietary language.
> My point is that if it did, then no longer are clients responsible for
> interpreting the semantics of the data - a contained/referenced
> implementation is.

Well at the hardware level, it is still the client. I think you are
distinguishing between clients being hard-wired to accept a fixed number
of notations and being extensible (e.g. through Java). That sounds

> In comp doc frameworks, when a new stream of data is introduced into a
> container, the framework decides the type of the data and then attempts
> to find an editor based on that type.  The editor knows what to do with
> that data, and negotiates with the container for the real-estate for its
> presentation.

I think this is trickier than it sounds, especially that bit about
"negotiating for real estate" (unless you are talking about unit
squares). But okay.

> So if a well-formed document comes streaming into our container, the
> framework would start parsing it, come across a tag called 'troff', and
> then proceed to try and discover and install a chunk of code that knows
> how to parse/render troff.  Or the document could provide its own ref(s)
> (more likely for scalability purposes).  Either way, it's not the
> container (the client) that's responsible for interpreting the semantics
> of the data.  It's the document itself that is responsible.
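
The dispatch described above can be sketched roughly as follows. This is
only an illustration, assuming a hypothetical in-process registry
(HANDLERS) keyed by element name; a real framework would discover and
install the code on demand, and the render_troff function here is a
stand-in, not a troff formatter.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: element name -> code that knows how to render it.
# A real framework would discover and install these handlers on demand.
HANDLERS = {}

def handler(tag):
    """Register a rendering function for one element name."""
    def register(func):
        HANDLERS[tag] = func
        return func
    return register

@handler("troff")
def render_troff(element):
    # Stand-in for a real troff formatter: just report what it was given.
    return "troff handler got %d characters" % len(element.text or "")

def render(document):
    """Hand each top-level child element to the handler for its tag."""
    results = []
    for child in ET.fromstring(document):
        func = HANDLERS.get(child.tag)
        if func is None:
            results.append("no handler for <%s>" % child.tag)
        else:
            results.append(func(child))
    return results

doc = "<doc><troff>.B hello</troff><mystery>?</mystery></doc>"
print(render(doc))
```

The container only looks up a name; everything it knows about troff
lives in the registered handler.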

You seem to be arguing in favour of self-labelling data formats, which I
agree could be quite useful. But XML doesn't give you that "for free" in
any sense. There is no standard for having XML documents, entities or
elements link to Java Beans or ActiveX controls that can render them.
You must invent such a standard and it will be only marginally easier to
invent an XML-based one than to use OpenDoc or OLE Structured Storage
which handle this already. XML has the benefit that it has momentum
today and may "take over the universe." It has the serious downside that
it cannot (reasonably) encode binary information so .GIFs and .JPEGs
cannot be self-describing in this way (whereas they could be in OpenDoc
Bento or OLE Structured Storage).
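
One way to read "(reasonably)": binary data can be carried in XML only
by re-encoding it as text, for instance with base64, which adds about a
third to its size. A minimal sketch of that workaround, assuming made-up
element and attribute names (blob, name, encoding) and dummy bytes in
place of a real image:

```python
import base64
import xml.etree.ElementTree as ET

def embed_binary(name, payload):
    """Wrap raw bytes in an element, base64-encoded so they survive as text."""
    # "blob", "name" and "encoding" are illustrative names, not a standard.
    element = ET.Element("blob", {"name": name, "encoding": "base64"})
    element.text = base64.b64encode(payload).decode("ascii")
    return element

def extract_binary(element):
    """Recover the original bytes from an element built by embed_binary."""
    return base64.b64decode(element.text)

payload = bytes(range(256)) * 12          # 3072 dummy bytes, not a real GIF
element = embed_binary("picture.gif", payload)

# Round-trip through serialised XML to show the bytes come back intact.
reparsed = ET.fromstring(ET.tostring(element))
restored = extract_binary(reparsed)

print(len(payload), len(element.text))    # raw size vs. encoded size
```

Here 3072 raw bytes become 4096 characters of element content, before
counting the markup around them.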

In other words, something like Bento or OLESS is probably still needed.
We could surely find a way to recreate it with XML and (e.g.) ZIP, but it
seems to me that that would be more of a political decision than a
technical one. The SGML standards family has something called "SDIF".
There is also MIME multipart, Amiga IFF and probably a hundred other
kicks at this can.

Anyhow, I think that a high priority of the XML WG/Community should be
inventing the XML equivalent of the JAR file. It is way too much of a
hassle to ship multipart documents (whether they be SGML, HTML or XML).
It needn't be much harder than shipping around Word docs (which are
really multipart documents). These XAR files should be able to label
their contents.
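
To make the idea concrete, here is a rough sketch of what such a file
might look like, by analogy with a JAR: a ZIP archive plus an XML
manifest that labels each part with its notation. Nothing here is an
existing format; the manifest.xml name, the part element and its
attributes are all invented for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

def write_xar(parts):
    """Pack {name: (notation, bytes)} into a ZIP with a labelling manifest."""
    manifest = ET.Element("manifest")      # hypothetical manifest vocabulary
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, (notation, data) in parts.items():
            # Record the label, then store the bytes untouched.
            ET.SubElement(manifest, "part", {"name": name, "notation": notation})
            archive.writestr(name, data)
        archive.writestr("manifest.xml", ET.tostring(manifest))
    return buffer.getvalue()

def read_labels(xar_bytes):
    """Return {name: notation} as recorded in the archive's manifest."""
    with zipfile.ZipFile(io.BytesIO(xar_bytes)) as archive:
        manifest = ET.fromstring(archive.read("manifest.xml"))
    return {p.get("name"): p.get("notation") for p in manifest.iter("part")}

parts = {
    "paper.xml": ("XML", b"<paper/>"),
    "figure.gif": ("GIF", b"GIF89a"),      # placeholder bytes, not a full image
}
labels = read_labels(write_xar(parts))
print(labels)
```

The binary part is stored as-is, so the labelling lives beside the data
rather than forcing the data into XML.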

 Paul Prescod

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)