Is XDEV useful? (was re: XDEV proposals)

David Megginson ak117 at freenet.carleton.ca
Sat Nov 22 13:44:05 GMT 1997


Peter Murray-Rust writes:

 > The problems I address have the following generic operations.

 > - I want to author XML. Ideally this should be human- and
 > machine-readable. I want this process to be controlled by
 > software/data to make it both flexible and rigorous. [This is
 > tough, but I'm starting to address it in JUMBO. Practical help will
 > be appreciated :-)].

I think that the latest version of Adept supports XML editing, and I
announced some patches to PSGML a couple of months ago.

 > - I wish to be able to re-use other people's information
 > objects. This is almost certainly going to break any DTD, but it is
 > implicit in most of the current W3C activity. (RDF, MathML, XSL may
 > have some sort of DTDs, but they will probably be used as
 > components of larger documents, which cannot have DTDs)

Actually, this turns out not to be the case. Reuse is very simple with
XML in its current form, if you use XML as a data content notation.

In the internal DTD subset:

  <!ENTITY myrdf SYSTEM "myrdf.xml" NDATA xml>
  <!ENTITY mymathml SYSTEM "mymathml.xml" NDATA xml>
  <!ENTITY myxsl SYSTEM "myxsl.xml" NDATA xml>

In the external or internal DTD subset:

  <!NOTATION xml PUBLIC "-//W3C//NOTATION eXtensible Markup Language//EN"
                 "http://www.w3.org/XML/">
  <!ELEMENT externalDoc EMPTY>
  <!ATTLIST externalDoc
    doc       ENTITY        #REQUIRED>

In the document instance:

  <para>Here is a reusable RDF object:</para>
  <externalDoc doc="myrdf"/>
  <para>Here is a reusable MathML object:</para>
  <externalDoc doc="mymathml"/>
  <para>Here is a reusable XSL object:</para>
  <externalDoc doc="myxsl"/>

Whenever your processing software finds an external data entity with
the XML notation, it can simply call the parser recursively.
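A minimal sketch of that recursive call, using Python's standard
xml.parsers.expat module. The in-memory DOCUMENTS table (standing in for
files on disk), the function name collect_elements, and the sample
document contents are all assumptions made for illustration; only the
externalDoc/doc convention comes from the declarations above.

```python
import xml.parsers.expat

# Hypothetical in-memory store standing in for files on disk.
DOCUMENTS = {
    "main.xml": """<!DOCTYPE doc [
  <!NOTATION xml SYSTEM "http://www.w3.org/XML/">
  <!ENTITY myrdf SYSTEM "myrdf.xml" NDATA xml>
  <!ELEMENT doc (para | externalDoc)*>
  <!ELEMENT para (#PCDATA)>
  <!ELEMENT externalDoc EMPTY>
  <!ATTLIST externalDoc doc ENTITY #REQUIRED>
]>
<doc><para>Here is a reusable RDF object:</para><externalDoc doc="myrdf"/></doc>""",
    "myrdf.xml": "<rdf><description>example</description></rdf>",
}


def collect_elements(system_id, seen=None):
    """Parse one document and return element names in document order,
    descending into any external data entity whose notation is "xml"."""
    seen = [] if seen is None else seen
    entities = {}  # entity name -> system identifier, XML notation only
    parser = xml.parsers.expat.ParserCreate()

    def unparsed_entity(name, base, sys_id, public_id, notation):
        # Remember only the entities declared with NDATA xml.
        if notation == "xml":
            entities[name] = sys_id

    def start_element(name, attrs):
        seen.append(name)
        if name == "externalDoc":
            target = entities.get(attrs.get("doc", ""))
            if target is not None:
                # The recursive call: a fresh parser for the referenced entity.
                collect_elements(target, seen)

    parser.UnparsedEntityDeclHandler = unparsed_entity
    parser.StartElementHandler = start_element
    parser.Parse(DOCUMENTS[system_id], True)
    return seen
```

Each recursive call creates its own parser, so the entity table of the
outer document is not shared with the embedded one. A real application
would also want to guard against an entity that refers back to itself.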

You could also take an HTML-like approach (especially in a DTD-less
document), and simply do something like

  <include src="myrdf.xml"/>
  <include src="mymathml.xml"/>
  <include src="myxsl.xml"/>

Again, just have your processing software call your parser recursively.
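The include approach might be sketched as follows with Python's
xml.etree.ElementTree. The SOURCES table, the helper names load and
expand, and the sample documents are hypothetical; the element name
include and its src attribute are the only parts taken from the example
above.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory documents standing in for files or URLs.
SOURCES = {
    "main.xml": '<doc><para>RDF:</para><include src="myrdf.xml"/></doc>',
    "myrdf.xml": "<rdf>example</rdf>",
}


def load(src):
    """Parse the named document and expand its include elements."""
    root = ET.fromstring(SOURCES[src])
    expand(root)
    return root


def expand(parent):
    """Replace each include child with the parsed document it names."""
    for index, child in enumerate(list(parent)):
        if child.tag == "include":
            replacement = load(child.get("src"))  # recursive parser call
            replacement.tail = child.tail         # keep the text after the tag
            parent[index] = replacement
        else:
            expand(child)
```

Calling load("main.xml") returns a tree in which the include element has
been replaced by the rdf element. This sketch does not detect cyclic
includes and assumes the root element itself is never an include.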

 > - I wish to be able to manage distributed and multicomponent
 > objects.  I think XML and related disciplines will solve this very
 > well and excitingly.

Exactly -- this is where the entity structure of full SGML and XML is
a big win.

 > - I want to be able to validate XML 'objects'. XML can do this
 > syntactically, but not semantically. For this I need additional
 > 'recipes' and code

And you always will, no matter how XDEV is designed.  I've implemented
SQL-based data management systems, and SQL's type checking is _never_
enough (or even close).  Certainly we could modify XML so that parsers
could perform validations like

  - the contents of this element must be a number 
  - the contents of this element must not be empty

but that would just make the parsers bigger and wouldn't help much anyway.
After all, in real-world applications you always need to perform
validations along these lines:

  - the contents of the element must be the name of an American city
    with a population over 500,000
  - the contents of the element must be a name mentioned in a list in
    a different XML document
  - the contents of the element must be a valid Internet domain name

I think that XML and SGML were smarter to leave all of this to the
application-specific processing software in the first place.
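As an illustration of the second kind of check (a value that must appear
in a list held in a different XML document), here is a minimal sketch in
Python. The element names shipTo and city, the function name
invalid_values, and both sample documents are invented for the example;
the point is only that such a rule lives naturally in application code
rather than in the parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical reference list kept in a separate XML document.
ALLOWED_XML = "<cities><city>Chicago</city><city>Houston</city></cities>"

# Hypothetical document whose shipTo elements must name a listed city.
ORDER_XML = (
    "<orders>"
    "<order><shipTo>Chicago</shipTo></order>"
    "<order><shipTo>Springfield</shipTo></order>"
    "</orders>"
)


def invalid_values(document_xml, list_xml, element="shipTo", list_item="city"):
    """Return the values of `element` that do not appear in the other document."""
    allowed = {
        (item.text or "").strip()
        for item in ET.fromstring(list_xml).iter(list_item)
    }
    return [
        (node.text or "").strip()
        for node in ET.fromstring(document_xml).iter(element)
        if (node.text or "").strip() not in allowed
    ]
```

With the sample data, invalid_values(ORDER_XML, ALLOWED_XML) reports
"Springfield" as the only value missing from the list.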

 > - I want to be able to transform XML objects into other XML
 > objects. XSL is tantalisingly close to being able to do this but I
 > believe - at present - that W3C XML-transformation activity is
 > 'undefined'.

Architectural forms will bring you part-way there.  For one proposal, see

  http://home.sprynet.com/sprynet/dmeggins/xml-arch.html

 > - I want to be able to send XML objects to other people *with* a
 > prior contract as how these are to be used. XML can partially solve
 > this at present using DTDs, controlled prose and vocabularies and
 > *bespoke applications* (i.e. a different application for each DTD.)
 > This is as far as X*L goes. Much of the X*L prose stresses that
 > particular activity is left to the *application*. This means that
 > XML documents often need to be authored, knowing what application
 > is going to be used to process them.  This is, presumably, the way
 > that CDF is designed - you have to have a 'CDF processor'. However
 > it does not support *generic* applications (or even generic
 > components of applications).

As, I think, Paul Prescod has noted, nothing but a Turing-complete
language could do this.  XML is a method for creating applications --
it is not an application itself, and each application will need its
own conventions, etc.

 > - I wish to be able to send hypermedia. XLL specifically declines to add
 > any semantics to the syntax, other than an (implied) HTML-like behaviour
 > for some of the SIMPLE links.

Are notations not suitable for specifying this information?

 > - I wish to send objects to other people who will print them out
 > and read them. XSL solves this.

Yes, it may.  I wonder if document-viewing will end up being a major
XML application, when most of the effort right now seems to be going
into transactions and meta-data.

 > - I wish to be able to send XML objects to people who I don't know  exist,
 > have never heard of me or my domain. [Example, a supermarket may need to
 > hyperlink to molecular information in labelling its food products.] They
 > need to access my semantics in (a) human-readable and (b) machine-readable
 > form. For this a *generic* XML processor (or processing component) is
 > required. This *is* achievable (through XSL) if the processing activity
 > consists of producing 2D human-readable objects. I, and I suspect many
 > others, want to be able to create generic XML applications. [JUMBO is a
 > *generic* XML application - it can process any XML document. The degree of
 > added value depends on the components made available by the document's
 > author or domain.]

Here, again, architectural forms will help.  As long as you use a DTD,
and the DTD implements a "food information" base architecture, the
supermarket will be able to incorporate your molecular information
automatically.
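A rough sketch of the idea, not the proposal linked earlier: each element
names the form it plays in the base architecture, and a generic processor
derives the architectural view without knowing the original DTD. The
attribute name food, the form names ingredient/name/quantity, and the
sample molecule document are all hypothetical. In practice the attribute
would normally be fixed in the DTD; it is written out in the instance
here because ElementTree does not apply DTD defaults.

```python
import xml.etree.ElementTree as ET

ARCH_ATTR = "food"  # hypothetical attribute naming the base-architecture form

# Hypothetical molecular document annotated with architectural forms.
MOLECULE_XML = (
    '<molecule food="ingredient">'
    '<title food="name">Ascorbic acid</title>'
    "<formula>C6H8O6</formula>"
    '<mass food="quantity">176.12</mass>'
    "</molecule>"
)


def architectural_view(element):
    """Return the list of base-architecture elements derived from `element`.

    Elements carrying ARCH_ATTR are renamed to the form they declare;
    elements without it are dropped, but their architectural descendants
    are kept and promoted to the nearest architectural ancestor.
    """
    children = [view for child in element for view in architectural_view(child)]
    form = element.get(ARCH_ATTR)
    if form is None:
        return children
    view = ET.Element(form)
    view.text = element.text
    view.extend(children)
    return [view]
```

Applied to the sample, the processor sees an ingredient element
containing a name and a quantity, and never needs to know what a
molecule or a formula is.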

 > Most of these issues are not being addressed, and probably will not be
 > addressed by the current XML activity. [Not a criticism - they are doing a
 > fantastic job. Their time is taken with deciding on precise syntax,
 > procedures, meaning of components in XML documents, etc. More difficult
 > than I think a lot of people realise.]

I agree.

 > This is where XML-DEV has a role to play. Not formally - this list has no
 > standing other than the high quality of its postings. Since many of these
 > areas will give rise to 'colliding ontologies' (i.e. strongly held views on
 > how to do things and what things mean) there are no single solutions.
 > However, if we treat this in the spirit of a biological system, 'fit'
 > solutions should arise. 

[remainder omitted]

XML-DEV would provide simple solutions to a few additional simple
problems, but in the end (as with SQL), people will still have to do a
lot of work in the middleware.  I cannot usefully dump the SQL tables
from my database and send them to someone else without a lot of
integration and customisation work, unless we planned our tables
together from the start.


All the best,


David

-- 
David Megginson                 ak117 at freenet.carleton.ca
Microstar Software Ltd.         dmeggins at microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



