Is XDEV useful? (was re: XDEV proposals)
David Megginson
ak117 at freenet.carleton.ca
Sat Nov 22 13:44:05 GMT 1997
Peter Murray-Rust writes:
> The problems I address have the following generic operations.
> - I want to author XML. Ideally this should be human- and
> machine-readable. I want this process to be controlled by
> software/data to make it both flexible and rigorous. [This is
> tough, but I'm starting to address it in JUMBO. Practical help will
> be appreciated :-)].
I think that the latest version of Adept supports XML editing, and I
announced some patches to PSGML a couple of months ago.
> - I wish to be able to re-use other people's information
> objects. This is almost certainly going to break any DTD, but it is
> implicit in most of the current W3C activity. (RDF, MathML, XSL may
> have some sort of DTDs, but they will probably be used as
> components of larger documents, which cannot have DTDs)
Actually, this turns out not to be the case -- it is quite simple
with XML in its current form, if you use XML as a data content
notation.
In the internal DTD subset:
<!ENTITY myrdf SYSTEM "myrdf.xml" NDATA xml>
<!ENTITY mymathml SYSTEM "mymathml.xml" NDATA xml>
<!ENTITY myxsl SYSTEM "myxsl.xml" NDATA xml>
In the external or internal DTD subset:
<!NOTATION xml PUBLIC "-//W3C//NOTATION eXtensible Markup Language//EN"
"http://www.w3.org/XML/">
<!ELEMENT externalDoc EMPTY>
<!ATTLIST externalDoc
doc ENTITY #REQUIRED>
In the document instance:
<para>Here is a reusable RDF object:</para>
<externalDoc doc="myrdf"/>
<para>Here is a reusable MathML object:</para>
<externalDoc doc="mymathml"/>
<para>Here is a reusable XSL object:</para>
<externalDoc doc="myxsl"/>
Whenever your processing software finds an external data entity with
the XML notation, it can simply call the parser recursively.
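As a rough illustration of that recursion, here is a minimal Python sketch built on the standard expat binding. It is not taken from any existing tool: the function names `scan` and `outline`, and the `load` callable that maps a system identifier to document text, are assumptions made for the example. It records which unparsed entities were declared with the `xml` notation and re-enters the parser whenever an `externalDoc` element refers to one of them.

```python
import xml.parsers.expat


def scan(document):
    """Parse one document and return (entities, events).

    entities maps each unparsed entity declared with NDATA xml to its
    system identifier. events lists ("element", name) and
    ("reference", entity_name) items in document order.
    """
    entities, events = {}, []

    def on_unparsed_entity(name, base, system_id, public_id, notation):
        # Only entities in the "xml" notation are candidates for recursion.
        if notation == "xml":
            entities[name] = system_id

    def on_start(name, attrs):
        events.append(("element", name))
        if name == "externalDoc" and "doc" in attrs:
            events.append(("reference", attrs["doc"]))

    parser = xml.parsers.expat.ParserCreate()
    parser.UnparsedEntityDeclHandler = on_unparsed_entity
    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return entities, events


def outline(system_id, load, depth=0):
    """Return (depth, element name) pairs, where depth counts how many
    documents deep the element was found."""
    entities, events = scan(load(system_id))
    result = []
    for kind, value in events:
        if kind == "element":
            result.append((depth, value))
        elif value in entities:
            # External data entity in the XML notation: parse it recursively.
            result.extend(outline(entities[value], load, depth + 1))
    return result
```

A real processor would resolve system identifiers against the base URI and guard against entities that refer back to themselves; both are left out here to keep the sketch short.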
You could also take an HTML-like approach (especially in a DTD-less
document), and simply do something like
<include src="myrdf.xml"/>
<include src="mymathml.xml"/>
<include src="myxsl.xml"/>
Again, just have your processing software call your parser recursively.
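For the DTD-less variant, a similar sketch in Python might look like the following. The `include` element and its `src` attribute come from the example above; the function names and the `load` callable (system identifier in, document text out) are hypothetical. Each `include` element is replaced by the root element of the document it names, and that document is expanded in turn.

```python
import xml.etree.ElementTree as ET


def expand_includes(src, load, _stack=()):
    """Parse the document named by src and splice in every <include>."""
    if src in _stack:
        raise ValueError(f"circular include: {src}")
    root = ET.fromstring(load(src))
    _expand(root, load, _stack + (src,))
    return root


def _expand(parent, load, stack):
    # Iterate over a copy so that replacing children is safe.
    for index, child in enumerate(list(parent)):
        if child.tag == "include":
            # Recursive call to the parser for the included document.
            included = expand_includes(child.get("src"), load, stack)
            included.tail = child.tail
            parent[index] = included
        else:
            _expand(child, load, stack)
```

The `_stack` argument is only there to stop a document that includes itself, directly or indirectly, from recursing forever.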
> - I wish to be able to manage distributed and multicomponent
> objects. I think XML and related disciplines will solve this very
> well and excitingly.
Exactly -- this is where the entity structure of full SGML and XML is
a big win.
> - I want to be able to validate XML 'objects'. XML can do this
> syntactically, but not semantically. For this I need additional
> 'recipes' and code
And you always will, no matter how XDEV is designed. I've implemented
SQL-based data management systems, and SQL's type checking is _never_
enough (or even close). Certainly we could modify XML so that parsers
could perform validations like
- the contents of this element must be a number
- the contents of this element must not be empty
but that would just make the parsers bigger and wouldn't help much anyway.
After all, in real-world applications you always need to perform
validations along these lines:
- the contents of the element must be the name of an American city
with a population over 500,000
- the contents of the element must be a name mentioned in a list in
a different XML document
- the contents of the element must be a valid Internet domain name
I think that XML and SGML were smarter to leave all of this to the
application-specific processing software in the first place.
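To make the second kind of check concrete, here is a small Python sketch of application-level validation against a list held in a different XML document. The element names `name` and `cityRef` and both function names are invented for the illustration; nothing here is part of XML itself.

```python
import xml.etree.ElementTree as ET


def allowed_names(list_xml, tag="name"):
    """Collect the permitted values from the separate list document."""
    root = ET.fromstring(list_xml)
    return {(e.text or "").strip() for e in root.iter(tag)} - {""}


def invalid_references(doc_xml, allowed, tag="cityRef"):
    """Return the content of every element whose value is not in the list."""
    root = ET.fromstring(doc_xml)
    values = [(e.text or "").strip() for e in root.iter(tag)]
    return [value for value in values if value not in allowed]
```

A parser can tell you that both documents are well-formed, or even valid, but only code like this, which knows where the list lives and what it means, can tell you whether the reference is acceptable.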
> - I want to be able to transform XML objects into other XML
> objects. XSL is tantalisingly close to being able to do this but I
> believe - at present - that W3C XML-transformation activity is
> 'undefined'.
Architectural forms will bring you part-way there. For one proposal, see
http://home.sprynet.com/sprynet/dmeggins/xml-arch.html
> - I want to be able to send XML objects to other people *with* a
> prior contract as how these are to be used. XML can partially solve
> this at present using DTDs, controlled prose and vocabularies and
> *bespoke applications* (i.e. a different application for each DTD.)
> This is as far as X*L goes. Much of the X*L prose stresses that
> particular activity is left to the *application*. This means that
> XML documents often need to be authored, knowing what application
> is going to be used to process them. This is, presumably, the way
> that CDF is designed - you have to have a 'CDF processor'. However
> it does not support *generic* applications (or even generic
> components of applications).
As, I think, Paul Prescod has noted, nothing but a Turing-complete
language could do this. XML is a method for creating applications --
it is not an application itself, and each application will need its
own conventions, etc.
> - I wish to be able to send hypermedia. XLL specifically declines to add
> any semantics to the syntax, other than an (implied) HTML-like behaviour
> for some of the SIMPLE links.
Are notations not suitable for specifying this information?
> - I wish to send objects to other people who will print them out
> and read them. XSL solves this.
Yes, it may. I wonder if document-viewing will end up being a major
XML application, when most of the effort right now seems to be going
into transactions and meta-data.
> - I wish to be able to send XML objects to people who I don't know exist,
> have never heard of me or my domain. [Example, a supermarket may need to
> hyperlink to molecular information in labelling its food products.] They
> need to access my semantics in (a) human-readable and (b) machine-readable
> form. For this a *generic* XML processor (or processing component) is
> required. This *is* achievable (through XSL) if the processing activity
> consists of producing 2D human-readable objects. I, and I suspect many
> others, want to be able to create generic XML applications. [JUMBO is a
> *generic* XML application - it can process any XML document. The degree of
> added value depends on the components made available by the document's
> author or domain.]
Here, again, architectural forms will help. As long as you use a DTD,
and the DTD implements a "food information" base architecture, the
supermarket will be able to incorporate your molecular information
automatically.
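A heavily simplified Python sketch of the idea follows. It assumes that each element carries an attribute named after the base architecture (here the hypothetical `food`) whose value is the architectural form the element maps to; in practice the attribute would normally be fixed in the DTD rather than typed into the instance, and real architectural processing has many more rules than this. The function name `architectural_view` is likewise made up.

```python
import xml.etree.ElementTree as ET


def architectural_view(elem, arch_attr):
    """Return the derived elements for elem under one base architecture.

    An element with the architecture attribute is renamed to the form it
    names. An element without it is treated as transparent: it vanishes,
    but any mapped descendants are kept.
    """
    children = []
    for child in elem:
        children.extend(architectural_view(child, arch_attr))
    form = elem.get(arch_attr)
    if form is None:
        return children
    derived = ET.Element(form)
    derived.text = elem.text
    derived.extend(children)
    return [derived]
```

Under these assumptions the supermarket's software only needs to understand the derived "food information" elements and never has to see the chemistry-specific element names.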
> Most of these issues are not being addressed, and probably will not be
> addressed by the current XML activity. [Not a criticism - they are doing a
> fantastic job. Their time is taken with deciding on precise syntax,
> procedures, meaning of components in XML documents, etc. More difficult
> than I think a lot of people realise.]
I agree.
> This is where XML-DEV has a role to play. Not formally - this list has no
> standing other than the high quality of its postings. Since many of these
> areas will give rise to 'colliding ontologies' (i.e. strongly held views on
> how to do things and what things mean) there are no single solutions.
> However, if we treat this in the spirit of a biological system, 'fit'
> solutions should arise.
[remainder omitted]
XML-DEV would provide simple solutions to a few additional simple
problems, but in the end (as with SQL), people will still have to do a
lot of work in the middleware. I cannot usefully dump the SQL tables
from my database and send them to someone else without a lot of
integration and customisation work, unless we planned our tables
together from the start.
All the best,
David
--
David Megginson ak117 at freenet.carleton.ca
Microstar Software Ltd. dmeggins at microstar.com
http://home.sprynet.com/sprynet/dmeggins/
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)