Is XDEV useful? (was re: XDEV proposals)

Peter Murray-Rust peter at ursus.demon.co.uk
Sat Nov 22 15:59:46 GMT 1997


At 08:44 22/11/97 -0500, David Megginson wrote:

Thanks very much David,
	You pose - but do not answer - a question :-). 

>Peter Murray-Rust writes:
>
[...]
>
>I think that the latest version of Adept supports XML editing, and I
>announced some patches to PSGML a couple of months ago.

Indeed. I have no doubt that there are and will be some excellent commercial
tools. My problem, which I think is not unique, is that I cannot persuade
my (often conservative) colleagues in science to start using a new
discipline if there is a significant entry cost in terms of tools. [I do
not remember how much Adept is, but many SGML tools are beyond the reach of
impecunious individuals :-)]. I also want to be able to customise the tools
I work with, and - for example - to link in the conversion of legacy data
'on the fly'. 

I did indeed note your posting on EMACS/pSGML, and thought about
downloading it. But there were such dire warnings ('if you aren't fully
familiar with major modes of EMACS don't try this...') that I didn't :-)

>
> > - I wish to be able to re-use other people's information
> > objects. This is almost certainly going to break any DTD, but it is
> > implicit in most of the current W3C activity. (RDF, MathML, XSL may
> > have some sort of DTDs, but they will probably be used as
> > components of larger documents, which cannot have DTDs)
>
>Actually, this turns out not to be the case -- this is actually very
>simple with XML in its current form, if you use XML as a data content
>notation.
>
>In the internal DTD subset:
>
>  <!ENTITY myrdf SYSTEM "myrdf.xml" NDATA xml>
>  <!ENTITY mymathml SYSTEM "mymathml.xml" NDATA xml>
>  <!ENTITY myxsl SYSTEM "myxsl.xml" NDATA xml>
>
>In the external or internal DTD subset:
>
>  <!NOTATION xml PUBLIC "-//W3C//NOTATION eXtensible Markup Language//EN"
>                 "http://www.w3.org/XML/">
>  <!ELEMENT externalDoc EMPTY>
>  <!ATTLIST externalDoc
>    doc       ENTITY        #REQUIRED>
>
>In the document instance:
>
>  <para>Here is a reusable RDF object:</para>
>  <externalDoc doc="myrdf"/>
>  <para>Here is a reusable MathML object:</para>
>  <externalDoc doc="mymathml"/>
>  <para>Here is a reusable XSL object:</para>
>  <externalDoc doc="myxsl"/>
>
>Whenever your processing software finds an external data entity with
>the XML notation, it can simply call the parser recursively.

This is very clever! Thanks for pointing this out. I wouldn't have thought
of it. 
It does, however, require that each ENTITY consistently uses just one DTD.
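
A minimal sketch of the "call the parser recursively" idea, written in Python
purely for illustration (nothing in this thread used it). It assumes a
hypothetical in-memory table FILES in place of real external files, uses the
standard expat binding to record entities declared with NDATA xml, and
re-enters the parser whenever an externalDoc element names one of them. The
walk() function and the sample documents are invented for the example, and
there is no protection against documents that include each other in a loop.

```python
import xml.parsers.expat

# Hypothetical in-memory "file system" standing in for real external files.
FILES = {
    "main.xml": """<!DOCTYPE doc [
  <!NOTATION xml SYSTEM "http://www.w3.org/XML/">
  <!ENTITY mymathml SYSTEM "mymathml.xml" NDATA xml>
]>
<doc><para>Here is a reusable MathML object:</para><externalDoc doc="mymathml"/></doc>""",
    "mymathml.xml": "<math><mi>x</mi></math>",
}


def walk(name, depth=0, out=None):
    """Parse FILES[name]; recurse into unparsed entities whose notation is 'xml'.

    Returns a list of element names, indented by how many documents deep
    the element was found.
    """
    out = [] if out is None else out
    entities = {}  # entity name -> (system identifier, notation name)
    parser = xml.parsers.expat.ParserCreate()

    def entity_decl(entity_name, is_parameter_entity, value, base,
                    system_id, public_id, notation_name):
        # Unparsed (NDATA) entities arrive with a notation name set.
        if notation_name is not None:
            entities[entity_name] = (system_id, notation_name)

    def start_element(tag, attrs):
        out.append("  " * depth + tag)
        if tag == "externalDoc" and attrs.get("doc") in entities:
            system_id, notation = entities[attrs["doc"]]
            if notation == "xml":
                walk(system_id, depth + 1, out)  # the recursive parser call

    parser.EntityDeclHandler = entity_decl
    parser.StartElementHandler = start_element
    parser.Parse(FILES[name], True)
    return out


print("\n".join(walk("main.xml")))
```

Each included document is parsed by a fresh parser instance, so each can carry
its own DOCTYPE, which fits the one-DTD-per-entity restriction noted above.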

>
>You could also take an HTML-like approach (especially in a DTD-less
>document), and simply do something like
>
>  <include src="myrdf.xml">
>  <include src="mymathml.xml">
>  <include src="myxsl.xml">

This is indeed what I do at present - but using XML-LINK specifically. 
<ITEM XML-LINK="SIMPLE" HREF="myrdf.xml" SHOW="EMBED" ACTUATE="AUTO">

(although the semantics of EMBED - just like SRC - may not be universally
agreed.)
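
As a rough illustration of one possible reading of SHOW="EMBED", the Python
sketch below replaces each such link element with the root element of the
document it points to. That replacement rule is an assumption, not something
the XML-LINK draft defines, and the DOCS table and embed_links() function are
hypothetical names invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for fetching the linked resources.
DOCS = {"myrdf.xml": "<rdf><description>example</description></rdf>"}


def embed_links(parent, load=DOCS.__getitem__):
    """Replace each SIMPLE link with SHOW="EMBED" by the document it points to."""
    for index, child in enumerate(list(parent)):
        if child.get("XML-LINK") == "SIMPLE" and child.get("SHOW") == "EMBED":
            target = ET.fromstring(load(child.get("HREF")))
            embed_links(target, load)   # embedded documents may link further
            target.tail = child.tail    # keep any text that followed the link
            parent[index] = target
        else:
            embed_links(child, load)
    return parent


root = ET.fromstring(
    '<doc><ITEM XML-LINK="SIMPLE" HREF="myrdf.xml" SHOW="EMBED" ACTUATE="AUTO"/></doc>'
)
embed_links(root)
print([elem.tag for elem in root.iter()])
```

A different application could just as reasonably keep the ITEM element and
insert the target as its child, which is exactly the kind of disagreement
about EMBED mentioned above.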
>

>
> > - I want to be able to validate XML 'objects'. XML can do this
> > syntactically, but not semantically. For this I need additional
> > 'recipes' and code
>
>And you always will, no matter how XDEV is designed.  I've implemented
>SQL-based data management systems, and SQL's type checking is _never_
>enough (or even close).  Certainly we could modify XML so that parsers
>could perform validations like
>
>  - the contents of this element must be a number 
>  - the contents of this element must not be empty
>
>but we'd just make the parsers bigger and wouldn't help much anyway.
>After all, in real-world applications you always need to perform
>validations along these lines:
>
>  - the contents of the element must be the name of an American city
>    with a population over 500,000
>  - the contents of the element must be a name mentioned in a list in
>    a different XML document
>  - the contents of the element must be a valid Internet domain name

My approach to this is to write Element-specific code which is activated at
various processing times, e.g. Atom.process(). [I also have an
Atom.display().] This, of course, implies that the validation (or display)
of the element is context-independent, but I'm optimistic that - for the
sort of things I'm interested in - that will be true. I can easily see:
	Float.validate();
	Molecule.validate();
	Table.validate();
	URL.validate();
being standalone functions and re-usable in different environments. They
can also easily be overridden at the same stages as stylesheets. Your first
two examples are admittedly context-dependent.
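
The idea of standalone, overridable per-element validators might be sketched
as follows. This is Python for brevity rather than the Java classes referred
to above, and the element names Float and URL, the VALIDATORS table and the
validate() function are all hypothetical. Each check looks only at its own
element, so it is context-independent in the sense described.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def validate_float(elem):
    """True if the element content parses as a floating-point number."""
    try:
        float(elem.text or "")
        return True
    except ValueError:
        return False


def validate_url(elem):
    """True if the element content has both a scheme and a host part."""
    parts = urlparse(elem.text or "")
    return bool(parts.scheme and parts.netloc)


# Element name -> validator. Callers can pass their own table to override
# or extend these, much as a stylesheet can be overridden.
VALIDATORS = {"Float": validate_float, "URL": validate_url}


def validate(root, validators=VALIDATORS):
    """Return (tag, text) for every element that fails its validator."""
    failures = []
    for elem in root.iter():
        check = validators.get(elem.tag)
        if check is not None and not check(elem):
            failures.append((elem.tag, elem.text))
    return failures


doc = ET.fromstring(
    "<Table><Float>1.5</Float><Float>abc</Float>"
    "<URL>http://www.venus.co.uk/vhg</URL><URL>not a url</URL></Table>"
)
print(validate(doc))
```

Elements without an entry in the table are simply skipped, so the same
function can run over documents that mix several vocabularies.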


>I think that XML and SGML were smarter to leave all of this to the
>application-specific processing software in the first place.

Agreed. I think one role of XML-DEV is to see what agreement(s) are
possible for the next step.

>
> > - I want to be able to transform XML objects into other XML
> > objects. XSL is tantalisingly close to being able to do this but I
> > believe - at present - that W3C XML-transformation activity is
> > 'undefined'.
>
>Architectural forms will bring you part-way there.  For one proposal, see
>
>  http://home.sprynet.com/sprynet/dmeggins/xml-arch.html

I have read - and appreciated - this. I think that, without having an
AF-aware processor to hand, and a friendly guru, it's too difficult for
*me*. And certainly for my community. But I know there are a lot of
devotees of AFs on this list, and perhaps they can come to a communal view
as to whether there is agreement as to how they are to be used in XML and
what software is required (because they do need software).

>
> > - I want to be able to send XML objects to other people *with* a
> > prior contract as to how these are to be used. XML can partially solve
> > this at present using DTDs, controlled prose and vocabularies and
> > *bespoke applications* (i.e. a different application for each DTD.)
> > This is as far as X*L goes. Much of the X*L prose stresses that
> > particular activity is left to the *application*. This means that
> > XML documents often need to be authored, knowing what application
> > is going to be used to process them.  This is, presumably, the way
> > that CDF is designed - you have to have a 'CDF processor'. However
> > it does not support *generic* applications (or even generic
> > components of applications).
>
>As, I think, Paul Prescod has noted, nothing but a Turing-complete
>language could do this.  XML is a method for creating applications --
>it is not an application itself, and each application will need its
>own conventions, etc.

Well, I'm probably mad. But I still feel that (at least parts of) an
XML-processor can be document-independent.

> > - I wish to be able to send hypermedia. XLL specifically declines to add
> > any semantics to the syntax, other than an (implied) HTML-like behaviour
> > for some of the SIMPLE links.
>
>Are notations not suitable for specifying this information?

I don't know :-). I have never used NOTATION. Seeing your example above
suggested that it may be useful. Maybe it will add type information to the
thing pointed at?

XLL states that there is an attribute 'BEHAVIOR' but says nothing about
what it is for.  It would be valuable (as I have already posted) if there
were some consensus about the values and their meaning.
>
> > - I wish to send objects to other people who will print them out
> > and read them. XSL solves this.
>
>Yes, it may.  I wonder if document-viewing will end up being a major
>XML application, when most of the effort right now seems to be going
>into transactions and meta-data.

I think the definition of 'document' will effectively broaden. I see no
reason why non-textual objects cannot be regarded primarily as 'documents'.
>
[...]
>
>Here, again, architectural forms will help.  As long as you use a DTD,
>and the DTD implements a "food information" base architecture, the
>supermarket will be able to incorporate your molecular information
>automatically.

Ah - but this is the problem. I have no idea who will use my information
and that is why I think that AFs are limited in my area. In Java classes,
for example, I can use the Date class without the authors knowing I exist.
I hope that others can use my Molecule class/element in the same way. 


>
>XML-DEV would provide simple solutions to a few additional simple
>problems, but in the end (as with SQL), people will still have to do a
>lot of work in the middleware.  I cannot usefully dump the SQL tables
>from my database and send them to someone else without a lot of
>integration and customisation work, unless we planned our tables
>together from the start.

No question. Maybe I think there will be a lot of newcomers with simple
problems to which there will be simple solutions. Just as there were with
HTML. Maybe I'm wrong :-), and most of the problems will have to map
onto very thoroughly worked out solutions on a per-problem basis. We'll see.

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



