XML API specification

Peter Murray-Rust Peter at ursus.demon.co.uk
Wed Feb 26 22:35:44 GMT 1997

I'm really impressed with the contributions - it seems clear that there
are (rightly) some ambitious visions.

In message < at pop.intergate.bc.ca> Tim Bray writes:
> Fortunately, we're not starting from scratch.  We have two strawman
> I would propose seriously that Java be the basis of the first 
> cut at an API spec; it is really very pleasingly clean,
> and also has the virtue that ideas can be tested more or less
> instantly because there's running parser code to graft them
> onto. - Tim

I'd agree with this.  I think there is a case for a completely basic 
parser which builds from Lark/NXP and explores the basic issues that
PhaseI raises.  There are usually grey areas in a spec, however well it 
is written (admittedly I haven't found any!) and it's useful to
have code to throw at trial documents.

Both Lark and NXP are excellent products and I could use either.  The main
question is what API they present and how I would use it.  From my point
of view the namespace for objects is critical and I suspect it isn't 
difficult, so here goes.

I wrote a parser, which I shall junk, which takes an SGMLStream or an
ESISStream and parses it into a Tree which contains Nodes.  (I actually call
them SGMLTree and SGMLNode, but I don't think this is useful.)  Nodes have
an Attlist, which consists of Attributes.  This Tree is isomorphous to 
a Lark which contains Elements.
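
As a minimal sketch of the structure just described (a Tree of Nodes, each
Node carrying an Attlist of Attributes), assuming modern Java collections.
The method names and the ATOMS/BONDS element types in the usage note are
hypothetical; none of this is the Lark or NXP API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; a sketch, not the Lark or NXP API.
class Attribute {
    final String name;
    final String value;

    Attribute(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

// An Attlist keeps Attributes in document order, keyed by name.
class Attlist {
    private final Map<String, Attribute> byName = new LinkedHashMap<>();

    void add(Attribute a) {
        byName.put(a.name, a);
    }

    // Returns the attribute value, or null if no such attribute was given.
    String getValue(String name) {
        Attribute a = byName.get(name);
        return a == null ? null : a.value;
    }
}

// A Node knows its element type, its attributes, its parent and its children.
class Node {
    final String elementType;
    final Attlist attlist = new Attlist();
    private final List<Node> children = new ArrayList<>();
    private Node parent;

    Node(String elementType) {
        this.elementType = elementType;
    }

    Node addChild(Node child) {
        child.parent = this;
        children.add(child);
        return child;
    }

    List<Node> getChildren() {
        return children;
    }

    Node getParent() {
        return parent;
    }
}

// A Tree is little more than a handle on the root Node.
class Tree {
    final Node root;

    Tree(Node root) {
        this.root = root;
    }

    int countNodes() {
        return count(root);
    }

    private static int count(Node n) {
        int total = 1;
        for (Node c : n.getChildren()) {
            total += count(c);
        }
        return total;
    }
}
```

A parser would call addChild() on each start tag and step back to
getParent() on each end tag; a MOL node with ATOMS and BONDS children
then gives a Tree of three Nodes.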

The Nodes are then subclassed (Java terminology, and I'd like a Java
namespace to evolve from this as well as an IDL - I don't know enough about
IDL yet).  The first subclass is DrawableNode (which manages whether the
children are visible, what the icon is etc.).  Any WF document reaches this

The next question is the DTD.  I create a DTD class (e.g. HTMLDTD, CMLDTD) 
whose role is to provide the information from the DTD to the application.
I think this should be straightforward but I'm not clear enough to be sure
what it should present.  At present mine simply contains GIs (someone
suggested ElementType was a better name.)  It should also contain all the
information in the normalised DTD, e.g. ContentModels, Attlists - what else
is needed?
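
One possible shape for such a DTD class, offered only as a sketch: it holds
ElementTypes by name, and each ElementType carries its content model and its
attribute declarations.  Every name here (DTD, ElementType, AttributeDecl and
their methods) is an assumption for illustration, and the content model is
kept as raw text rather than parsed.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical names throughout; a sketch of what a DTD object might expose.
class AttributeDecl {
    final String name;
    final String declaredType;  // e.g. "CDATA" or "ID"
    final String defaultDecl;   // e.g. "#REQUIRED", "#IMPLIED" or a literal

    AttributeDecl(String name, String declaredType, String defaultDecl) {
        this.name = name;
        this.declaredType = declaredType;
        this.defaultDecl = defaultDecl;
    }
}

class ElementType {
    final String name;          // the GI
    final String contentModel;  // raw text of the model, not parsed here
    private final Map<String, AttributeDecl> attlist = new LinkedHashMap<>();

    ElementType(String name, String contentModel) {
        this.name = name;
        this.contentModel = contentModel;
    }

    // Returns this so that declarations can be chained.
    ElementType declareAttribute(AttributeDecl decl) {
        attlist.put(decl.name, decl);
        return this;
    }

    // Returns the declaration, or null if the attribute is not declared.
    AttributeDecl getAttributeDecl(String attName) {
        return attlist.get(attName);
    }
}

class DTD {
    private final Map<String, ElementType> elementTypes = new LinkedHashMap<>();

    void declare(ElementType type) {
        elementTypes.put(type.name, type);
    }

    // Returns the ElementType, or null if the name is not declared.
    ElementType getElementType(String name) {
        return elementTypes.get(name);
    }

    Set<String> getElementTypeNames() {
        return Collections.unmodifiableSet(elementTypes.keySet());
    }
}
```

A validating editor could then ask the DTD for the ElementType of the
current Node and consult its attribute declarations directly.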

My code works like this - when it reads a DOCTYPE it loads the Class for the
DTD (e.g. CMLDTD.class).  This has to be located somewhere ... let's assume
that is not a problem.  This class also has to load the classes to support the
ElementTypes.  So, for example, when the parser finds an Element with
ElementType 'MOL', it checks to see if MOL.class is part of CMLDTD; if it is,
it loads it if necessary, and then it creates an Element of that subclass:
    public class FOO extends DrawableElement {...}
in my case.
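
The lookup-and-load step can be sketched with Java reflection.  This is an
illustration under stated assumptions, not the code described above: the
element classes are nested inside CMLDTD so that the sketch is
self-contained, the SUPPORTED set and the create() method are hypothetical,
and an unsupported ElementType falls back to a plain DrawableElement.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical names throughout; a sketch, not the Lark or NXP API.
class Element {
    final String elementType;

    Element(String elementType) {
        this.elementType = elementType;
    }
}

class DrawableElement extends Element {
    boolean childrenVisible = true;

    DrawableElement(String elementType) {
        super(elementType);
    }
}

class CMLDTD {
    // ElementTypes for which this DTD supplies a specialised class.
    static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList("MOL"));

    // Specialised class for the MOL ElementType.
    static class MOL extends DrawableElement {
        MOL() {
            super("MOL");
        }
    }

    // Creates an Element for the given ElementType, loading the
    // specialised class by name when the DTD supports one.
    static Element create(String elementType) {
        if (!SUPPORTED.contains(elementType)) {
            return new DrawableElement(elementType);
        }
        try {
            // Nested classes have binary names of the form Outer$Inner.
            Class<?> c = Class.forName(CMLDTD.class.getName() + "$" + elementType);
            return (Element) c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Class missing or not instantiable: fall back to the generic class.
            return new DrawableElement(elementType);
        }
    }
}
```

Class.forName() loads the class only on first use, which matches the
"loads it if necessary" behaviour; CMLDTD.create("MOL") yields a MOL
instance, while any other ElementType yields a generic DrawableElement.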

This is all fine, but it needs rewriting, and I would like to see the
namespace agreed reasonably soon.  I'm happy to go along with anything that
is suggested.  

The DTD class is (IMO) critical if we are going to have a validating editor,
so that we can refer to individual Attlists, etc.  Does NXP (which validates)
contain the DTD as a subcomponent, for example?

On the general strategy, I think that it's important (though difficult) to
balance the needs of a global approach (using groves, etc.) and something
that implements PhaseI.  I'd like to argue for a PhaseI tool, partly 
because most of it is there in pieces, and partly because the rest is
at the mercy of the final spec.  There will also be some users of XML who 
are quite happy to stay with PhaseI software for their problems initially
and move on as they see the need.



Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)
