Peter Murray-Rust peter at
Sun Dec 28 12:48:32 GMT 1997

At 07:42 27/12/97 -0500, David Megginson wrote:
>Peter Murray-Rust writes:
> > >1) (My suggestion.)  A pre-DOM interface, defining the events returned
> > >   by an XML parser, and providing enough information to build a DOM
> > >   tree (PIs, attributes, elements, data, DOCTYPE declarations, etc.).
> [...]
> > By "pre-DOM" I assume you mean:
> > 	- valid only until the DOM comes into effect
> > 	- (possibly) a subset of DOM functionality.
>Actually, I am using it in a linear-processing model (you must view
>this with a monospaced font like Courier):

[Note for hypermail readers - hypermail destroys any space-dependent
formatting. I don't think there is a way round this. In XML, of course,
there will be no problem - will there?]

>* David's Model:
>  PARSER --> SAX-J --> DOM --> [tree-based user application]
>               |
>               --------------> [event-based user application]
>In other words, a DOM builder would be just another event-based
>SAX-J application.

This is roughly what JUMBO does at present. It uses the event-based
interfaces of Ælfred, Lark and NXP and builds a tree from the result. This
tree (or primitive grove) actually contains tree-structured representations
of PIs and DTDs.  Therefore JUMBO has no problem in using this SAX-J model.
[The question is whether SAX-J would offer doPI, doAttlist, etc.]
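
To make the "tree builder as just another event-based application" idea
concrete, here is a minimal Java sketch. The EventHandler interface and its
method names are hypothetical, invented for illustration; they are not
SAX-J's, nor those of any parser named above. The builder keeps a stack of
open elements and attaches each incoming event to the element on top.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical event interface; names are illustrative only, not SAX-J's.
interface EventHandler {
    void startElement(String name);
    void endElement(String name);
    void characters(String text);
    void processingInstruction(String target, String data);
}

// Minimal tree node: elements, text and PIs all live in the same tree.
class TreeNode {
    final String kind;   // "element", "text" or "pi"
    final String name;   // element name or PI target; null for text
    final String value;  // text content or PI data; null for elements
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String kind, String name, String value) {
        this.kind = kind;
        this.name = name;
        this.value = value;
    }

    // Compact one-line rendering, used only to inspect the result.
    String dump() {
        if (kind.equals("text")) return "text:" + value;
        if (kind.equals("pi")) return "pi:" + name;
        StringBuilder sb = new StringBuilder(name).append('[');
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(children.get(i).dump());
        }
        return sb.append(']').toString();
    }
}

// The tree builder is just one more consumer of the event stream.
class TreeBuilder implements EventHandler {
    final TreeNode root = new TreeNode("element", "#root", null);
    private final Deque<TreeNode> open = new ArrayDeque<>();

    TreeBuilder() { open.push(root); }

    public void startElement(String name) {
        TreeNode n = new TreeNode("element", name, null);
        open.peek().children.add(n);
        open.push(n);
    }
    public void endElement(String name) { open.pop(); }
    public void characters(String text) {
        open.peek().children.add(new TreeNode("text", null, text));
    }
    public void processingInstruction(String target, String data) {
        open.peek().children.add(new TreeNode("pi", target, data));
    }
}

class TreeBuilderDemo {
    // Stands in for a parser: fires the events for <doc><?snark x?>hi</doc>.
    static TreeNode build() {
        TreeBuilder b = new TreeBuilder();
        b.startElement("doc");
        b.processingInstruction("snark", "x");
        b.characters("hi");
        b.endElement("doc");
        return b.root;
    }

    public static void main(String[] args) {
        System.out.println(build().dump());
    }
}
```

The PI ends up as an ordinary node in the tree, which is the property
described above for JUMBO's primitive grove.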

This suggests that implementers of the DOM (or other tree-related
interfaces) will build on top of SAX-J. IFF you/we can persuade them of
this, then great. If not, then there might be a tendency for SAX-J to
atrophy after the DOM.

> > >2) (Tim's suggestion.)  A post-DOM interface, for people who don't
> > >   want to learn the complexity of the DOM, and providing only the
> > >   minimum possible information (elements, attributes, and data).
> > 
> > By "post-DOM" I assume you mean "will not be obsoleted by the DOM", rather
> > than "cannot be put into operation until the DOM".
>In this case, I am using a slightly different model:
>* Tim's Model:
>  PARSER --> DOM --> SAX-J --> [event-based user application]
>              |
>              ---------------> [tree-based user application]
>In other words, SAX-J would be just a simpler event-based API for the

In this diagram there is a possible assumption that SAX-J can only be
finalised after the DOM is finalised. I hope not, because otherwise the
urgency will disappear. If, however, the DOM used different *terminology*
from SAX-J (and terminology is a prime concern of mine) then we should have
a conflict and a confusion.
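
For contrast, the second diagram (events derived from an already-built tree)
can be sketched as a depth-first walk. This uses the org.w3c.dom classes
bundled with current Java purely as a stand-in for "the DOM"; the event
strings ("start:", "data:" and so on) are made up for illustration and are
not proposed terminology.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TreeToEvents {
    // Depth-first walk that records one event per node, in document order.
    static void walk(Node node, List<String> events) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE) {
            events.add("data:" + node.getNodeValue());
            return;
        }
        if (type == Node.PROCESSING_INSTRUCTION_NODE) {
            events.add("pi:" + node.getNodeName()); // node name of a PI is its target
            return;
        }
        boolean element = type == Node.ELEMENT_NODE;
        if (element) events.add("start:" + node.getNodeName());
        // The document node and elements both descend into their children.
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, events);
        }
        if (element) events.add("end:" + node.getNodeName());
    }

    // Build the tree first, then derive the event list from it.
    static List<String> eventsFor(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> events = new ArrayList<>();
            walk(doc, events);
            return events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eventsFor("<doc><?snark x?>hi</doc>"));
    }
}
```

In this arrangement no event can be emitted until the whole tree exists,
which is the practical difference from the first model.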

>I don't see a very pressing need for the latter -- tree structures are
>familiar to coders, whether or not they know anything about XML -- but I
>would be happy to implement it if requested (I do not want to exclude
>PI's, however, since that will exclude the possibility of using
>architectural forms and other standards working on top of XML).

I do not wish PIs to be excluded either, since they are the primary
(suggested) mechanism for namespaces. However the problems of **knowing
what to do with PIs** far outweigh the problems of *reading* them :-). IOW
if I use doPI() (in Lark) or processingInstruction() in Ælfred I get a
result like:

PIString='snark="Boojum" vanish="softly silently"'

whose semantics are far more difficult than simply capturing the contents.
I'd just like to have a single syntax for:
	- the method name (e.g. doPI())
	- the 'target' (e.g. "target")
	- the rest-of-the-PI (e.g. PIString)
The same is even more important for NOTATION. If people are actually going
to use NOTATION, then *please* give us some handles for the bits :-)
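
The "handles for the bits" of a PI could look something like the Java sketch
below. Everything here is an assumption for illustration: the class name
PiParts, the split-at-first-whitespace rule, and above all the idea that the
rest-of-the-PI holds name="value" pairs, which is only a convention and
nothing XML itself requires.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder for the two parts of a PI: the target and the rest.
class PiParts {
    final String target;
    final String data;

    PiParts(String target, String data) {
        this.target = target;
        this.data = data;
    }

    // Split the raw text found between "<?" and "?>" at the first whitespace run.
    static PiParts split(String raw) {
        String s = raw.trim();
        int i = 0;
        while (i < s.length() && !Character.isWhitespace(s.charAt(i))) i++;
        return new PiParts(s.substring(0, i), s.substring(i).trim());
    }

    // Assumed convention: name="value" pairs separated by whitespace.
    static final Pattern PAIR =
            Pattern.compile("([A-Za-z_][\\w.-]*)\\s*=\\s*\"([^\"]*)\"");

    // Reading the data is the easy part; deciding what the pairs mean is not.
    Map<String, String> pseudoAttributes() {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(data);
        while (m.find()) out.put(m.group(1), m.group(2));
        return out;
    }

    public static void main(String[] args) {
        PiParts p = split("target snark=\"Boojum\" vanish=\"softly silently\"");
        System.out.println(p.target);
        System.out.println(p.pseudoAttributes());
    }
}
```

A PI whose data does not follow the name="value" convention simply yields an
empty map here, so the raw data string still has to be available.
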
>Is this what everyone else is expecting?

I was actually expecting:

  PARSER --> SAX-J --> [event-based user application]

where SAX-J is a black box that emits Attributes, Elements, Data (and
possibly PIs, NOTATIONs and DTD components in decreasing order of

I appreciate that for those constructing parsers, the question of where
SAX-J sits w.r.t. DOM is important and possibly affects their ease in
implementing SAX-J. And, of course, we must make it easy for parser
writers, since if it isn't they won't play. From the *user*'s point of
view, the interior of SAX-J is irrelevant :-)
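
From the user's side, the black-box view amounts to something like the
following Java sketch. It uses the org.xml.sax classes that ship with
current Java only as a stand-in for the black box; this thread does not
define SAX-J's actual methods, and the EventLogger class is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// User application: it sees only the events, never the parser's interior.
class EventLogger extends DefaultHandler {
    final List<String> log = new ArrayList<>();
    final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        log.add("element " + qName);
        for (int i = 0; i < atts.getLength(); i++) {
            log.add("attribute " + atts.getQName(i) + "=" + atts.getValue(i));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // A parser may deliver character data in several chunks, so accumulate.
        text.append(ch, start, length);
    }

    @Override
    public void processingInstruction(String target, String data) {
        log.add("pi " + target);
    }

    // Run whatever parser the platform supplies over the given document.
    static EventLogger run(String xml) {
        EventLogger handler = new EventLogger();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler;
    }

    public static void main(String[] args) {
        EventLogger h = run("<doc id=\"a1\"><?snark x?>hi</doc>");
        System.out.println(h.log + " text=" + h.text);
    }
}
```

Which parser sits behind the factory call is invisible to the handler, which
is the sense in which the interior is irrelevant to the user.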


My interpretation of "pre-" and "post-" on the time, rather than space,
coordinate can be dropped, except to say that it's critical not to waste
time "waiting for the DOM".

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS, Virtual Hyperglossary
