SAX: towards a solution

Peter Murray-Rust peter at
Sat Jan 3 15:09:55 GMT 1998

At 07:51 03/01/98 -0500, David Megginson wrote:
>We have had an interesting discussion about SAX ("simple API for XML"?
>I cannot remember) the past few weeks, and now it's time to get
>specific.  I will be posting a series of separate messages, each
>containing a single SAX design question, and will look for a consensus
>on each one.  Here are the design topics that I will be posting on
>over the weekend:

David - this is brilliant. It is exactly the way that the XML-WG/SIG works
- and that works very well.  For the benefit of those who haven't read the
public XML-SIG archives (used to be called WG, confusingly) it is one of
the best decision-making processes I have come across anywhere, since it
combines precision, democracy, adherence to deliverables, etc. It works like
this:

There is an editorial group (originally Editorial Review Board (ERB), now
XML-WG), composed of W3C member representatives and a few invited experts.
Currently 16 - see the PR. It has deliverables set by the W3C processes.
They invite a wider group (XML-SIG) - about 100 I think - which helps them
like this.

The WG propose a draft spec (or part of a spec) and have done this for XML,
XLL and XSL. Sometimes they will ask for general comments, other times they
will ask quite specific questions - exactly as DavidM is doing.  The SIG
members may then respond in whatever way they feel is appropriate -
sometimes discussing details, sometimes arguing about strategy. On occasion
the chair (Jon Bosak) may decide that the discussion is out of scope
(off-topic). The standard of discipline is very high, and members
invariably respect this.

The WG "meets" (phone conference) every week or so and makes formal
decisions. Votes can be taken and recorded. These are reported back to the
SIG as appropriate - sometimes continued discussion is requested. The WG
does a *great* deal of work - various members at SGML 97 Europe recounted
that XML had taken over their life. I believe that different specialities
are assigned to different individuals on occasion.

I have written this at length because I would like to see if XML-DEV could
"borrow" some of this process. [There are of course many differences - the
membership of XML-DEV is self-selecting (but still of very great technical
quality and list discipline). Of course there is a lot of overlap with the
SIG and WG.] In the current situation I'd suggest that - if they can -
DavidM and TimB (no more) form a mini-WG and act as DavidM has suggested.
If this is too difficult (and of course *I* can't pay their phone bills),
then DavidM should do this unilaterally.

The miniWG should then solicit comments on the topics. Anyone should feel
free to respond, but with the aim of helping the miniWG make
decisions. [Offers of help may be very valuable here.] At the end of the
determined period (measured in days, not longer) the miniWG will present
the proposal. At that stage I would expect the proposal to be largely
finalised apart from some detailed corrections or clarifications.

*** When the miniWG submit their request for comments, could they use a
clear and simple subject for each topic, and could respondents use
precisely that*** (e.g. SAX: DOCSTART/END). Also, let us stick closely to
the goal of simplicity.

>1) Document start and end.
>2) External entity start and end.
>3) External entity resolution.
>4) Error reporting.
>5) Whitespace handling.
>6) Processing instructions.
>7) Comments.
>8) Doctype declaration.
>9) Parser interface.
>10) Naming and packaging.

Looks fine to me. I'm not saying I *agree* with all these topics - only
that they are reasonable ones to ask questions about. Example: "Should SAX
consider whitespace?" - I would guess some people would answer "No". If the
miniWG takes the sense of the community, and marries it with the chance of
achieving something, then some decisions may indeed end up as NO. If it
makes sense to take these topics in some order, it may be useful to do so
(the WG often does not issue everything at once.) For example, Naming and
packaging might be asynchronous from whitespace.

>Before we start, I am assuming that we will all accept the following
>three events without further discussion, since no one objected last
>  startElement (String name, java.util.Dictionary attributes)
>  endElement (String name)
>  charData (char ch[], int length)

>I am also assuming that we will provide not only a callback interface,
>but also an (optional) base class with stub methods that implementors
>can override as needed; that means that novice users will not have to
>implement all of SAX, even if we do end up with nine or ten methods.
>For example, if you're only interested in character data, you can try
>something like this:
>  public class MyApplication extends XmlAppBase {
>    public void charData (char ch[], int length)
>    {
>      System.out.println(new String(ch, 0, length));
>    }
>  }
>There is not need for the programmer to worry about startElement(),
>endElement(), since they are already implemented as empty stubs in

I have found this very useful with AElfred. Does this have implications for
other languages (e.g. Tcl is not OO - at least not when I used to use it)?
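
To make the stub-base-class idea quoted above concrete, here is a minimal
sketch, not a proposed API. The three event signatures and the name
XmlAppBase come from DavidM's message; the names SaxSketch, XmlApplication,
TextCollector and fireSampleEvents are hypothetical, the "parser" is just a
stand-in that fires hand-written events, and the generic type parameters on
Dictionary are a modern convenience that the quoted signature does not have.

```java
import java.util.Dictionary;
import java.util.Hashtable;

class SaxSketch {
    // Hypothetical callback interface carrying the three agreed events.
    interface XmlApplication {
        void startElement(String name, Dictionary<String, String> attributes);
        void endElement(String name);
        void charData(char[] ch, int length);
    }

    // Base class with empty stubs; applications override only what they need.
    static class XmlAppBase implements XmlApplication {
        public void startElement(String name, Dictionary<String, String> attributes) {}
        public void endElement(String name) {}
        public void charData(char[] ch, int length) {}
    }

    // An application interested only in character data.
    static class TextCollector extends XmlAppBase {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void charData(char[] ch, int length) {
            text.append(ch, 0, length);
        }

        String text() {
            return text.toString();
        }
    }

    // Stand-in for a parser (an assumption for illustration): fires the
    // events for <greeting lang="en">Hello</greeting>.
    static void fireSampleEvents(XmlApplication app) {
        Dictionary<String, String> attributes = new Hashtable<>();
        attributes.put("lang", "en");
        app.startElement("greeting", attributes);
        char[] chars = "Hello".toCharArray();
        app.charData(chars, chars.length);
        app.endElement("greeting");
    }

    public static void main(String[] args) {
        TextCollector app = new TextCollector();
        fireSampleEvents(app);
        System.out.println(app.text()); // prints "Hello"
    }
}
```

TextCollector never mentions startElement() or endElement(); the inherited
stubs absorb those events. A language without classes could presumably get
a similar effect with a table of default no-op callbacks.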


>_Please_ keep this model in mind when you are commenting on the
>relative simplicity or complexity of SAX for users (as opposed to
>parser programmers) -- extra functionality is cheap, since it can be
>hidden away like this.
Agreed - the ability to "bring in" new functionality as one's application
evolves is very useful.  Having to do everything at the start can be tough
- there is not only more code to implement, but unnecessary concepts have
to be learned.


Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS, Virtual Hyperglossary
