Finishing SAX

Peter Murray-Rust peter at ursus.demon.co.uk
Sat Apr 18 12:35:45 BST 1998


At 21:13 17/04/98 -0400, David Megginson wrote:

Firstly we owe an enormous debt to David for his effort which was *far*
more than I imagined (and I'm sure more than he imagined as well). Past
history had shown that good intentions on XML-DEV didn't always lead to
robust, finished results, and we'd also been down this road before. I think
that his self-imposed deadlines and the discipline of the process have been
extremely useful.

[...]
>It's time to finish SAX level 1: many people (both parser and
>application writers) have been waiting patiently, and I think that
>we're probably well past the 80 part of the 80/20 rule: no matter what
>we decide, SAX will be less well suited for some applications and

I think this is absolutely right. It's right to finish now. One very
important message is that *to get the interoperability that we want in XML*
we have to work very hard at the basics. Some of the 'simpler' issues have
turned out to be quite complex. However it's also clear that it is of
enormous benefit - without SAX I would have wasted more time than the
effort I have put into helping David and the process. And this is surely
true also for other developers.

>parsers than for others, and there will certainly be smug comments in
>the future about how we got obvious things wrong (the kind of comments
>that I, in moments of weakness, have been heard to make about other
>people's APIs).

I don't think there will be smug comments, especially since the process has
been open and the community 'owns' the result, any more than there are smug
comments about how 'XML got it wrong'. The balance is between technical
issues and people's ability to use the result effectively. It could be
valuable to present the *process* in the final version since I believe it's
as good as can be achieved by this - and perhaps any - process.
>
>I had originally planned SAX as two tiny interfaces occupying 1 or 2
>kilobytes, with extremely limited functionality.  What we've ended up
>with is the collective design of the XML membership, which is
>considerably larger and more complex than I had originally planned
>(though sax.jar file is still only 8,174 bytes long), but also much
>more functional and elegant -- it's not what I wanted, but I have to
>confess that I like it quite a bit.

I'd agree with this analysis. From an application programmer's point of
view the overall interface has a lot of functionality and to understand it
all involves a number of distinct issues, the latest being character
management. This is part of the learning and investment process - the good
news is that by learning SAX an implementer will learn a great deal about
XML systems in general - exceptions, streams, components of an XML
document, etc. 

A key resource - which David has provided, but which may be worth
commenting on further - is the provision of 'default' or introductory
implementations. An analogy with the Java SwingSet may be useful. This has
*zillions* of packages, with relatively little documentation and few
examples for some of them. However there are special classes for
'beginners', such
as DefaultMutableTreeNode. [This is one of the main classes that I use to
build an interface to SAX] This provides 'almost everything' that the
newcomer needs to get off the ground very fast. In a similar way, SAXDemo
(if it's called that still) is *the* place to start until you need special
functionality. So documentation and guidance for newcomers are critical -
and I hope to address some of this in JUMBO2 when I get the final release
of SAX.
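
As a rough illustration of starting from a 'default' implementation: the
sketch below assumes the SAX API as it later shipped in the Java platform,
where org.xml.sax.helpers.DefaultHandler supplies do-nothing versions of
every callback. The class name ElementLister and its list() helper are
hypothetical, and this is not SAXDemo itself. A newcomer overrides only the
one event of interest and ignores the rest:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: collects element names in document order.
// Every callback not overridden here falls back to DefaultHandler's no-op.
public class ElementLister extends DefaultHandler {
    private final List<String> names = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        // The parser is not namespace-aware by default, so qName holds
        // the element name exactly as written in the document.
        names.add(qName);
    }

    public List<String> names() {
        return names;
    }

    // Convenience helper (an assumption of this sketch): parse a string
    // and return the element names seen.
    public static List<String> list(String xml) {
        ElementLister handler = new ElementLister();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.names();
    }
}
```

Error handling, entity resolution and character data are all left to the
inherited defaults here, which is the point: they can be picked up later,
one issue at a time.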
>
>I'd like to suggest that we allow a few more days for discussion, then
>simply stop at the end of the day next Tuesday (23 April) and give me
>the rest of the week (and possibly the weekend) to put together the
>final, official SAX level-one release.  If you have issues, speak now,
>or forever hold your peace (at least when I'm in the room).  In a
>separate message, I'll revisit the issue of byte and character
>streams.

I have kept quiet on issues such as character streams and error handling,
trusting in the communal judgment of XML-DEV to get this 'right'. It will
be important to give a road map of the interface and - where possible - to
identify those components which can be re-used outside SAX.  I assume that
the current discussion will have been useful to those considering the DOM
API and how it can be implemented.

There is also a role for library routines at this level. For example
'makeAbsoluteURL()' is useful elsewhere and could reasonably be highlighted
in the SAX distribution. [This is not strictly an API matter, but would
bring benefits.] In the same way, generic tools for parser implementers,
such as Name validation, would be valuable, and it might be worth compiling
a list of such utilities (a sax.Util, say) in the distribution.
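
To make the idea concrete, here is a minimal sketch of what two such
library routines might look like. The class name SaxUtilSketch, the method
signatures and the use of java.net.URI are all assumptions for
illustration; none of this is taken from the SAX distribution. The Name
check is deliberately restricted to ASCII, whereas the Name production in
the XML specification admits many more characters.

```java
import java.net.URI;

// Hypothetical utility class; names and signatures are assumptions,
// not part of the SAX distribution.
public final class SaxUtilSketch {
    private SaxUtilSketch() {}

    // Resolves a possibly relative system identifier against a base URL.
    // An identifier that is already absolute is returned unchanged.
    public static String makeAbsoluteURL(String base, String systemId) {
        return URI.create(base).resolve(systemId).toString();
    }

    // Simplified Name check, ASCII only: a letter, '_' or ':' first,
    // then letters, digits, '.', '-', '_' or ':'.
    public static boolean isAsciiXmlName(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        char first = s.charAt(0);
        if (!(isAsciiLetter(first) || first == '_' || first == ':')) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean ok = isAsciiLetter(c) || (c >= '0' && c <= '9')
                    || c == '.' || c == '-' || c == '_' || c == ':';
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }
}
```

With this sketch, resolving "chapters/ch1.xml" against
"http://www.example.org/docs/book.xml" gives
"http://www.example.org/docs/chapters/ch1.xml". Neither routine depends on
a parser, which is what would make them re-usable outside SAX.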
>
>As soon as we have this out of the way we can start talking about SAX
>level 2, which can support non-structural document events (like
>comments and CDATA sections), together with much more DTD information
>-- I already have some draft interfaces sketched out.

Wow! 

	P.


Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



