SAX: Status Report

David Megginson ak117 at freenet.carleton.ca
Mon Jan 5 03:23:40 GMT 1998


Here is a preliminary status report on SAX, summarising both public
and private correspondence.  This is, in fact, _very_ preliminary,
since some potential participants read e-mail only at work, and will
not even know about this thread until tomorrow (Monday) morning.


CORE EVENTS
-----------

So far, there seems to be general agreement on the following event
callbacks for the XmlApplication interface:

  public void startDocument ();
  public void endDocument ();
  public void endElement (String name);
  public void processingInstruction (String name, String remainder);
 
There is general agreement that the following two should be present,
but there is still discussion over their exact form (I'm still
tweaking the names a bit):

  public void characters (char ch[], int start, int length, ...?);
  public void startElement (String name, ...?);

(For the first, the open question is whether to add a flag for
ignorable whitespace; for the second, it is how to report
attributes.)
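
To illustrate how an application might consume the characters callback,
here is a rough sketch. It assumes the boolean "ignorable" flag form,
which is only one of the options under discussion; the TextCollector
class and its getText() method are hypothetical names invented for
this example.

```java
// Hypothetical application class; only the shape of characters()
// (with the still-undecided boolean flag) comes from the discussion.
class TextCollector {
  private final StringBuilder text = new StringBuilder();

  public void characters(char ch[], int start, int length, boolean ignorable) {
    if (ignorable) {
      return; // skip whitespace the parser has flagged as ignorable
    }
    // Copy only the reported slice of the array; the rest of the
    // buffer belongs to the parser and may hold unrelated data.
    text.append(ch, start, length);
  }

  public String getText() {
    return text.toString();
  }
}
```

Passing an array with an offset and a length lets the parser hand over
a slice of its own buffer without allocating a new String per call.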


ENTITIES
--------

There has been a lively and well-informed discussion on entity
handling.  Many participants are comfortable with something like the
following for external entities (including the external DTD subset,
which may contain processing instructions):

  public void startEntity (String ename, String publicID, String systemID);
  public void endEntity (String ename);

(There is also a question about whether public IDs should be
provided.)  Some participants suggest that SAX should provide no
information about external entities, while others suggest that the
XmlParser interface should have a getLocation() method instead.  The
main motivation for providing external-entity information (aside from
error reporting) is to resolve relative URIs in attribute values.
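
As a sketch of the relative-URI use case, an application could keep a
stack of base URIs driven by startEntity and endEntity. The
EntityTracker class, its constructor argument, and its resolve() method
are hypothetical; the example also uses present-day java.net.URI and
collections for brevity, and assumes the parser passes system IDs
through as written in the document.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of any SAX proposal: tracks the current
// external entity so relative URIs can be resolved against it.
class EntityTracker {
  private final Deque<URI> bases = new ArrayDeque<>();

  // Assumption: the application knows the URI of the document entity.
  EntityTracker(String documentSystemID) {
    bases.push(URI.create(documentSystemID));
  }

  // Mirrors the proposed startEntity callback.
  public void startEntity(String ename, String publicID, String systemID) {
    // A relative systemID is itself resolved against the enclosing entity.
    bases.push(bases.peek().resolve(systemID));
  }

  // Mirrors the proposed endEntity callback.
  public void endEntity(String ename) {
    bases.pop();
  }

  // Resolve a URI found in an attribute value against the current entity.
  public String resolve(String relativeURI) {
    return bases.peek().resolve(relativeURI).toString();
  }
}
```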

On the issue of entity resolution, there has been less feedback,
probably because the topic is a little confusing.  I have suggested
something like this:

  public String resolveEntity (String ename, String publicID, String systemID);

which would allow simple URI substitution and resolution of public
identifiers, if desired (in most cases, you could simply return the
systemID argument unmodified).  Another suggestion is a separate
EntityManager interface which would allow much more functionality.
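
A minimal sketch of the simple case, assuming resolveEntity returns the
URI the parser should actually read: a small table maps public
identifiers to local copies, and everything else falls through to the
systemID. The CatalogResolver class, its map() method, and the
identifiers in the usage note are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical application-side resolver built around the proposed
// resolveEntity callback.
class CatalogResolver {
  private final Map<String, String> catalog = new HashMap<>();

  // Register a local URI to use for a given public identifier.
  void map(String publicID, String localURI) {
    catalog.put(publicID, localURI);
  }

  public String resolveEntity(String ename, String publicID, String systemID) {
    if (publicID != null && catalog.containsKey(publicID)) {
      return catalog.get(publicID); // substitute the registered local copy
    }
    return systemID; // the common case: pass the system ID through unmodified
  }
}
```

For example, after map("-//EXAMPLE//DTD Memo//EN", "file:/dtds/memo.dtd"),
an entity declared with that public ID would be read from the local
file, whatever system ID the document supplies.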


ERROR REPORTING
---------------

A majority of participants seem to support using callbacks for error
reporting, partly to simplify cross-language support:

  public void warning (String message, int line, int column);
  public void fatal (String message, int line, int column);

Note the addition of the 'column' argument -- it has rightly been
pointed out that XML documents can consist of a single, long line, so
the line number itself may be useless.  If we do not have some general
way to determine the current entity (i.e. startEntity and endEntity),
we will also have to supply the URI of the current entity here.
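
To show what the callback style might look like from the application
side, here is a sketch that records warnings and remembers the first
fatal error. The ErrorCollector class and its accessor methods are
hypothetical, and the "line:column: message" format is just one
possible choice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical application class using the proposed error callbacks.
class ErrorCollector {
  private final List<String> warnings = new ArrayList<>();
  private String fatalError; // null until fatal() has been called

  public void warning(String message, int line, int column) {
    warnings.add(format(message, line, column));
  }

  public void fatal(String message, int line, int column) {
    if (fatalError == null) {
      fatalError = format(message, line, column); // keep only the first one
    }
  }

  private static String format(String message, int line, int column) {
    return line + ":" + column + ": " + message;
  }

  public List<String> getWarnings() {
    return warnings;
  }

  public boolean hasFatalError() {
    return fatalError != null;
  }

  public String getFatalError() {
    return fatalError;
  }
}
```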


PROLOG
------

No one sees a need for startProlog and endProlog events, but several
people would like to see an event for the DOCTYPE, if present:

  public void doctype (String name, String publicID, String systemID);

where publicID and systemID refer to the external DTD subset, if any.
This would help with autodetection of different document types.
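
As a rough illustration of autodetection, an application could look up
the public ID in a table of known document types and fall back to the
declared name. The DoctypeSniffer class, its register() and
getDetected() methods, and the "unknown" default are all assumptions
made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical application class built around the proposed doctype callback.
class DoctypeSniffer {
  private final Map<String, String> kindsByPublicID = new HashMap<>();
  private String detected = "unknown"; // stays this way if there is no DOCTYPE

  // Associate a public identifier with an application-defined label.
  void register(String publicID, String kind) {
    kindsByPublicID.put(publicID, kind);
  }

  public void doctype(String name, String publicID, String systemID) {
    if (publicID != null && kindsByPublicID.containsKey(publicID)) {
      detected = kindsByPublicID.get(publicID);
    } else {
      detected = name; // fall back to the name given in the DOCTYPE
    }
  }

  public String getDetected() {
    return detected;
  }
}
```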


COMMENTS
--------

Most people agree that there is no need for SAX to report comments.


PARSER
------

Everyone seems to like the idea of a common parser interface.


ARTIST'S RENDITION
------------------

Things are still up in the air, but here is some indication of what
SAX's central XmlApplication interface might look like in Java:


/* Beginning of XmlApplication.java */

public interface XmlApplication {

  //
  // Entities
  //
  public String resolveEntity (String ename, String publicID, String systemID);
  public void startEntity (String ename, String publicID, String systemID);
  public void endEntity (String ename);

  //
  // Document structure
  //
  public void startDocument ();
  public void endDocument ();
  public void doctype (String name, String publicID, String systemID);
  public void startElement (String name /* and attributes, somehow */);
  public void endElement (String name);
  public void characters (char ch[], int start, int length, boolean ignorable);
  public void processingInstruction (String name, String remainder);

  //
  // Error reporting
  //
  public void warning (String message, int line, int column);
  public void fatal (String message, int line, int column);

}

/* End of XmlApplication.java */


All of these would have default implementations in XmlAppBase -- the
seven core document-structure callbacks would all have empty
implementations.
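
A sketch of how XmlAppBase might be used follows. The interface is
repeated so the example stands alone, with startElement taking only a
name because the attribute form is undecided. Only the empty
document-structure callbacks come from the description above; the
defaults shown for the entity and error callbacks (return the systemID,
do nothing) are assumptions, and ElementCounter is a hypothetical
application class.

```java
interface XmlApplication {
  String resolveEntity(String ename, String publicID, String systemID);
  void startEntity(String ename, String publicID, String systemID);
  void endEntity(String ename);
  void startDocument();
  void endDocument();
  void doctype(String name, String publicID, String systemID);
  void startElement(String name); // attributes omitted: form still undecided
  void endElement(String name);
  void characters(char ch[], int start, int length, boolean ignorable);
  void processingInstruction(String name, String remainder);
  void warning(String message, int line, int column);
  void fatal(String message, int line, int column);
}

class XmlAppBase implements XmlApplication {
  // Assumed default: use the system ID unmodified.
  public String resolveEntity(String ename, String publicID, String systemID) {
    return systemID;
  }
  // Assumed defaults: ignore entity boundaries.
  public void startEntity(String ename, String publicID, String systemID) {}
  public void endEntity(String ename) {}

  // The seven document-structure callbacks are empty.
  public void startDocument() {}
  public void endDocument() {}
  public void doctype(String name, String publicID, String systemID) {}
  public void startElement(String name) {}
  public void endElement(String name) {}
  public void characters(char ch[], int start, int length, boolean ignorable) {}
  public void processingInstruction(String name, String remainder) {}

  // Assumed defaults: ignore errors unless a subclass overrides these.
  public void warning(String message, int line, int column) {}
  public void fatal(String message, int line, int column) {}
}

// An application overrides only the events it cares about.
class ElementCounter extends XmlAppBase {
  private int count;

  @Override
  public void startElement(String name) {
    count++;
  }

  public int getCount() {
    return count;
  }
}
```

With a base class like this, a simple application stays a few lines
long even if the interface grows.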


All the best,


David

-- 
David Megginson                 ak117 at freenet.carleton.ca
Microstar Software Ltd.         dmeggins at microstar.com
      http://home.sprynet.com/sprynet/dmeggins/
