SAX: Attributes and Entity Resolution

David Megginson ak117 at freenet.carleton.ca
Sun Jan 4 22:50:46 GMT 1998


During our discussion this weekend, we have had two excellent
proposals for additional SAX interfaces, beyond just XmlParser and
XmlProcessor:

1. An interface for entity resolution, rather than using a
   resolveEntity callback.

2. An interface for attributes, rather than using
   java.util.Dictionary.


ENTITY RESOLUTION
-----------------

While I agree that a full entity manager would be more powerful than a
simple callback, I am not certain that the power will really be needed
by most SAX users; furthermore, if it is needed, that functionality
can be supplied more generally by an HTTP or FTP proxy server.  For
now, then, I recommend that we stick with the resolveEntity callback,
which is simple to use and to learn, but provides 80% of the required
functionality (that's 80% in the abstract 80/20 sense).
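
To make the callback approach concrete, here is a minimal sketch of how
an application might use it to redirect well-known public identifiers
to local copies.  The interface name EntityCallback, the two-argument
signature, the String return value, and the class LocalCatalogResolver
are all assumptions made for illustration; none of them is taken from
the SAX draft.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical callback shape: the signature is an assumption for this
// sketch, not the one defined by the SAX draft.
interface EntityCallback {
    // Returns the system identifier (URI) the parser should actually read.
    String resolveEntity(String publicId, String systemId);
}

// Redirects known public identifiers to local copies and passes every
// other entity through unchanged.
class LocalCatalogResolver implements EntityCallback {
    private final Map<String, String> catalog = new HashMap<>();

    // Register a local URI for a public identifier.
    void map(String publicId, String localUri) {
        catalog.put(publicId, localUri);
    }

    @Override
    public String resolveEntity(String publicId, String systemId) {
        String local = (publicId == null) ? null : catalog.get(publicId);
        return (local != null) ? local : systemId;
    }
}
```

An entity with an unregistered or missing public identifier simply
keeps its original system identifier, so the parser's default behaviour
is preserved.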


ATTRIBUTES
----------

The good arguments and patient explanations of list members have
convinced me that java.util.Dictionary is unsuitable for three
reasons: because it is a base class rather than an interface, because
it is already marked obsolete in Java 1.2, and (most importantly)
because there is no single, obvious equivalent in other programming
languages.

So what do we do?  It is certainly tempting to introduce a new
interface for attribute resolution, and that in itself would not bloat
SAX too much, but if we did that, why not add other interfaces?  It
would be nice, for example, to have an Element interface, an Entity
interface, a PI interface, a CharacterData interface, etc., all of
which provide useful functionality; in the end, we would have
rewritten the DOM.

The alternative is to return to what I had originally done with
Ælfred, and generate a separate event for each attribute:

  public void attribute (String elementName, String aname, String value);

For example, with the following markup

  <para id="p1" level="advanced">This is a paragraph.</para>

we would have five SAX events:

  attribute:    elementName="para" aname="id" value="p1"
  attribute:    elementName="para" aname="level" value="advanced"
  startElement: name="para"
  charData:     ch="This is a paragraph."
  endElement:   name="para"

This is not pretty, but it is simple, and should translate cleanly to
all languages.  I am far from decided on this point, and encourage
further public discussion.
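
As a rough illustration of what this would mean for application code,
the sketch below buffers attribute events until the matching
startElement event arrives.  Only the attribute() signature comes from
the proposal above; the class name BufferingHandler, the other method
signatures (modelled on the event listing), and the log field are
assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical handler: collects per-attribute events and attaches them
// to the element whose startElement event follows.
class BufferingHandler {
    // Attributes seen since the last startElement, in document order.
    private final Map<String, String> pending = new LinkedHashMap<>();
    // Human-readable record of the events received (illustration only).
    final List<String> log = new ArrayList<>();

    public void attribute(String elementName, String aname, String value) {
        pending.put(aname, value);
    }

    public void startElement(String name) {
        // All attributes for this element have been reported by now.
        log.add("start " + name + " " + pending);
        pending.clear();
    }

    public void charData(String ch) {
        log.add("text " + ch);
    }

    public void endElement(String name) {
        log.add("end " + name);
    }
}
```

Feeding the five events listed above into this handler leaves three
log entries, with both attributes attached to the "para" start entry.
The cost of the per-attribute design is visible here: every
application that wants the attributes together has to do this
buffering itself.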


All the best,


David

-- 
David Megginson                 ak117 at freenet.carleton.ca
Microstar Software Ltd.         dmeggins at microstar.com
      http://home.sprynet.com/sprynet/dmeggins/
