SAX: Attributes and Entity Resolution

James Clark jjc at jclark.com
Mon Jan 5 04:59:22 GMT 1998


David Megginson wrote:

> While I agree that a full entity manager would be more powerful than a
> simple callback, I am not certain that the power will really be needed
> by most SAX users; furthermore, if it is needed, that functionality
> can be supplied more generally by an HTTP or FTP proxy server.  For
> now, then, I recommend that we stick with the resolveEntity callback,
> which is simple to use and to learn, but provides 80% of the required
> functionality (that's 80% in the abstract 80/20 sense).

I don't think the entity manager interface has to be any more
complicated than a single resolveEntity callback.  My main point is that
this doesn't belong as part of the App.  Putting separate pieces of
functionality into separate interfaces does not make things harder to
use and learn; on the contrary, it makes them easier.
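
As a rough illustration of that separation, here is a minimal Java sketch.
Every name in it (EntityResolver, DocumentHandler, ToyParser and its setter
methods) is an assumption made up for the example and is not taken from any
SAX draft; the "parser" only pretends to parse. The point is that entity
resolution can stay a single callback while living in its own interface,
apart from the application's document events.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not the actual SAX API.

/** Entity management: one callback that maps an entity to the system ID to read. */
interface EntityResolver {
    String resolveEntity(String publicId, String systemId);
}

/** Document events only; knows nothing about how entities are located. */
interface DocumentHandler {
    void startElement(String name);
    void endElement(String name);
}

/** Toy stand-in for a parser that accepts the two objects separately. */
class ToyParser {
    // Default resolver: use the system ID unchanged.
    private EntityResolver resolver = (publicId, systemId) -> systemId;
    private DocumentHandler handler;

    void setEntityResolver(EntityResolver r) { resolver = r; }
    void setDocumentHandler(DocumentHandler h) { handler = h; }

    /** Pretends to parse a one-element document; returns the system ID actually used. */
    String parse(String publicId, String systemId) {
        String resolved = resolver.resolveEntity(publicId, systemId);
        handler.startElement("para");
        handler.endElement("para");
        return resolved;
    }
}

class SeparateInterfacesDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        ToyParser parser = new ToyParser();
        // Redirect remote entities to a (hypothetical) local cache directory.
        parser.setEntityResolver((pub, sys) ->
            "file:/local/cache/" + sys.substring(sys.lastIndexOf('/') + 1));
        parser.setDocumentHandler(new DocumentHandler() {
            public void startElement(String name) { log.add("start " + name); }
            public void endElement(String name) { log.add("end " + name); }
        });
        String resolved = parser.parse(null, "http://example.org/doc.xml");
        log.add("resolved " + resolved);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

An application that does not care about entity resolution implements only
DocumentHandler and never sees resolveEntity at all.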

>   public void attribute (String elementName, String aname, String value);
> 
> For example, with the following markup
> 
>   <para id="p1" level="advanced">This is a paragraph.</para>
> 
> We would have five SAX events:
> 
>   attribute:    elementName="para" aname="id" value="p1"
>   attribute:    elementName="para" aname="level" value="advanced"
>   startElement: name="para"

Putting attributes before the start element would be seriously
confusing: in the markup the element name comes before the attributes,
and the attributes are logically part of the element. Having an
attribute callback that happens after the startElement makes some
sense.  It is to some extent arbitrary whether information is
represented as subelements or attributes; having subelements and
attributes be represented in a similar way would be consistent with
this.

I think I would also pass the element type name as an additional
argument to the attribute call, since the name of an attribute is in
general meaningful only in the context of a particular element type.

It's also useful to know when the attributes have ended and the content
is starting, and I would have a callback for this that also passed the
element type name.

This isn't pretty.
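
To make the shape of that design concrete, here is a sketch in Java of the
callbacks described above: attributes reported after startElement, each
carrying the element type name, plus a callback marking the end of the
attributes. The interface and method names (ElementHandler, endAttributes,
and so on) are hypothetical, chosen only for this example, and the events
are fired by hand rather than by a real parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface; the method names are illustrative only.
interface ElementHandler {
    void startElement(String elementName);
    void attribute(String elementName, String attName, String value);
    void endAttributes(String elementName); // attributes are done, content starts
    void charData(String text);
    void endElement(String elementName);
}

/** Records each event as a string so the order can be inspected. */
class RecordingHandler implements ElementHandler {
    final List<String> events = new ArrayList<>();

    public void startElement(String elementName) {
        events.add("startElement " + elementName);
    }
    public void attribute(String elementName, String attName, String value) {
        events.add("attribute " + elementName + " " + attName + "=" + value);
    }
    public void endAttributes(String elementName) {
        events.add("endAttributes " + elementName);
    }
    public void charData(String text) {
        events.add("charData " + text);
    }
    public void endElement(String elementName) {
        events.add("endElement " + elementName);
    }
}

class AttributeOrderDemo {
    /** Fires, by hand, the events for
     *  <para id="p1" level="advanced">This is a paragraph.</para> */
    static List<String> run() {
        RecordingHandler h = new RecordingHandler();
        h.startElement("para");
        h.attribute("para", "id", "p1");
        h.attribute("para", "level", "advanced");
        h.endAttributes("para");
        h.charData("This is a paragraph.");
        h.endElement("para");
        return h.events;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The one-element example produces six events, and a handler that wants all
the attributes together has to collect them itself between startElement and
endAttributes.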

An alternative simple approach would be to have

startElement(String elemName, String[] attNames, String[] attVals, int nAtts);

As with the charData callback, the parser would be free to mutate the
arrays once the startElement method returns, and so it would not need to
allocate two arrays for every element.
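
A consequence of letting the parser reuse the arrays is that a handler must
copy anything it wants to keep past the end of the call. The following Java
sketch illustrates this under stated assumptions: the startElement signature
is the one proposed above, but the interface name, the handler class, and
the hand-written "parser" loop are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Signature as proposed above; the interface name itself is an assumption.
interface StartElementHandler {
    void startElement(String elemName, String[] attNames, String[] attVals, int nAtts);
}

class CopyingHandler implements StartElementHandler {
    /** Safe: a private copy of the first nAtts names of each element. */
    final List<String[]> savedNames = new ArrayList<>();
    /** Unsafe: a reference to the parser's own array, kept only to show the hazard. */
    String[] unsafeRef;

    public void startElement(String elemName, String[] attNames, String[] attVals, int nAtts) {
        // Only the first nAtts entries are meaningful; the arrays may be longer.
        savedNames.add(Arrays.copyOf(attNames, nAtts));
        unsafeRef = attNames;
    }
}

class ReusedArraysDemo {
    static CopyingHandler run() {
        // The "parser" allocates one pair of arrays and reuses it for every element.
        String[] names = new String[4];
        String[] vals = new String[4];
        CopyingHandler h = new CopyingHandler();

        names[0] = "id";    vals[0] = "p1";
        names[1] = "level"; vals[1] = "advanced";
        h.startElement("para", names, vals, 2);

        // Next element: the same arrays are overwritten in place.
        names[0] = "href";  vals[0] = "x.html";
        h.startElement("a", names, vals, 1);
        return h;
    }

    public static void main(String[] args) {
        CopyingHandler h = run();
        System.out.println(Arrays.toString(h.savedNames.get(0)));
        System.out.println(h.unsafeRef[0]);
    }
}
```

The copied names for para survive the second call, while the retained
reference now shows the attributes of the later element. The nAtts argument
is what allows the arrays to be longer than the number of attributes
actually present.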

James
