SAX2/Java: Towards a final form

Lars Marius Garshol larsga at garshol.priv.no
Mon Jan 10 17:50:16 GMT 2000


* Stefan Haustein
|
| Is "no namespace" reported with a null or empty String (for interned
| Strings, the equals problem does not exist)?

* David Megginson
| 
| Empty string sounds like a reasonable suggestion when Namespace
| processing is being performed; null when it is not (so that bugs
| in code will show up sooner).

There is a problem with this: SAX filters should be able to compare
names without knowing whether namespace processing is on or not.
Allowing parts of names to be null makes this much more complicated,
since this is a comparison of two three-string tuples. So from a
filter point of view it would be much better if no part of a name
could ever be null. (I'm a bit unsure what to do with the raw name
when there is no original raw name.)
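
To make the filter problem concrete, here is a minimal sketch of the
two comparisons. NameParts and both methods are hypothetical names made
up for this illustration; nothing like them is part of SAX2.

```java
// Hypothetical holder for the three name parts discussed above.
// Not part of SAX2; for illustration only.
final class NameParts {
    final String uri;       // Namespace URI
    final String localName; // local part
    final String rawName;   // name as it appeared in the document

    NameParts(String uri, String localName, String rawName) {
        this.uri = uri;
        this.localName = localName;
        this.rawName = rawName;
    }

    // If no part of a name can ever be null, a filter can compare
    // the tuples with plain equals() calls.
    static boolean sameNameNonNull(NameParts a, NameParts b) {
        return a.uri.equals(b.uri)
            && a.localName.equals(b.localName)
            && a.rawName.equals(b.rawName);
    }

    // If parts may be null, every single comparison needs a guard.
    static boolean sameNameNullable(NameParts a, NameParts b) {
        return eq(a.uri, b.uri)
            && eq(a.localName, b.localName)
            && eq(a.rawName, b.rawName);
    }

    // Null-safe string equality: two nulls are equal, null and
    // non-null are not.
    private static boolean eq(String x, String y) {
        return x == null ? y == null : x.equals(y);
    }
}
```

A filter that uses the first form will throw a NullPointerException as
soon as it sees a name with a null part, so it would have to use the
second form everywhere unless nulls are ruled out.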

| That's a good question -- should SAX2 require that all names and
| Namespace URIs be interned (i.e. == to the results of
| java.lang.String.intern)?

This sounds like it could cause a huge performance gap between
implementations. I think the MSXML driver and the SAX1 adapter will
have to intern every name-part string that is passed to them, which I
assume would be very costly. (The alternative would be breaking
applications, unless there is a cheaper way.)

Also, many parsers already do their own interning, and SAX2 support
for these would then require either the solution above or a
(non-costly) change to the parser itself. This definitely sounds like
something that is easily forgotten, thus causing incompatibilities.


Perhaps we should instead add an optional read-only property that
returns an implementation of a StringInterner interface, which could
look something like:

  public interface StringInterner {

    public String intern(String aString);
    public void clear(); // empties all internal tables, but is not
                         // guaranteed to have an effect

  }

Support for this property would then imply that all reported name-part
strings are interned. We could also add a default implementation that
just does aString.intern(), to make life easier for the implementors.
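
A rough sketch of what such implementations might look like follows.
The interface is repeated so the example compiles on its own. The class
names DefaultStringInterner and TableStringInterner are invented here,
and the table-based variant is only an assumption about how a parser
with its own name table could plug in.

```java
import java.util.HashMap;
import java.util.Map;

// The interface proposed above, repeated for completeness.
interface StringInterner {
    String intern(String aString);
    void clear(); // not guaranteed to have an effect
}

// Default implementation: delegates to java.lang.String.intern().
final class DefaultStringInterner implements StringInterner {
    public String intern(String aString) {
        return aString.intern();
    }

    public void clear() {
        // Nothing to empty; the JVM owns the table.
    }
}

// Hypothetical variant for a parser that keeps its own table.
final class TableStringInterner implements StringInterner {
    private final Map<String, String> table = new HashMap<String, String>();

    public String intern(String aString) {
        String existing = table.get(aString);
        if (existing == null) {
            // First time this value is seen: it becomes the
            // canonical instance.
            table.put(aString, aString);
            existing = aString;
        }
        return existing;
    }

    public void clear() {
        table.clear();
    }
}
```

An application would then pass its own constant names through the
interner it got from the parser and compare reported names against
them with ==. Note that with a private table the canonical strings are
not necessarily == to the results of String.intern(), so identity
comparison is only safe between strings that came from the same
interner.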

--Lars M.

