String interning (WAS: SAX2/Java: Towards a final form)

Tim Bray tbray at
Thu Jan 13 19:38:05 GMT 2000

At 06:04 PM 1/13/00 -0000, Miles Sabin wrote:
>Anyhow, maybe the waters are getting a bit muddied. I'm
>assuming that all parsers will do interning of one sort or
>another internally. The issue for me is how much of that gets
>exposed via the SAX API. I don't want java-interning exposed,
>because that means my parser has no option but to use

Yes.   Given that *every* credible parser does this, and that it's
a major convenience for programmers using the API to be able to compare
strings with ==, there is at some level an argument that we ought to
expose this fact.
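
To make the convenience concrete, here is a minimal sketch of what == buys
the API user. The class and method names are hypothetical and are not part
of SAX; the only real API used is String.intern(). It assumes the parser
hands back interned names. A string literal such as "para" is itself
interned by the Java language, so an identity comparison against it works
only if the other side is interned too.

```java
// Hypothetical handler helper, for illustration only (not a SAX class).
final class InternCompare {
    // Identity comparison: correct only if elementName was interned,
    // because the literal "para" is always the interned instance.
    static boolean isPara(String elementName) {
        return elementName == "para";
    }

    public static void main(String[] args) {
        String fresh = new String("para");           // same characters, distinct object
        System.out.println(isPara(fresh));           // false: not interned
        System.out.println(isPara(fresh.intern()));  // true: canonical instance
    }
}
```

Without a guarantee from the parser, the caller has to fall back on
equals(), which is the point of contention in this thread.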

I'd go further; based on having written a parser, it seems to me that
the only sane tactic is for the parser to use String.intern(), but only
once for each unique name, with some sort of internal char[] or
equivalent table.  If this is true, it's an even stronger argument for 
just saying "element types and attribute names coming out of the
parser are intern()ed, period".  
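
A rough sketch of the tactic described above, under stated assumptions: the
NameTable class, its lookup() method and the fixed bucket count are all
hypothetical, not taken from any actual parser. The idea is that names are
looked up straight from the parser's char[] buffer, and String.intern() is
called only the first time a given name is seen.

```java
// Hypothetical name table: one String.intern() call per distinct name.
final class NameTable {
    private static final class Entry {
        final char[] chars;   // copy of the name's characters
        final int hash;
        final String name;    // the canonical, interned String
        final Entry next;
        Entry(char[] chars, int hash, String name, Entry next) {
            this.chars = chars; this.hash = hash; this.name = name; this.next = next;
        }
    }

    // Fixed-size chained table; a real parser would probably resize it.
    private final Entry[] buckets = new Entry[256];
    private int internCalls = 0;

    // Return the interned String for buf[start .. start+len-1].
    String lookup(char[] buf, int start, int len) {
        int h = 0;
        for (int i = start; i < start + len; i++) h = 31 * h + buf[i];
        int idx = (h & 0x7fffffff) % buckets.length;
        for (Entry e = buckets[idx]; e != null; e = e.next) {
            if (e.hash == h && matches(e.chars, buf, start, len)) return e.name;
        }
        // First sighting of this name: build the String and intern it once.
        String name = new String(buf, start, len).intern();
        internCalls++;
        buckets[idx] = new Entry(name.toCharArray(), h, name, buckets[idx]);
        return name;
    }

    private static boolean matches(char[] stored, char[] buf, int start, int len) {
        if (stored.length != len) return false;
        for (int i = 0; i < len; i++) {
            if (stored[i] != buf[start + i]) return false;
        }
        return true;
    }

    int internCalls() { return internCalls; }
}
```

On a repeat occurrence of a name, lookup() neither allocates a String nor
calls intern(), yet what it returns can still be compared with == against
any other interned string, including literals.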

However, I would be totally against making this an optional feature
that the parser can decline to support, because then the value-add 
to the SAX customer goes down the toilet IMHO.

>But I'd much prefer it if the SAX API didn't expose any
>interning behaviour at all. I think we agree on that?

I think we're *arguing* about that... I don't detect agreement yet. -T.

