Slowness of JDK 1.1.x String.intern() [was Re: SAX, Java, and Namespaces ]

David Brownell db at Eng.Sun.COM
Fri Feb 12 07:05:54 GMT 1999

Tim Bray wrote:
> At 10:12 AM 2/5/99 -0800, Jeff Greif wrote:
> >JDK 1.1.7 intern is native, but is slow because it first converts the
> >characters in the string [to a canonical form]

No comment ... that's not my code ... ;-)

> Actually, the real reason that most XML parsers will *never* use
> built-in intern is because they probably have the name available in a
> character array, and can go look things up in the handcrafted
> table without String-i-fying it - thus skipping several steps
> of work that a built-in intern is going to have to do.  E.g. Lark's
> symbol table is a double array, storing both the character-array
> and String version of each name - you look up based on the
> character array and return the string if it's already there.  The
> point is that you call new String() only once per unique name.

This gives "per-parse" uniqueness, which is valuable to a fair
degree beyond the performance win of avoiding allocating a new String.

However, Sun's package currently goes one step further and actually
interns that string.  It's such a small cost (on top of the cost
to check that array-to-string cache in the first place) that it's
barely measurable.  (Anyone try "java -Xrunhprof:cpu=samples ..." on
JDK 1.2/SPARC?) 
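
For anyone who wants to see the shape of such an array-to-string cache,
here is a minimal sketch. It is not Sun's code and not Lark's: the names
NameTable, lookup and stringsCreated are made up for illustration, and it
uses a small chained hash table rather than Lark's double array. The
optional intern() call happens only on a cache miss, so it is paid once
per unique name.

```java
// Hypothetical name table (illustration only, not any real parser's code).
// It maps a slice of the parser's character buffer to one shared String,
// building that String only the first time a given name is seen.
final class NameTable {
    // One entry per unique name: a private copy of the characters, the
    // String built from them, and the next entry in the same bucket.
    private static final class Entry {
        final char[] chars;
        final String name;
        final Entry next;

        Entry(char[] chars, String name, Entry next) {
            this.chars = chars;
            this.name = name;
            this.next = next;
        }
    }

    private final Entry[] buckets = new Entry[256];
    private final boolean internNames;
    private int stringsCreated;

    // internNames == false gives per-parse uniqueness only;
    // internNames == true also gives per-VM uniqueness.
    NameTable(boolean internNames) {
        this.internNames = internNames;
    }

    // Looks the name up directly from the character buffer, so no String
    // is allocated when the name is already in the table.
    String lookup(char[] buf, int offset, int length) {
        int hash = 0;
        for (int i = 0; i < length; i++) {
            hash = 31 * hash + buf[offset + i];
        }
        int index = (hash & 0x7fffffff) % buckets.length;

        for (Entry e = buckets[index]; e != null; e = e.next) {
            if (matches(e.chars, buf, offset, length)) {
                return e.name;
            }
        }

        // Cache miss: this is the only place a String is created.
        String name = new String(buf, offset, length);
        stringsCreated++;
        if (internNames) {
            name = name.intern();
        }
        buckets[index] = new Entry(name.toCharArray(), name, buckets[index]);
        return name;
    }

    private static boolean matches(char[] stored, char[] buf, int offset, int length) {
        if (stored.length != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (stored[i] != buf[offset + i]) {
                return false;
            }
        }
        return true;
    }

    // Number of Strings allocated so far (one per unique name).
    int stringsCreated() {
        return stringsCreated;
    }
}
```

Looking up the start and end tag names of "<title>text</title>" through one
such table returns the same String object both times and allocates only
one String.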

That provides "per-VM" uniqueness which has turned out to be handy
for things like stylesheet processing -- comparing strings in the
stylesheet and source document is quite fast, and that does add
up to a performance difference in template matching. 
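
A small sketch of why that works, assuming both the stylesheet names and
the document names have been through intern(). The class and method names
here (InternedMatch, nameFrom, sameName) are hypothetical, not part of any
real stylesheet processor.

```java
// Hypothetical illustration of per-VM uniqueness: two names built from
// different character arrays become the same object once interned, so
// they can be compared by reference instead of character by character.
final class InternedMatch {
    // Builds a String from raw characters and interns it, as a parser's
    // name table might do on a cache miss.
    static String nameFrom(char[] chars) {
        return new String(chars).intern();
    }

    // Identity comparison. Only valid when both arguments are known to be
    // interned; otherwise fall back to equals().
    static boolean sameName(String patternName, String elementName) {
        return patternName == elementName;
    }

    public static void main(String[] args) {
        String fromStylesheet = nameFrom("chapter".toCharArray());
        String fromDocument = nameFrom("chapter".toCharArray());
        String notInterned = new String("chapter".toCharArray());

        System.out.println(sameName(fromStylesheet, fromDocument)); // true
        System.out.println(sameName(fromStylesheet, notInterned));  // false
        System.out.println(fromStylesheet.equals(notInterned));     // true
    }
}
```

The second line of output is the catch: the reference comparison is only
safe if every name on both sides really was interned.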

- Dave
