String interning (WAS: SAX2/Java: Towards a final form)

Tyler Baker tyler at infinet.com
Mon Jan 17 21:36:54 GMT 2000


Miles Sabin wrote:

> Tyler Baker wrote,
> > For most documents, making a call to String.intern()
> > 50-100 times in a 100KB document is a lot less expensive than
> > doing:
> >
> > if (x.equals("foo")) {
> >
> > }
> > else if (x.equals("bar")) {
> >
> > }
> > etc...
> >
> > As opposed to:
> >
> > if (x == "foo") {
> >
> > }
> > else if (x == "bar") {
> >
> > }
> > etc.
> >
> > Calling the equals method can get expensive for large case
> > statements.
>
> Sure ... so Don't Do That (tm) ...
>
> If you've got what's effectively a huge case statement then
> use a lookup table ... a HashMap or a trie of some sort or
> another. Either of those will outperform a large number of
> chained String.equals() calls or == tests.

If you are using interned strings, a HashMap becomes much faster as well. But it will
still not outperform a bunch of identity tests, as they are among the fastest operations
in Java. As someone who has written an XML parser, I can tell you, as can many of the
other people on this list, that at the application level it is much faster to write case
logic against interned strings than against non-interned ones.
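
A minimal sketch of that kind of case logic, assuming the names reaching the application
have already been interned (by the parser or by an explicit String.intern() call). The
class name, method name and constants are hypothetical and only there for illustration:

```java
// Hypothetical sketch: class, method and constant names are illustrative only.
final class InternedDispatch {
    static final int UNKNOWN = 0;
    static final int FOO = 1;
    static final int BAR = 2;

    // Assumes `name` is interned. String literals are interned by the JVM,
    // so each == below is a plain reference comparison against the pooled literal.
    static int classify(String name) {
        if (name == "foo") {
            return FOO;
        } else if (name == "bar") {
            return BAR;
        }
        return UNKNOWN;
    }

    public static void main(String[] args) {
        // new String(...) yields a distinct object, much like a name a parser
        // builds from its character buffer.
        String raw = new String("bar");
        System.out.println(classify(raw));          // not interned: identity test misses
        System.out.println(classify(raw.intern())); // interned: identity test matches
    }
}
```

The first call shows the catch: the identity tests are only correct when every string
that reaches them has been interned, so the intern() call has to happen somewhere upstream.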

Even if you choose not to do identity tests in your case logic, a HashMap or plain
String.equals() calls will still be much faster with interned strings, because
String.equals() is implemented like this:

public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if ((anObject != null) && (anObject instanceof String)) {
        String anotherString = (String)anObject;
        int n = count;
        if (n == anotherString.count) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = offset;
            int j = anotherString.offset;
            while (n-- != 0) {
                if (v1[i++] != v2[j++]) {
                    return false;
                }
            }
            return true;
        }
    }
    return false;
}

As you can see, an identity test is done right off the bat, so if you make a call like this:

if (foo.equals("x")) {

}

and foo == "x", then your only additional overhead over a straight identity test is the
dynamic method invocation of String.equals().
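
A small sketch of the two cases, assuming nothing beyond java.lang.String. The class and
method names are hypothetical. With an interned argument, equals() returns from its first
identity check. With a non-interned one, it still gives the right answer, but only after
comparing characters:

```java
// Hypothetical demo class: names are illustrative only.
final class EqualsFastPath {
    // Identity test: only meaningful when both strings are interned.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // equals() is correct whether or not the strings are interned.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String built = new String(new char[] {'x'}); // distinct object with content "x"
        String interned = built.intern();            // canonical instance, same object as the literal "x"

        System.out.println(sameReference(interned, "x")); // true
        System.out.println(sameContent(interned, "x"));   // true, decided by the identity check
        System.out.println(sameReference(built, "x"));    // false
        System.out.println(sameContent(built, "x"));      // true, decided by the character loop
    }
}
```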

Tyler


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ or CD-ROM/ISBN 981-02-3594-1
Please note: New list subscriptions now closed in preparation for transfer to OASIS.




