SAX2: NSUtils.java

Miles Sabin msabin at cromwellmedia.co.uk
Wed Dec 15 17:19:56 GMT 1999


David Megginson wrote,
> I'm attaching a copy of the Java source code for the (short) 
> NSUtils class that I described in the last message.  I'd be 
> very grateful if the Java specialists on the list could look 
> this over, paying special attention to synchronization 
> problems.

I fear there are some big problems here. In particular,

  public final class NSUtils
  {
    private static QName qName;

    public static String joinName
      (String uriPart, String localPart)
    {
	qName.uri = uriPart;
	qName.local = localPart;
	String name = (String)joinNameTable.get(qName);
      
      // ... etc ...
    }
  }

Is rather nastily thread-unsafe: the shared qName could be
read/written by multiple threads in joinName(). You should
either synchronize this method, or create a new QName locally.
There are other less important problems here too ... this one 
leapt out at me immediately.

To be honest tho', I don't think it's likely to be worth the 
effort of finding and fixing them all. Despite the rumours about 
Javas poor string handling, it's really not all that bad, and 
it's quite likely that your attempts at optimization will do more 
harm than good, bearing in mind that any shared caches will 
involve either synchronization or replication (with object 
creation as a consequence). At the very least you should do some 
benchmarks to see whether there's any gain from going down this 
route ... what was it Knuth said about optimization? ;-)

Why not just do something like this,

  public class QName
  {
    private String itsURIPart;
    private String itsLocalPart;

    public QName(String uName)
    {
      if(uName.charAt(0) == '{')
      {
        int endPos = uName.lastIndexOf('}');
        if(endPos == -1)
          throw new IllegalArgumentException
            ("Malformed Namespace name: " + uName);
		
        itsURIPart = uName.substring(1, endPos);
        itsLocalPart = uName.substring(endPos+1);
      }
      else
      {
        itsURIPart = null;
        itsLocalPart = uName;
      }
    }

    public QName(String uriPart, String localPart)
    {
      if(localPart == null)
        throw new IllegalArgumentException();

      itsURIPart = uriPart;
      itsLocalPart = localPart;
    }

    public String getURIPart()
    {
      return itsURIPart;
    }

    public void setURIPart(String uriPart)
    {
      itsURIPart = uriPart;
    }

    public String getLocalPart()
    {
      return itsLocalPart;
    }

    public void setLocalPart(String localPart)
    {
      if(localPart == null)
        throw new IllegalArgumentException();

      itsLocalPart = localPart;
    }

    public boolean equals(Object other)
    {
      if(other instanceof QName)
        return equals((QName)other);
      else if(other instanceof String)
        return equals(new QName((String)other));
      else
        return false;
    }

    public boolean equals(QName other)
    {
      return
        !itsLocalPart.equals(other.itsLocalPart) &&
        (itsURIPart == other.itsURIPart ||
         (itsURIPart != null && itsURIPart.equals(other.itsURIPart)));
    }

    public int hashCode()
    {
      return
        (itsURIPart != null ? itsURIPart.hashCode() : 0) ^
        itsLocalPart.hashCode();
    }

    public String toString()
    {
      return toString(uriPart, localPart);
    }

    public static String toString(String uriPart, String localPart)
    {
	if(itsURIPart != null)
      {
        StringBuffer buffer = new StringBuffer();
        buffer.append('{');
        buffer.append(uriPart);
        buffer.append('}');
        buffer.append(localPart);

        return buffer.toString();
      }
      else
        return itsLocalPart;
    }
  }

As is traditional, this is untried and untested ;-)

If someone wants to split a name they simply constuct a
new QName ... no more expensive than constucting a new String[]
to hold the 2 parts, and rather more convenient.

Here a some ideas for possible optimizations _if_ benchmarking
suggests that they're needed,

1. The universal name built in toString() or passed in via the
   one arg constructor could be cached (and invalidated if
   either of the set methods are called).

2. The hashcode could be cached (and invalidated if either of
   the set methods are called).

3. The StringBuffer.append()'s in toString() could be wrapped in
   a synchronized block on the buffer, ie.,

   synchronized(buffer)
   {
     buffer.append(...);
     buffer.append(...);
     //etc.
   }

   This gives a marginal speedup over repeated grabbing and
   releasing of the buffers monitor (which is done internally to
   StringBuffer.append()).

4. Selected methods or the whole class could be marked as final.

5. Strings could be intern'ed (note that intern'ing a String is
   a comparatively expensive operation involving a JVM internal
   hash lookup, so shouldn't be done willy nilly).

6. Some class-scope caching could be used (ie. something like
   the caching your NSUtils does, only mapping between Strings
   and QNames rather than between Strings and String[]s).

I'm very doubtful that it'd be worth going down below (4).

Cheers,


Miles

-- 
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin at cromwellmedia.com          http://www.cromwellmedia.com/



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)





More information about the Xml-dev mailing list