SAX2: Namespace Processing and NSUtils helper class

David Megginson david at megginson.com
Wed Dec 15 14:55:07 GMT 1999


OK, back to SAX2 for now.  I'm doing some serious projects with RDF
and Namespaces right now, so I've done a lot of thinking about how we
can make SAX2 Namespace processing both efficient and
backwards-compatible.

I'm pretty sure that the best choice is to use James Clark's
{URI}localpart notation for Namespace-qualified names, so that an
XHTML <p> element (for example) will be reported as
"{http://www.w3.org/1999/xhtml}p".

Unfortunately, that creates some potential inefficiencies, especially
for Java, which is painfully slow at string processing (compared to
C/C++).  To work around this problem, I've designed a new SAX2 helper
class, NSUtils, with the following static methods:

  public static boolean isQualified (String name)
  public static String [] splitName (String name)
  public static String joinName (String uri, String local)

The first of these is very simple -- it just checks whether the first
character is '{' (as it always must be for a qualified name).  The
other two, however, use static hashtables to cache their work, so that 
they're pretty efficient to call over and over again.
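
To make the first check concrete, here is a minimal sketch of what it
might look like.  This is not the NSUtils source; the class name
QualifiedNameCheck is hypothetical and only there so the fragment
compiles on its own.

```java
// Hypothetical stand-in for the isQualified check described above;
// the class name is illustrative and not part of SAX2.
final class QualifiedNameCheck {

    private QualifiedNameCheck() {}

    // A name in {URI}localpart form always begins with '{', so a
    // single prefix test is enough; no scan of the whole string is needed.
    static boolean isQualified(String name) {
        return name.startsWith("{");
    }
}
```

Under this sketch, "{http://www.w3.org/1999/xhtml}p" counts as
qualified and a bare "p" does not.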

For example, the first time you call

  splitName("{http://www.w3.org/1999/xhtml}p")

the method will use java.lang.String.indexOf and
java.lang.String.substring to pick out the URI part
"http://www.w3.org/1999/xhtml" and the local part "p" and will return
them as a two-member String array, which it will also store in a
Hashtable.

The next time (or 1,000 times) you call

  splitName("{http://www.w3.org/1999/xhtml}p")

the method will find the string already in the hash table and will
return the same two-member array that it returned last time (or should
it be a copy?  I wish Java had const) without repeating any of the
expensive string operations.
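
The caching idea can be sketched roughly as follows.  This is an
illustration under my own assumptions, not the actual helper class:
the class name SplitNameSketch is hypothetical, an unqualified name is
assumed to come back with an empty URI, and a name with '{' but no
'}' is assumed to be an error.

```java
import java.util.Hashtable;

// Hypothetical sketch of a cached splitName; names and error handling
// are assumptions, not the SAX2 helper itself.
final class SplitNameSketch {

    // Maps the full "{URI}local" string to its [uri, local] pair.
    private static final Hashtable<String, String[]> CACHE = new Hashtable<>();

    private SplitNameSketch() {}

    static String[] splitName(String name) {
        String[] parts = CACHE.get(name);
        if (parts != null) {
            return parts; // cache hit: no indexOf or substring work
        }
        if (name.startsWith("{")) {
            int close = name.indexOf('}');
            if (close < 0) {
                throw new IllegalArgumentException("Missing '}' in " + name);
            }
            parts = new String[] {
                name.substring(1, close),   // URI part
                name.substring(close + 1)   // local part
            };
        } else {
            // Assumption: an unqualified name has an empty URI part.
            parts = new String[] { "", name };
        }
        CACHE.put(name, parts);
        return parts;
    }
}
```

In this sketch every caller gets the same array instance, so one
caller overwriting an element would corrupt the cache for everyone
else.  Returning parts.clone() would avoid that at the cost of one
small allocation per call.  Two threads may also both miss and compute
the same pair, which is harmless here because the results are equal.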

I use a similar approach for joinName(), which makes writing a
NamespaceFilter extremely efficient.
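
One possible shape for the joinName cache is sketched below.  Again,
this is a guess at the technique rather than the real code: the class
name JoinNameSketch is hypothetical, and the two-level table (URI
first, then local part) is my own choice so that a cache hit never has
to concatenate strings just to build a lookup key.  It assumes a
non-empty URI.

```java
import java.util.Hashtable;

// Hypothetical sketch of a cached joinName; the two-level table layout
// is an assumption, not taken from the SAX2 helper.
final class JoinNameSketch {

    // Outer key: Namespace URI.  Inner key: local part.
    // Value: the joined "{URI}local" string.
    private static final Hashtable<String, Hashtable<String, String>> CACHE =
        new Hashtable<>();

    private JoinNameSketch() {}

    static String joinName(String uri, String local) {
        Hashtable<String, String> byLocal = CACHE.get(uri);
        if (byLocal == null) {
            byLocal = new Hashtable<>();
            CACHE.put(uri, byLocal);
        }
        String joined = byLocal.get(local);
        if (joined == null) {
            // Only a cache miss pays for the concatenation.
            joined = "{" + uri + "}" + local;
            byLocal.put(local, joined);
        }
        return joined;
    }
}
```

Repeated calls with the same URI and local part return the same
String instance.  The check-then-put steps are not atomic, so two
threads can occasionally both build the same name; the strings are
equal either way, so the worst case is a little duplicated work.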

Does this sound like a reasonable approach to the Java-heads out
there?  I'll send the source out in a separate message, since it's
only three screens or so.


Thanks, and all the best,


David

-- 
David Megginson                 david at megginson.com
           http://www.megginson.com/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1




