Why internalize? (was Re: SAX2: Namespace proposal)

David Megginson david at megginson.com
Tue Dec 21 14:32:12 GMT 1999

Joe Lapp <jlapp at webmethods.com> writes:

> I'm just questioning the use of intern in document APIs.  We
> use a special name object instead and force the app to select
> the appropriate name object to hand to the API. 

That's another type of interning.  It's important to remember that
while fast comparisons are a nice side-benefit, the main purpose of
interning strings or other objects is to guarantee that there is never
more than one equivalent object allocated -- otherwise, you can waste
an awful lot of memory.

To take one example, consider an attribute "security-level", allowed
for every element in a document, and with a default value, so that the
parser reports it on every element.  If your document has 5,000
elements (not an unusually large document), then without some kind of
internalization mechanism, you will end up allocating 5,000 separate
String objects for the attribute name, all with the value
"security-level".  If you internalize (somehow), then you have only
one String object (or compound Name object in Joe's case) that is
shared throughout the tree.
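
The difference can be seen in a small Java sketch.  This is only an
illustration, not code from any parser: the class InternDemo and its
methods are hypothetical, and new String(char[]) stands in for a
parser that builds a fresh name string for each element.  It counts
distinct objects by identity rather than by equals().

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;

// Hypothetical demo class; not part of SAX or any parser.
final class InternDemo {

    // Simulates a parser that creates one attribute-name String per element.
    static List<String> attributeNames(int elementCount, boolean intern) {
        char[] raw = "security-level".toCharArray();
        List<String> names = new ArrayList<>(elementCount);
        for (int i = 0; i < elementCount; i++) {
            String name = new String(raw);            // always a new object
            names.add(intern ? name.intern() : name); // intern() returns the shared copy
        }
        return names;
    }

    // Counts distinct objects by identity (==), ignoring equals().
    static int distinctObjects(List<String> names) {
        IdentityHashMap<String, Boolean> seen = new IdentityHashMap<>();
        for (String s : names) {
            seen.put(s, Boolean.TRUE);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(attributeNames(5000, false))); // prints 5000
        System.out.println(distinctObjects(attributeNames(5000, true)));  // prints 1
    }
}
```

The uninterned strings still become garbage as soon as intern() hands
back the shared copy, so a real parser would want to avoid creating
them in the first place where it can.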

Internalizing can be tricky with mutable objects, but with immutable
objects like java.lang.String, it's a big win in this problem domain.
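
For a compound name object, one way to get the same guarantee is a
pool in front of an immutable class.  The sketch below is an
assumption about how such a thing might look, not Joe's actual API:
PooledName, its fields, and the of() factory are hypothetical names.
Because the fields are final and the constructor is private, every
instance comes from the pool and can never change underneath the
other places that share it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable name object; not the webMethods or SAX API.
final class PooledName {
    final String namespaceUri;
    final String localName;

    // Pool of every name handed out so far, keyed by "{uri}local".
    private static final Map<String, PooledName> POOL = new HashMap<>();

    // Private constructor: instances can only be obtained through of().
    private PooledName(String namespaceUri, String localName) {
        this.namespaceUri = namespaceUri;
        this.localName = localName;
    }

    // Returns the single shared instance for this (uri, local) pair.
    static synchronized PooledName of(String namespaceUri, String localName) {
        String key = "{" + namespaceUri + "}" + localName;
        PooledName existing = POOL.get(key);
        if (existing == null) {
            existing = new PooledName(namespaceUri, localName);
            POOL.put(key, existing);
        }
        return existing;
    }

    public static void main(String[] args) {
        PooledName a = PooledName.of("urn:example", "security-level");
        PooledName b = PooledName.of("urn:example", "security-level");
        System.out.println(a == b); // prints true: one shared object
    }
}
```

Note that this simple pool keeps a strong reference to every name it
has ever seen, so it only suits vocabularies of bounded size.  If the
fields were mutable, changing one shared instance would silently
change the name everywhere in the tree, which is the tricky part.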

All the best,


David Megginson                 david at megginson.com

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
