NameChar (was: Editing text)

David Megginson ak117 at
Fri Nov 28 16:21:40 GMT 1997

Richard Tobin writes:

 > > The fastest solution would be to maintain a static 65,536
 > > (or at least 32,768) entry array, with bit flags for different
 > > character properties.  That would be fine for big programs, but it
 > > would kill Java applets
 > Bear in mind that the main problem of size for Java applets is the
 > time taken for downloading, rather than the memory used at runtime.
 > So it may well be practical to store the data in a compact-but-slow
 > form and use that to initialise a large-but-fast lookup table.

(I hear that memory _is_ a problem right now on Windows systems, since
both Netscape and (especially) MSIE 4 bloat to ridiculous sizes,
sometimes double or triple the typical 32MB of RAM on people's
systems; however, an extra 64k or so would make little difference).

The best optimisation will depend on your expected usage.  If, for
example, you expect that 80% of all characters would be <=0x007f, then
Tim's approach of using a bit-array for those characters and jumping
to a hairy lookup method for the rest would make sense; if, however,
you expected that some documents might be almost entirely encoded with
characters >=0x0080 (say, in Han Chinese characters), then a 64K-entry
lookup table would be necessary for acceptable performance.  If you
were keeping only one bit for each character, then you could encode a
compact lookup table in only 8K for all 65,536 characters (or 4K for
the first 32,768).
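
As a rough sketch of the one-bit-per-character table, combined with
Richard's suggestion of shipping a compact form and expanding it at
startup: the class name NameCharTable, the method isNameChar, and the
RANGES data are all hypothetical, and the ranges are a small
illustrative sample, not the real NameChar production.

```java
// Sketch only: names and range data are assumptions for illustration.
final class NameCharTable {
    // Compact-but-slow form: inclusive [start, end] pairs.
    // This is a tiny sample, NOT the full XML NameChar set.
    private static final char[] RANGES = {
        '-', '.',        // 0x2D-0x2E
        '0', ':',        // 0x30-0x3A: digits and colon
        'A', 'Z',
        '_', '_',
        'a', 'z',
        0x4E00, 0x9FA5   // sample ideographic range
    };

    // Large-but-fast form: one bit per 16-bit character,
    // 65,536 bits / 8 = 8,192 bytes.
    private static final byte[] BITS = new byte[65536 / 8];

    // Expand the compact ranges into the bit table once, at class load.
    static {
        for (int i = 0; i < RANGES.length; i += 2) {
            for (int c = RANGES[i]; c <= RANGES[i + 1]; c++) {
                BITS[c >> 3] |= (byte) (1 << (c & 7));
            }
        }
    }

    // Constant-time lookup: pick the byte, then test the bit within it.
    static boolean isNameChar(char c) {
        return (BITS[c >> 3] & (1 << (c & 7))) != 0;
    }
}
```

Keeping several property flags per character instead of one bit would
mean widening each entry (for example to a full byte), which is where
the 64K figure comes from.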

All the best,


David Megginson                 ak117 at
Microstar Software Ltd.         dmeggins at

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at