SAX and Unicode question

Tim Bray tbray at
Tue Jan 6 00:49:53 GMT 1998

At 05:36 PM 05/01/98 -0700, Matthew J. Evans wrote:
>How is SAX going to handle Unicode, especially sending 16-bit chars 
>(UTF-16) to callback functions? Sending void*'s and/or char*'s in the 
>callbacks will leave the application and/or parser guessing what was sent. 
>Sending byte order marks in every string seems rather impractical, 
>especially since UTF-16 can have null bytes making most string objects 
>useless anyway.

SAX is a Java interface.  Thus the Strings and chars and so on are
all 16-bit-only; the parser will have taken care of all the BOMs
and encoding jiggery-pokery and so on.
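
To make that concrete, here is a rough sketch of the kind of decoding a parser would do before any callback is invoked. The class and method names (DecodeSketch, decode) are made up for illustration and are not part of SAX; a real parser would also work out the encoding from the XML declaration rather than assume UTF-16.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the decoding step inside a parser; not SAX API.
class DecodeSketch {
    // Java's "UTF-16" charset reads the byte order mark, picks the byte
    // order from it, and leaves the BOM out of the decoded String.
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // Little-endian BOM (FF FE) followed by "hi"; note the zero bytes.
        byte[] raw = {(byte) 0xFF, (byte) 0xFE, 'h', 0, 'i', 0};
        String text = decode(raw);
        System.out.println(text.length()); // prints 2
        System.out.println(text);          // prints hi
    }
}
```

The application only ever sees the String (or a char[] slice), so the null bytes and byte order of the original stream never reach the callback.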

On the IDL end of things, not sure what the right way to do it is. -Tim
