SAX: Byte Streams and Character Streams

James Clark jjc at
Sat Apr 18 04:50:03 BST 1998

Let's not forget other languages, in particular C and C++.  In C terms a
character stream would be a stream of wchar_t's, and a byte stream would
be a stream of char's.  It's very common to pass information around
internally in char's (i.e. UTF-8 encoded) rather than in a stream of
wchar_t's (i.e. UTF-16 encoded): for example, expat, which is being used
both in Netscape 5 and in Perl, passes data to the application in UTF-8
as a sequence of bytes, not as a sequence of wchar_t's.  Supporting only
byte streams in the C/C++ world causes no inefficiency: if you have the
data as an array of wchar_t's, you can simply cast your wchar_t* to a
char* and you get an array of UTF-16 encoded bytes.  Byte streams give
you all you need in the C/C++ world.
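
A minimal sketch of the cast described above, in C.  The byte_sink
type and the sink_write / sink_write_wide functions are hypothetical
names made up for illustration; they are not part of SAX or expat.
Note also the assumptions: the width and byte order of wchar_t depend
on the platform, so the bytes are UTF-16 only where wchar_t is a
16-bit type holding UTF-16 code units, and the receiver still has to
be told which encoding and byte order it is getting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical byte-stream sink: copies incoming bytes into a fixed
   buffer and returns the number of bytes it accepted. */
typedef struct {
    char   buf[256];
    size_t used;
} byte_sink;

static size_t sink_write(byte_sink *s, const char *data, size_t len)
{
    size_t room;

    assert(s != NULL && data != NULL);
    room = sizeof s->buf - s->used;
    if (len > room)
        len = room;               /* accept only what fits */
    memcpy(s->buf + s->used, data, len);
    s->used += len;
    return len;
}

/* Feed wide characters to the byte sink without converting them: the
   cast reinterprets the same memory as a sequence of bytes.  The byte
   count is nchars * sizeof(wchar_t), which is platform-dependent. */
static size_t sink_write_wide(byte_sink *s, const wchar_t *text,
                              size_t nchars)
{
    return sink_write(s, (const char *)text, nchars * sizeof(wchar_t));
}
```

A caller holding wide-character data would call sink_write_wide(&s,
text, wcslen(text)); a caller holding UTF-8 would call sink_write
directly.  Either way only the byte interface is needed, and no copy
or conversion happens before the data reaches it.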


