SAX: Byte Streams and Character Streams

James Clark jjc at jclark.com
Sat Apr 18 04:50:03 BST 1998


Let's not forget other languages, in particular C and C++.  In C terms a
character stream would be a stream of wchar_t's, and a byte stream would
be a stream of char's.  It's very common to pass information around
internally in char's (i.e. UTF-8 encoded) rather than in a stream of
wchar_t's (i.e. UTF-16 encoded).  For example, expat, which is being
used both in Netscape 5 and in Perl, passes data to the application in
UTF-8 as a sequence of bytes, not as a sequence of wchar_t's.
Supporting only byte streams in the C/C++ world causes no inefficiency:
if you have the data as an array of wchar_t's, you can simply cast your
wchar_t* to a char* and you get an array of UTF-16 encoded bytes.  Byte
streams give you all you need in the C/C++ world.
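
To make the cast concrete, here is a minimal sketch in C.  It is only an
illustration, not part of SAX or expat: byte_span, utf16_as_bytes and
unit_at are names made up for this example.  It uses uint16_t rather
than wchar_t so that the code units are 16 bits wide on every platform
(the width of wchar_t varies between compilers), and the bytes come out
in the machine's native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A byte-stream view over caller-owned memory (hypothetical type). */
typedef struct {
    const char *data;
    size_t len;   /* length in bytes */
} byte_span;

/* View an array of 16-bit code units as raw bytes.  No copy is made;
   the bytes appear in the machine's native byte order. */
static byte_span utf16_as_bytes(const uint16_t *units, size_t n_units)
{
    byte_span s;
    assert(units != NULL || n_units == 0);
    s.data = (const char *)units;        /* the cast described above */
    s.len = n_units * sizeof units[0];   /* two bytes per code unit */
    return s;
}

/* Read code unit i back out of the byte view (memcpy avoids any
   alignment assumptions about s.data). */
static uint16_t unit_at(byte_span s, size_t i)
{
    uint16_t u;
    assert((i + 1) * sizeof u <= s.len);
    memcpy(&u, s.data + i * sizeof u, sizeof u);
    return u;
}
```

The point of the sketch is that utf16_as_bytes does no conversion and no
copying: the consumer of the byte stream sees the same memory, and only
needs to be told that the encoding is UTF-16.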

James



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



