Request for Discussion: SAX 1.0 in C++

John Aldridge john.aldridge at informatix.co.uk
Tue Dec 14 12:56:45 GMT 1999


At 13:10 14/12/99 +0100, Steinar Bang <sb at metis.no> wrote:
>>>>>> John Aldridge <john.aldridge at informatix.co.uk>:
>
>> Why?  What's wrong with storing UTF-16 encoded data in a 32 bit
>> wchar_t?  I know it uses more storage space; but there won't
>> typically be that much data around in this format at once.
>
>We store a lot of strings, so I think a quadrupling of the storage
>space compared with what we do today, or doubling wrt. to UTF-16, will
>be significant.

I'm guessing that this will be fairly unusual, though.  I suspect that most
clients of such a streaming interface will be processing the data on the
fly, and not hanging on to large chunks of it for the duration of the
program run.

Of course, you don't have to store the strings in your data structures in
the same format as they are passed to you from SAX.
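
To illustrate that last point, here is a minimal sketch of converting at the boundary. It assumes the handler receives UTF-16 code units in a std::wstring and that the application prefers to keep UTF-8 in its own data structures. The function name toUtf8 is hypothetical and not part of any SAX binding; unpaired surrogates and out-of-range values are replaced with U+FFFD.

```cpp
#include <string>

// Hypothetical helper (not part of SAX): converts UTF-16 code units held
// in a std::wstring into a UTF-8 std::string for more compact storage.
std::string toUtf8(const std::wstring &in)
{
    std::string out;
    for (std::wstring::size_type i = 0; i < in.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(in[i]);

        // Combine a high surrogate with a following low surrogate.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            unsigned long lo = static_cast<unsigned long>(in[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }

        // Unpaired surrogates and out-of-range values become U+FFFD.
        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            cp = 0xFFFD;

        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

A handler could call this once per string it wants to keep, so the wide representation only exists for the duration of the callback.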

>> I'd much rather have the format defined to be wstring (or wchar_t*, if you
>> must, but that's another debate), because of the compatibility with wide
>> string literals.
>
>Hm... I don't know anything about wide string literals and their
>behaviour wrt. to wstring, text editors and debuggers.  Could you
>elaborate, maybe...?

Brief summary:

    L'a'   is a wchar_t containing the character 'a'
    L"abc" is a wchar_t[] containing the characters 'a', 'b', 'c', '\0'

basic_string<wchar_t> (aka wstring) has constructors and comparison
operators and the like which take wchar_t* arguments.
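
As a small sketch of those constructors and comparison operators in use (isParagraph and checkWideLiterals are hypothetical names chosen for illustration, not part of any SAX interface):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reports whether an element name is "Paragraph".
bool isParagraph(const std::wstring &name)
{
    // operator== has an overload taking a const wchar_t*, so the wide
    // literal is compared directly.
    return name == L"Paragraph";
}

// Exercises the wchar_t* constructor and comparison overloads.
void checkWideLiterals()
{
    wchar_t c = L'a';          // a single wide character
    std::wstring s = L"abc";   // constructed from a const wchar_t*
    assert(s.size() == 3);     // the terminating L'\0' is not counted
    assert(s[0] == c);
    assert(s == L"abc");
    assert(isParagraph(L"Paragraph"));
    assert(!isParagraph(L"Para"));
}
```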

It seems to me that code like:

void DocumentHandler::startElement (
    const std::wstring &name, const AttributeList &atts)
{
    if (name == L"Paragraph") ...
}

is going to be a whole lot neater than

void DocumentHandler::startElement (
    const std::basic_string<SAXChar> &name, const AttributeList &atts)
{
    static const SAXChar paraString[] =
        {'P','a','r','a','g','r','a','p','h','\0'};
    if (name == paraString) ...
}
-- 
Cheers,
John





