SAX C/C++ Implementations?

mlepage at mlepage at
Tue Sep 28 17:53:35 BST 1999

On Tue, Sep 28, 1999 at 12:44:49PM +0200, Steinar Bang wrote:
> >>>>> Holger Flörke <hf at>:
> > At 11:08 24.09.99 -0400, you wrote:
> >> I occasionally receive e-mail about people who've attempted SAX
> >> implementations in C or C++.  I know about IBM's and a couple of
> >> private ones, but I'd like to try to smoke out any others that might
> >> be lurking.  Please reply if you have one.
> > Based upon Expat I recently started working on a C++ implementation
> > of SAX.  Currently only the basic things (as DocumentHandler,
> > Parser, Locator and SAXException) are designed, implemented and
> > *seem* to work well.
> > I hope there will be a standardized C++ interface in the
> > future. Feel free to contact for more information.
> After looking at 
> and
> I wrote my own implementation, which like the one above and the two in 
> the URLs also includes an expat wrapper.

Expat seems particularly suited for quick wrapping in SAX.

> It's not for me to make the code public, but if anybody is
> interested, I can ask my employer if I'm allowed to.
> Since I had to make this work on some parsers without proper namespace
> support I prefixed the classes with "sax_".

You mean compilers? Sounds reasonable, but I'd like to see any C++ implementations account for the possibility of using C++ namespaces. That's what they're for!
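
To make that concrete, here is a minimal sketch of how one header might serve both kinds of compiler. The macro names (SAX_NO_NAMESPACES, SAX_DECL, SAX_NAME) and the cut-down DocumentHandler are my own invention for illustration; they are not taken from any of the implementations mentioned in this thread.

```cpp
#include <string>

// Hypothetical switch: define SAX_NO_NAMESPACES for compilers that lack
// namespace support. All macro names here are assumptions for illustration.
#ifdef SAX_NO_NAMESPACES
#  define SAX_BEGIN_NAMESPACE
#  define SAX_END_NAMESPACE
#  define SAX_DECL(name) sax_##name      // name used when declaring a class
#  define SAX_NAME(name) sax_##name      // name used by client code
#else
#  define SAX_BEGIN_NAMESPACE namespace sax {
#  define SAX_END_NAMESPACE }
#  define SAX_DECL(name) name
#  define SAX_NAME(name) sax::name
#endif

SAX_BEGIN_NAMESPACE
// Cut-down handler interface; the real SAX DocumentHandler has more callbacks.
class SAX_DECL(DocumentHandler) {
public:
    virtual ~SAX_DECL(DocumentHandler)() {}
    virtual void startElement(const std::string& name) = 0;
    virtual void endElement(const std::string& name) = 0;
};
SAX_END_NAMESPACE

// Client code is written once and compiles in either configuration.
class ElementCounter : public SAX_NAME(DocumentHandler) {
public:
    ElementCounter() : count(0) {}
    void startElement(const std::string&) { ++count; }
    void endElement(const std::string&) {}
    int count;  // number of start tags seen so far
};
```

With namespaces available the handler is sax::DocumentHandler; with SAX_NO_NAMESPACES defined it becomes sax_DocumentHandler, and ElementCounter does not change either way.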

> Like Jez Higgins I'm returning "const string&" in the attribute list,
> rather than the "const string*" of the Minion SAX implementation.  If
> I'm accessing a non-existing attribute the returned value is a
> reference to an empty string.

How can the application tell the difference between a non-existent attribute, and an empty attribute?

I can think of cases where, if not available, an application might find an attribute value using some other mechanism (e.g. ask the user). How can the XML document set the attribute to an empty string, without invoking that other behaviour?
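
A rough sketch of what I have in mind, assuming a map-backed attribute list. The class and method names (AttributeList, contains, findValue, resolveAttribute) are hypothetical and do not match any of the implementations discussed; the point is only that a reference-returning accessor needs a companion query, or a pointer-returning one, before the two cases can be told apart.

```cpp
#include <map>
#include <string>

// Hypothetical attribute list; names are illustrative only.
class AttributeList {
public:
    void add(const std::string& name, const std::string& value) {
        attrs_[name] = value;
    }

    // Separate query, so callers can tell "absent" apart from "empty".
    bool contains(const std::string& name) const {
        return attrs_.find(name) != attrs_.end();
    }

    // Reference-returning accessor: yields an empty string when the
    // attribute is absent, so on its own it cannot distinguish the cases.
    const std::string& getValue(const std::string& name) const {
        static const std::string empty;
        std::map<std::string, std::string>::const_iterator it = attrs_.find(name);
        return it != attrs_.end() ? it->second : empty;
    }

    // Pointer-returning accessor: a null pointer means "absent".
    const std::string* findValue(const std::string& name) const {
        std::map<std::string, std::string>::const_iterator it = attrs_.find(name);
        if (it == attrs_.end()) return 0;
        return &it->second;
    }

private:
    std::map<std::string, std::string> attrs_;
};

// Use the fallback (e.g. a value obtained by asking the user) only when the
// attribute is really missing; an explicitly empty attribute is kept as-is.
std::string resolveAttribute(const AttributeList& attrs,
                             const std::string& name,
                             const std::string& fallback) {
    const std::string* value = attrs.findValue(name);
    return value ? *value : fallback;
}
```

Here a document that sets title="" gets an empty title, while a document that omits the attribute triggers the fallback.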

> Wrt. the issues raised in
> I'm also using the iostream classes instead of the iowstream classes,
> and string instead of wstring (issues 2, 3, and 4).
> This is because I don't have iowstream and not full std::iostream
> compliance on any of the platforms I'm working on, and our program is
> currently using string internally.
> Right now I'm decoding UTF-8 into ISO-8859-1 and throwing away
> everything that doesn't fit.  This is not a long term strategy, but
> for now it'll do.
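
For anyone following along, the lossy decoding described above might look roughly like this. It is my own sketch, not the quoted implementation, and the function name utf8ToLatin1Lossy is made up. It assumes that characters above U+00FF and malformed or overlong sequences are dropped silently.

```cpp
#include <cstddef>
#include <string>

// Decode UTF-8 into ISO-8859-1, discarding anything that does not fit.
// Hypothetical helper; behaviour on bad input is an assumption (drop it).
std::string utf8ToLatin1Lossy(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        std::size_t len;   // total number of bytes in this sequence
        unsigned long cp;  // code point accumulated so far
        if (lead < 0x80)                { len = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
        else { ++i; continue; }          // stray continuation or invalid lead byte

        if (i + len > in.size()) break;  // truncated sequence at end of input

        bool ok = true;
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { ++i; continue; }      // malformed: skip lead byte and resync
        i += len;

        // Latin-1 covers U+0000..U+00FF only: one-byte sequences, plus
        // two-byte sequences in U+0080..U+00FF (which also rejects overlong
        // two-byte forms). Longer sequences never fit and are dropped.
        if (len == 1 || (len == 2 && cp >= 0x80 && cp <= 0xFF)) {
            out += static_cast<char>(static_cast<unsigned char>(cp));
        }
    }
    return out;
}
```

So an e-acute arrives intact as a single Latin-1 byte, while something like the euro sign simply vanishes from the output.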

That all sounds reasonable, especially for a private (protected?) implementation. My personal feeling is that as long as we stay close to SAX, it shouldn't be too hard to fall in line with a standardized version if one becomes available. It would certainly be easier than if we had just used whatever parser interface was available instead of a SAX-like one.

Still, it would be nice to work out some of these issues. Is Jez, or any of the IBM alphaWorks developers, on this list? I emailed Jez before about compilation problems, and he seemed keen on helping and on improving his work.

Marc Lepage  (aka SEGV)
RTS game programming info, Minion open source game, etc.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at
Archived as: and on CD-ROM/ISBN 981-02-3594-1
List coordinator, Henry Rzepa (mailto:rzepa at
