AttributeList in C++ SAX

Steinar Bang sb at metis.no
Tue Oct 5 09:28:27 BST 1999


[I'm aware that I should have used wstring, but I'm putting off that
 debate for now]

As someone pointed out to me on Friday, using const string& as the
return type is dangerous. Take this interface:

class sax_AttributeList {
public:
    enum AttrType {
        undefined,
        CDATA,
        ...
    };

    virtual ~sax_AttributeList();

    virtual int getLength() const = 0;
    virtual const string& getName(int i) const = 0;
    virtual AttrType getType(int i) const = 0;
    virtual const string& getValue(int i) const = 0;
    virtual AttrType getType(const string& name) const = 0;
    virtual const string& getValue(const string& name) const = 0;
};

Before you know it you'll be tempted to assign the return values to
local string& variables to get shorter syntax, i.e. like this:
        ...
        string& aname = empty();
        for (int i=0; i<atts->getLength(); ++i) {
            aname = atts->getName(i);
            ...
A reference cannot be rebound after it has been initialized, so the
assignment copies the contents of the return value of getName into
whatever empty() returned, which is probably not the effect you're
looking for.
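
A minimal sketch of the pitfall, outside of SAX. The functions
shared_empty() and get_name() are hypothetical stand-ins for empty()
and getName(i) above, not part of any proposed interface:

```cpp
#include <string>

// Hypothetical stand-in for empty(): returns a reference to one
// shared string object.
std::string& shared_empty() {
    static std::string s;
    return s;
}

// Hypothetical stand-in for getName(i).
const std::string& get_name() {
    static const std::string name("href");
    return name;
}

// Reports what the shared object holds after the assignment.
std::string assign_through_reference() {
    std::string& aname = shared_empty();  // aname is bound here, once
    aname = get_name();                   // copies "href" INTO the shared object
    return shared_empty();
}

// Binding a const reference at the point of declaration avoids both
// the copy and the overwrite.
bool bind_const_reference() {
    const std::string& aname = get_name();
    return &aname == &get_name();         // same object, nothing copied
}
```

After assign_through_reference() the shared "empty" string is no
longer empty, and every other user of it sees the change.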

This leaves us with the alternatives of returning const string*:

class sax_AttributeList {
public:
    enum AttrType {
        undefined,
        CDATA,
        ...
    };

    virtual ~sax_AttributeList();

    virtual int getLength() const = 0;
    virtual const string* getName(int i) const = 0;
    virtual AttrType getType(int i) const = 0;
    virtual const string* getValue(int i) const = 0;
    virtual AttrType getType(const string& name) const = 0;
    virtual const string* getValue(const string& name) const = 0;
};

(returning a NULL pointer if not successful) or, alternatively, we can
return an error code and pass a reference that receives the result as
a function argument, i.e.:

class sax_AttributeList {
public:
    enum AttrType {
        undefined,
        CDATA,
        ...
    };

    virtual ~sax_AttributeList();

    virtual int getLength() const = 0;
    virtual bool getName(int i, string& name) const = 0;
    virtual bool getType(int i, AttrType& t) const = 0;
    virtual bool getValue(int i, string& val) const = 0;
    virtual bool getType(const string& name, AttrType& typ) const = 0;
    virtual bool getValue(const string& name, string& val) const = 0;
};

In both cases we'll end up assigning to a local variable and doing
some testing:
    ...
    const string* s;
    s = atts.getName(1);
    if (s) {
        ...
and 
    ...
    string name;
    if (atts.getName(1,name)) {
        ...
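
To compare the two calling styles side by side, here is a small
sketch. The class simple_attribute_names and the two helper functions
are hypothetical and exist only for this illustration; a real
implementation would derive from sax_AttributeList and handle types
and values as well:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical vector-backed list that stores names only.
class simple_attribute_names {
public:
    void add(const std::string& name) { names_.push_back(name); }

    int getLength() const { return static_cast<int>(names_.size()); }

    // Pointer style: a NULL pointer signals an out-of-range index.
    const std::string* getName(int i) const {
        if (i < 0 || i >= getLength()) return 0;
        return &names_[static_cast<std::size_t>(i)];
    }

    // Out-parameter style: false signals an out-of-range index and
    // leaves name untouched.
    bool getName(int i, std::string& name) const {
        if (i < 0 || i >= getLength()) return false;
        name = names_[static_cast<std::size_t>(i)];
        return true;
    }

private:
    std::vector<std::string> names_;
};

// Caller code for the pointer style.
std::string name_via_pointer(const simple_attribute_names& atts, int i) {
    const std::string* s = atts.getName(i);
    return s ? *s : std::string("<none>");
}

// Caller code for the out-parameter style.
std::string name_via_out_param(const simple_attribute_names& atts, int i) {
    std::string name;
    return atts.getName(i, name) ? name : std::string("<none>");
}
```

Note that the pointer style only works if the implementation keeps a
string object alive to point at; the out-parameter style has no such
requirement.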

Even though it is a departure from the Java and (probably) Python
implementations, I much prefer the latter alternative, because:
 1. it supports very late UTF-8 decoding and copying from char* into
    the string class (if I use expat there has to be at least one such
    copy)
 2. I can avoid building some of the data structures in the expat
    attribute list wrapper, minimizing the need for the mutable hack
    (casting this to a non-const "that" pointer in const functions
    that do lazy evaluation and caching of variables)
 3. I let the caller handle string object creation and destruction;
    again this minimizes copying (both the gcc and MSVC++ strings do
    shallow copying with deep copy on write, but others, e.g. the SGI
    string, always do a deep copy) and eases memory management
 4. the bool return value can be changed to an enum that reports
    more exact errors, e.g. NoTypeSupport, OutOfBounds