C++ – Storing Objects by Value or Reference in Container Classes

I'm new to C++, coming from Java.

In Java, all variables (except for primitives) are essentially pointers. They hold the address of whatever they're 'holding'.

So any Java data structure stores its data by reference. You can also store by value, i.e. save and return a copy of any item you store, but that takes extra work and isn't native to the language.

For example, the collections ArrayList, HashSet, and a simple array all store the addresses of the items they 'store', and not the actual items.

In C++, however, you have a choice: when implementing a container class, you can store items, and return them to the user, either by value or by reference.

For example, here's a simple Stack class I wrote (omitted irrelevant stuff):

template <typename T> class Stack {
public:
    Stack(...) : ... { }

    void push(const T& item) {
        if(size == capacity - 1)
            enlargeArray();

        data[indexToInsert++] = &item;
        size++;
    }

    const T& pop() {
        const T& item = *data[indexToInsert - 1];
        data[indexToInsert - 1] = 0;
        indexToInsert--;
        size--;
        return item;
    }

    int getSize() const {
        return size;
    }
private:
    const T** data;
    int indexToInsert;
    int size;
    int capacity;

    void enlargeArray() {
        // omitted
    }
};

This data structure takes and returns data by reference. push takes a const reference, and pop returns a const reference. The backing array is an array of pointers, not objects.

However push could also look like so:

    void push(T item) {
        if(size == capacity - 1)
            enlargeArray();

        data[indexToInsert++] = item;
        size++;
    }

And pop could return a T rather than a const T&, with the backing array holding objects instead of pointers, etc.

My question is: what is the preferred approach in C++? Is there a preferred approach? Which approach should I normally take when implementing 'container' classes?

Best Answer

Firstly, you probably shouldn't implement a container class. 95% of the time you should use one included in the standard library. If you just want to learn, or are in the 5%, carry on.

If you are defining a template, leave the decision up to your users. Your users can use:

Stack<Foo> if they want by value.
Stack<Foo*> if they want by pointer.
Stack<std::unique_ptr<Foo>> if they want pointers that clean up after themselves.

When choosing which to use, you should default to by value, unless you've got a good reason to do something different. Inside your stack class, just store everything by value. If the user of the template needs indirection, they can pass a pointer type as T.
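As a rough sketch of that advice (not the asker's class): the container below always stores T by value, and the caller picks the indirection through T. The names ValueStack, Foo and exampleUsage are made up for illustration, and std::vector is assumed as the backing store in place of the hand-rolled array. It needs C++14 for std::make_unique.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stack that always stores T by value.
template <typename T>
class ValueStack {
public:
    // Copy an lvalue into the container.
    void push(const T& item) { data_.push_back(item); }

    // Move from an rvalue; required for move-only types such as std::unique_ptr.
    void push(T&& item) { data_.push_back(std::move(item)); }

    // Return the top element by value, so the caller owns the result.
    T pop() {
        assert(!data_.empty());  // popping an empty stack is a caller error
        T top = std::move(data_.back());
        data_.pop_back();
        return top;
    }

    std::size_t getSize() const { return data_.size(); }

private:
    std::vector<T> data_;
};

// Hypothetical element type.
struct Foo {
    int id;
};

// Exercises the three instantiations listed above and returns the sum of the popped ids.
int exampleUsage() {
    ValueStack<Foo> byValue;
    Foo a{1};
    byValue.push(a);  // stores a copy of a
    a.id = 99;        // does not affect the stored copy

    Foo b{2};
    ValueStack<Foo*> byPointer;
    byPointer.push(&b);  // copies only the pointer; b must outlive its use

    ValueStack<std::unique_ptr<Foo>> owning;
    owning.push(std::make_unique<Foo>(Foo{3}));  // the stack now owns this Foo

    return byValue.pop().id + byPointer.pop()->id + owning.pop()->id;
}
```

The container code is identical in all three cases. Only the Stack<Foo*> user has to think about lifetimes, and that is a choice they made explicitly.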

Looking at your code:

void push(const T& item) {
    if(size == capacity - 1)
        enlargeArray();

    data[indexToInsert++] = &item;
    size++;
}

You can't do that. &item stores the address of whatever was passed in, but you have no idea how long that address will remain valid. It could become invalid right after push returns, for example if the caller passed a temporary or a local variable that then goes out of scope. In that case, you've stored a pointer to an invalid location. In general, you can't assume that a pointer remains valid. You should copy the item instead.
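To illustrate the lifetime problem without invoking undefined behaviour, here is a small sketch in which push copies its argument. CopyingStack and lifetimeExample are hypothetical names, and std::vector is again assumed as the backing store. Both arguments are gone by the time pop is called, which is exactly the situation where a stored &item would dangle.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stack whose push copies the argument instead of storing &item.
template <typename T>
class CopyingStack {
public:
    // The stored copy lives as long as the stack, regardless of the argument's lifetime.
    void push(const T& item) { data_.push_back(item); }

    T pop() {
        assert(!data_.empty());  // popping an empty stack is a caller error
        T top = data_.back();
        data_.pop_back();
        return top;
    }

private:
    std::vector<T> data_;
};

std::string lifetimeExample() {
    CopyingStack<std::string> stack;

    {
        std::string local = "short-lived";
        stack.push(local);
    }  // local is destroyed here; a stored &local would now dangle

    stack.push(std::string("temporary"));  // the temporary dies at the end of this statement

    std::string second = stack.pop();  // last pushed, first popped
    std::string first = stack.pop();
    return first + "," + second;
}
```

Because the stack holds its own copies, the result is well defined even though neither original object exists any more.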
