C++ – Function inadvertently invalidates reference parameter – what went wrong

Today we found out the cause of a nasty bug that only happened intermittently on certain platforms. Boiled down, our code looked like this:

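#include <iostream>
#include <map>
#include <string>

class Foo {
  std::map<std::string, std::string> m;

  void A(const std::string& key) {
    m.erase(key);                       // destroys the entry, including its key
    std::cout << "Erased: " << key;     // oops: key may refer to the destroyed entry
  }

  void B() {
    while (!m.empty()) {
      auto toDelete = m.begin();
      A(toDelete->first);               // passes a reference into the map entry
    }
  }
};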

The problem might seem obvious in this simplified case: B passes A a reference to the key, and A removes the map entry before attempting to print that key. (In our case it wasn't printed, but used in a more complicated way.) This is of course undefined behavior, since key is a dangling reference after the call to erase.

Fixing this was trivial – we just changed the parameter type from const string& to string. The question is: how could we have avoided this bug in the first place? It seems both functions did the right thing:
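For reference, the by-value fix could look roughly like the sketch below. It makes a few assumptions that are not in our real code: the members are public, and a hypothetical `log` stream stands in for cout so the result can be inspected. `eraseAllAndReport` is likewise just an illustrative helper.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

class Foo {
public:
    std::map<std::string, std::string> m;
    std::ostringstream log;  // hypothetical stand-in for cout

    // Taking key by value gives A its own copy, which survives the erase.
    void A(std::string key) {
        m.erase(key);
        log << "Erased: " << key << '\n';  // safe: key is A's own object
    }

    void B() {
        while (!m.empty()) {
            auto toDelete = m.begin();
            A(toDelete->first);  // the key is copied before the entry is destroyed
        }
    }
};

// Hypothetical helper: fills a Foo, empties it through B, and returns the log.
std::string eraseAllAndReport() {
    Foo foo;
    foo.m = {{"a", "1"}, {"b", "2"}};
    foo.B();
    assert(foo.m.empty());
    return foo.log.str();
}
```

The cost is one string copy per call, which we were happy to pay here.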

A has no way of knowing that key refers to the thing it's about to destroy.
B could have made a copy before passing it to A, but isn't it the callee's job to decide whether to take parameters by value or by reference?

Is there some rule we failed to follow?

Best Answer

A has no way of knowing that key refers to the thing it's about to destroy.

While this is true, A does know the following things:

Its purpose is to destroy something.
It takes a parameter of exactly the same type as the thing it will destroy.

Given these facts, it is possible for A to destroy its own parameter if it takes the parameter as a pointer/reference. This is not the only place in C++ where such considerations need to be addressed.

This situation is similar to operator=, where the nature of assignment means you may need to be concerned about self-assignment. That is a possibility because the type of *this and the type of the reference parameter are the same.
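As an illustration of the self-assignment analogy, here is a minimal sketch. The `Buffer` class and the `selfAssignThenRead` helper are hypothetical and exist only for this example; the point is the guard at the top of operator=.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

class Buffer {
public:
    explicit Buffer(std::size_t n, int fill = 0) : size_(n), data_(new int[n]) {
        std::fill(data_, data_ + size_, fill);
    }
    Buffer(const Buffer& other) : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }
    ~Buffer() { delete[] data_; }

    Buffer& operator=(const Buffer& other) {
        if (this == &other) return *this;  // guard: other may be *this
        delete[] data_;                    // without the guard, this would also destroy other.data_
        data_ = new int[other.size_];
        size_ = other.size_;
        std::copy(other.data_, other.data_ + size_, data_);
        return *this;
    }

    int at(std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    std::size_t size_;
    int* data_;
};

// Hypothetical helper: assigns a Buffer to itself through an alias, then reads it.
int selfAssignThenRead() {
    Buffer b(3, 7);
    Buffer& alias = b;
    b = alias;  // self-assignment; harmless only because of the guard
    return b.at(2);
}
```

Just as with A, the danger comes from the parameter referring to the very object the function is about to tear down.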

Note that this is only a problem because A uses the key parameter after removing the entry. If it did not, the code would be fine. Of course, that makes it easy to have everything working perfectly until someone changes A to use key after it has potentially been destroyed.

That would be a good place for a comment.

Is there some rule we failed to follow?

In C++, you cannot operate under the assumption that if you blindly follow a set of rules, your code will be 100% safe. We cannot have rules for everything.

Consider the second point above. A could have taken a parameter of a type different from the key, yet that object could still be a subobject of a key in the map. Since C++14, find can take a type different from the key type when the map uses a transparent comparator (such as std::less<>), so long as there is a valid comparison between the two types. So if you do m.erase(m.find(key)), you can destroy the parameter even though the parameter's type isn't the key type.

So a rule like "if the parameter type and the key type are the same, take them by value" will not save you. You would need more information than just that.
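The heterogeneous-lookup case can be sketched as follows. This is only an illustration under assumed names: `Map`, `eraseAndReport`, and `aliasingDemo` are hypothetical, and a `const char*` stands in for "a type different from the key type". The parameter is not a std::string, yet it can still point into the key that is about to be erased, so the sketch copies what it needs first.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// std::less<> is a transparent comparator, which enables heterogeneous find (C++14).
using Map = std::map<std::string, std::string, std::less<>>;

// Copies what it still needs before the entry (and possibly the storage
// behind key) is destroyed.
std::string eraseAndReport(Map& m, const char* key) {
    assert(key != nullptr);
    auto it = m.find(key);     // compares std::string against const char* directly
    if (it == m.end()) return "Not found";
    std::string saved(key);    // own copy, taken while key is still valid
    m.erase(it);               // if key pointed into it->first, it dangles from here on
    return "Erased: " + saved;
}

// Hypothetical demo: the argument aliases the key even though its type differs.
std::string aliasingDemo() {
    Map m{{"alpha", "1"}};
    const char* view = m.begin()->first.c_str();  // points into the map's key
    return eraseAndReport(m, view);
}
```

Using `key` instead of `saved` after the erase would be the same bug as in the question, and no type-based rule would have flagged it.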

Ultimately, you need to pay attention to your specific use cases and exercise judgment, informed by experience.

Related Solutions

C++ Pointers – Is Storing a Pass-by-Reference Parameter as a Pointer Bad Practice?

I'd say yes, it is common to take a reference and store it as a pointer (I have seen it in multiple projects), and yes, it is bad, because it hides the problem of object lifetime.

And the const cast is unnecessary and completely horrible. At the very least it should be a ref that is passed in, not a const ref.

As you suggest, one way of improving it would be to have two different functions: one that takes a const ref and copies (setting the copy flag at the same time), and another that takes a pointer, so the user is forced to explicitly request the non-copying behaviour.

Better would be to have two methods with different names (maybe CopyValue and ReferenceValue), and better still would be to have a base class with two subclasses that implement the two behaviours.
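A rough sketch of the two-named-methods idea is below. The `ValueHolder` class and its members are hypothetical (the original class isn't shown here); only the method names CopyValue and ReferenceValue come from the suggestion above.

```cpp
#include <cassert>
#include <memory>
#include <string>

class ValueHolder {  // hypothetical class for illustration
public:
    // Copying variant: the holder owns its own string, so the caller's
    // object may go away afterwards.
    void CopyValue(const std::string& v) {
        owned_ = std::make_unique<std::string>(v);
        ptr_ = owned_.get();
    }

    // Non-copying variant: the pointer parameter makes it explicit at the
    // call site that the caller must keep *v alive.
    void ReferenceValue(const std::string* v) {
        assert(v != nullptr);
        owned_.reset();
        ptr_ = v;
    }

    const std::string& get() const {
        assert(ptr_ != nullptr);
        return *ptr_;
    }

    bool ownsValue() const { return owned_ != nullptr; }

private:
    std::unique_ptr<std::string> owned_;   // set only by CopyValue
    const std::string* ptr_ = nullptr;     // points at owned_ or at the caller's object
};

// Hypothetical helper: the copy outlives the object it was made from.
std::string copySurvivesOriginal() {
    ValueHolder h;
    {
        std::string temp = "short-lived";
        h.CopyValue(temp);
    }  // temp is destroyed here
    return h.get();
}
```

With this shape, a call such as `h.ReferenceValue(&s)` shows the lifetime dependency right where it is created.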

C++ – Storing Objects by Value or Reference in Container Classes

Firstly, you probably shouldn't implement a container class. 95% of the time you should use one included in the standard library. If you just want to learn, or are in the 5%, carry on.

If you are defining a template, leave the decision up to your users. Your users can use:

Stack<Foo> if they want by value.
Stack<Foo*> if they want by pointer.
Stack<std::unique_ptr<Foo>> if they want pointers that clean up after themselves.

When choosing which to use, you should default to by value, unless you've got a good reason to do something different. Inside your stack class, just store everything by value. If the user of the template needs indirection, they can use a pointer type for T.
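The three choices above can be sketched with one by-value template. This is a minimal illustration, not your class: the `Stack` here is a hypothetical wrapper around std::vector, and `Foo` and `sumOfThreeStacks` are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

template <typename T>
class Stack {
public:
    // Always stores a T by value; what T is decides the semantics.
    void push(T item) { data_.push_back(std::move(item)); }

    T pop() {
        assert(!data_.empty());
        T top = std::move(data_.back());
        data_.pop_back();
        return top;
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};

struct Foo { int id; };  // hypothetical element type

// Hypothetical helper: one stack per storage choice.
int sumOfThreeStacks() {
    Foo shared{2};

    Stack<Foo> byValue;                    // owns copies
    byValue.push(Foo{1});

    Stack<Foo*> byPointer;                 // caller keeps `shared` alive
    byPointer.push(&shared);

    Stack<std::unique_ptr<Foo>> owning;    // pointers that clean up after themselves
    owning.push(std::make_unique<Foo>(Foo{3}));

    return byValue.pop().id + byPointer.pop()->id + owning.pop()->id;
}
```

The template itself never needs to know about indirection; the user opts in through T.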

Looking at your code:

void push(const T& item) {
    if(size == capacity - 1)
        enlargeArray();

    data[indexToInsert++] = &item;
    size++;
}

You can't do that. &item records a pointer to whatever was passed in, but you have no idea how long that pointer will remain valid. It could become invalid right after push finishes, in which case you've stored a pointer to an invalid location. In general, you can't assume that a pointer remains valid. You should copy the item instead.
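A copying version of push might look something like the sketch below. The rest of your class isn't shown, so the member layout, the growth policy in enlargeArray, and the `top` accessor are assumptions; the separate indexToInsert counter is folded into size for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

template <typename T>
class Stack {
public:
    Stack() : data(new T[capacity]) {}
    ~Stack() { delete[] data; }
    Stack(const Stack&) = delete;             // copying omitted to keep the sketch short
    Stack& operator=(const Stack&) = delete;

    void push(const T& item) {
        if (size == capacity)
            enlargeArray();
        data[size++] = item;  // copies the value; no address of item is kept
    }

    const T& top() const {
        assert(size > 0);
        return data[size - 1];
    }

private:
    // Assumed growth policy: double the array and move the elements over.
    void enlargeArray() {
        std::size_t newCapacity = capacity * 2;
        T* bigger = new T[newCapacity];
        for (std::size_t i = 0; i < size; ++i)
            bigger[i] = std::move(data[i]);
        delete[] data;
        data = bigger;
        capacity = newCapacity;
    }

    std::size_t capacity = 4;
    std::size_t size = 0;
    T* data;
};

// Hypothetical helper: the pushed object is destroyed, the stored copy is not.
std::string pushTemporaryThenRead() {
    Stack<std::string> s;
    {
        std::string shortLived = "hello";
        s.push(shortLived);
    }  // shortLived is gone; the stack holds its own copy
    return s.top();
}
```

Because the element is copied, the stack's contents no longer depend on how long the caller's object lives.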
