C++ – Function inadvertently invalidates reference parameter – what went wrong

c

Today we found out the cause of a nasty bug that only happened intermittently on certain platforms. Boiled down, our code looked like this:

class Foo {
  map<string,string> m;

  void A(const string& key) {
    m.erase(key);
    cout << "Erased: " << key; // oops
  }

  void B() {
    while (!m.empty()) {
      auto toDelete = m.begin();
      A(toDelete->first);
    }
  }
}

The problem might seem obvious in this simplified case: B passes a reference to the key to A, which removes the map entry before attempting to print it. (In our case, it wasn't printed, but used in a more complicated way) This is of course undefined behavior, since key is a dangling reference after the call to erase.

Fixing this was trivial – we just changed the parameter type from const string& to string. The question is: how could we have avoided this bug in the first place? It seems both functions did the right thing:

  • A has no way of knowing that key refers to the thing it's about to destroy.
  • B could have made a copy before passing it to A, but isn't it the callee's job to decide whether to take parameters by value or by reference?

Is there some rule we failed to follow?

Best Answer

A has no way of knowing that key refers to the thing it's about to destroy.

While this is true, A does know the following things:

  1. Its purpose is to destroy something.

  2. It takes a parameter which is of the exact same type of the thing it will destroy.

Given these facts, it is possible for A to destroy its own parameter if it takes the parameter as a pointer/reference. This is not the only place in C++ where such considerations need to be addressed.

This situation is similar to how the nature of an operator= assignment operator means that you may need to be concerned about self assignment. That is a possibility because the type of this and the type of the reference parameter are the same.

It should be noted that this is only problematic because A later intends to use the key parameter after removing the entry. If it did not, then it would be fine. Of course, then it becomes easy to have everything working perfectly, then someone changes A to use key after it has potentially been destroyed.

That would be a good place for a comment.

Is there some rule we failed to follow?

In C++, you cannot operate under the assumption that if you blindly follow a set of rules, your code will be 100% safe. We cannot have rules for everything.

Consider point #2 above. A could have taken some parameter of a type different from the key, but the object itself could be a subobject of a key in the map. In C++14, find can take a type different from the key type, so long as there is a valid comparison between them. So if you do m.erase(m.find(key)), you can destroy the parameter even though the parameter's type isn't the key type.

So a rule like "if the parameter type and the key type are the same, take them by value" will not save you. You would need more information than just that.

Ultimately, you need to pay attention to your specific use cases and exercise judgment, informed by experience.