Object Lifetime Invariants vs. Move Semantics in C++

cc++11invariantsobject-oriented-design

When I learned C++ long ago, it was strongly emphasized to me that part of the point of C++ is that just like loops have "loop-invariants", classes also have invariants associated to the lifetime of the object — things that should be true for as long the object is alive. Things that should be established by the constructors, and preserved by the methods. Encapsulation / access control is there to help you enforce the invariants. RAII is one thing that you can do with this idea.

Since C++11 we now have move semantics. For a class that supports moving, moving from an object does not formally end its life-time — the move is supposed to leave it in some "valid" state.

In designing a class, is it bad practice if you design it so that the invariants of the class are only preserved up to the point that it is moved from? Or is that okay if it will allow you to make it go faster.

To make it concrete, suppose I have a non-copyable but moveable resource type like so:

class opaque {
  opaque(const opaque &) = delete;

public:
  opaque(opaque &&);

  ...

  void mysterious();
  void mysterious(int);
  void mysterious(std::vector<std::string>);
};

And for whatever reason, I need to make a copyable wrapper for this object, so that it can be used, perhaps in some existing dispatch system.

class copyable_opaque {
  std::shared_ptr<opaque> o_;

  copyable_opaque() = delete;
public:
  explicit copyable_opaque(opaque _o)
    : o_(std::make_shared<opaque>(std::move(_o)))
  {}

  void operator()() { o_->mysterious(); }
  void operator()(int i) { o_->mysterious(i); }
  void operator()(std::vector<std::string> v) { o_->mysterious(v); }
};

In this copyable_opaque object, an invariant of the class established at construction is that the member o_ always points to a valid object, since there is no default ctor, and the only ctor that isn't a copy ctor guarantees these. All of the operator() methods assume that this invariant holds, and preserve it afterwards.

However, if the object is moved from, then o_ will then point to nothing. And after that point, calling any of the methods operator() will cause UB / a crash.

If the object is never moved from, then the invariant will be preserved right up to the dtor call.

Let's suppose that hypothetically, I wrote this class, and months later, my imaginary coworker experienced UB because, in some complicated function where lots of these objects were being shuffled around for some reason, he moved from one of these things and later called one of its methods. Clearly it's his fault at the end of the day, but is this class "poorly designed?"

Thoughts:

  1. It's usually bad form in C++ to create zombie objects that explode if you touch them.
    If you can't construct some object, can't establish the invariants, then throw an exception from the ctor. If you can't preserve the invariants in some method, then signal an error somehow and roll-back. Should this be different for moved-from objects?

  2. Is it enough to just document "after this object has been moved from, it is illegal (UB) to do anything with it other than destroy it" in the header?

  3. Is it better to continually assert that it is valid in each method call?

Like so:

class copyable_opaque {
  std::shared_ptr<opaque> o_;

  copyable_opaque() = delete;
public:
  explicit copyable_opaque(opaque _o)
    : o_(std::make_shared<opaque>(std::move(_o)))
  {}

  void operator()() { assert(o_); o_->mysterious(); }
  void operator()(int i) { assert(o_); o_->mysterious(i); }
  void operator()(std::vector<std::string> v) { assert(o_); o_->mysterious(v); }
};

The assertions don't substantially improve the behavior, and they cause a slow-down. If your project does use the "release build / debug build" scheme, rather than just always running with assertions, I guess this is more attractive, since you don't pay for the checks in the release build. If you don't actually have debug builds, this seems quite unattractive though.

  1. Is it better to make the class copyable, but not movable?
    This also seems bad and causes a performance hit, but it solves the "invariant" issue in a straightforward way.

What would you consider to be the relevant "best practices" here?

Best Answer

It's usually bad form in C++ to create zombie objects that explode if you touch them.

But that's not what you're doing. You're creating a "zombie object" that will explode if you touch it wrongly. Which is ultimately no different than any other state-based precondition.

Consider the following function:

void func(std::vector<int> &v)
{
  v[0] = 5;
}

Is this function safe? No; the user can pass an empty vector. So the function has a de-facto precondition that v has at least one element in it. If it does not, then you get UB when you call func.

So this function is not "safe". But that doesn't mean it's broken. It is only broken if the code using it violates the precondition. Maybe func is a static function used as a helper in the implementation of other functions. Localized in such a way, nobody would call it in a way that violates its preconditions.

Many functions, whether namespace-scoped or class members, will have expectations on the state of a value they operate on. If these preconditions are not met, then the functions will fail, typically with UB.

The C++ standard library defines a "valid-but-unspecified" rule. This says that, unless the standard says otherwise, every object which is moved from will be valid (it is a legal object of that type), but the specific state of that object is not specified. How many elements does a moved-from vector have? It doesn't say.

This means that you cannot call any function which has any precondition. vector::operator[] has the precondition that the vector has at least one element. Since you don't know the state of the vector, you cannot call it. It would be no better than calling func without first verifying that the vector is not empty.

But this also means that functions which do not have preconditions are fine. This is perfectly legal C++11 code:

vector<int> v1 = {1, 2, 3, 4, 5};
vector<int> v2{std::move(v1)};
v1.assign({6, 7, 8, 9, 10});

vector::assign has no preconditions. It will work with any valid vector object, even one that has been moved from.

So you're not creating an object that is broken. You're creating an object who's state is unknown.

If you can't construct some object, can't establish the invariants, then throw an exception from the ctor. If you can't preserve the invariants in some method, then signal an error somehow and roll-back. Should this be different for moved-from objects?

Throwing exceptions from a move constructor is generally considered... rude. If you move an object that owns memory, then you are transferring ownership of that memory. And that doesn't usually involve anything that can throw.

Sadly, we can't enforce this for various reasons. We have to accept that throwing-move is a possibility.

It should also be noted that you don't have to follow the "valid-yet-unspecified" language. That's simply the way the C++ standard library says that movement for standard types works by default. Certain standard library types have more strict guarantees. For example, unique_ptr is very clear about the state of a moved-from unique_ptr instance: it is equal to nullptr.

So you can choose to provide a stronger guarantee if you wish.

Just remember: movement is a performance optimization, one which is usually being done on objects that are about to be destroyed. Consider this code:

vector<int> func()
{
  vector<int> v;
  //fill up `v`.
  return v;
}

This will move from v into the return value (assuming the compiler doesn't elide it). And there is no way to reference v after the move has completed. So whatever work you did to put v into a useful state is meaningless.

In most code, the likelihood of using a moved-from object instance is low.

Is it enough to just document "after this object has been moved from, it is illegal (UB) to do anything with it other than destroy it" in the header?

Is it better to continually assert that it is valid in each method call?

The whole point of having preconditions is to not check such things. operator[] has a precondition that the vector have an element with the given index. You get UB if you try to access outside the size of the vector. vector::at does not have such a precondition; it explicitly throws an exception if the vector does not have such a value.

Preconditions exist for performance reasons. They're so that you don't have to check things that the caller could have verified for themselves. Every call to v[0] doesn't have to check if v is empty; only the first one does.

Is it better to make the class copyable, but not movable?

No. In fact, a class should never be "copyable but not movable". If it can be copied, then it should be able to be moved by calling the copy constructor. This is the standard behavior of C++11 if you declare a user-defined copy constructor but don't declare a move constructor. And it's the behavior you should adopt if you don't want to implement special move semantics.

Move semantics exist to solve a very specific problem: dealing with objects that have large resources where copying would be prohibitively expensive or meaningless (ie: file handles). If your object doesn't qualify, then copying and moving are the same to you.

Related Topic