C++ – How to actually implement the rule of five

cc++11move-semanticsrvalue-reference

UPDATE at the bottom

q1: How would you implement the rule of five for a class that manages rather heavy resources,
but of which you want it to be passed around by value because that greatly simplifies and beautifies it's usage? Or are not all five items of the rule even needed?

In practice, I'm starting something with 3D imaging where an image is usually 128*128*128 doubles.
Being able though to write things like this would make the math alot easier:

Data a = MakeData();
Data c = 5 * a + ( 1 + MakeMoreData() ) / 3;

q2: Using a combination of copy elision / RVO / move semantics the compiler should be able to this this with a minimum of copying, no?

I tried to figure out how to do this so I started with the basics; suppose an object implementing the traditional way of implementing copy and assignment:

class AnObject
{
public:
  AnObject( size_t n = 0 ) :
    n( n ),
    a( new int[ n ] )
  {}
  AnObject( const AnObject& rh ) :
    n( rh.n ),
    a( new int[ rh.n ] )
  {
    std::copy( rh.a, rh.a + n, a );
  }
  AnObject& operator = ( AnObject rh )
  {
    swap( *this, rh );
    return *this;
  }
  friend void swap( AnObject& first, AnObject& second )
  {
    std::swap( first.n, second.n );
    std::swap( first.a, second.a );
  }
  ~AnObject()
  {
    delete [] a;
  }
private:
  size_t n;
  int* a;
};

Now enter rvalues and move semantics. As far as I can tell this would be a working implementation:

AnObject( AnObject&& rh ) :
  n( rh.n ),
  a( rh.a )
{
  rh.n = 0;
  rh.a = nullptr;
}

AnObject& operator = ( AnObject&& rh )
{
  n = rh.n;
  a = rh.a;
  rh.n = 0;
  rh.a = nullptr;
  return *this;
}

However the compiler (VC++ 2010 SP1) is not too happy with this, and compilers are usually correct:

AnObject make()
{
  return AnObject();
}

int main()
{
  AnObject a;
  a = make(); //error C2593: 'operator =' is ambiguous
}

q3: How to solve this? Going back to AnObject& operator = ( const AnObject& rh ) certainly fixes it but don't we lose a rather important optimization opportunity?

Apart from that, it's clear that the code for the move constructor and assignment is full of duplication.
So for now we forget about the ambiguity and try to solve this using copy and swap but now for rvalues.
As explained here we wouldn't even need a custom swap but instead have std::swap do all the work, which sounds very promising.
So I wrote the following, hoping std::swap would copy construct a temporary using the move constructor, then swap it with *this:

AnObject& operator = ( AnObject&& rh )
{
  std::swap( *this, rh );
  return *this;
}

But that doesn't work out and instead leads to a stack overflow due to infinite recursion since std::swap calls our operator = ( AnObject&& rh ) again. q4: Can someone provide an example of what is meant in the example then?

We can solve this by providing a second swap function:

AnObject( AnObject&& rh )
{
  swap( *this, std::move( rh ) );
}

AnObject& operator = ( AnObject&& rh )
{
  swap( *this, std::move( rh ) );
  return *this;
}

friend void swap( AnObject& first, AnObject&& second )
{
  first.n = second.n;
  first.a = second.a;
  second.n = 0;
  second.a = nullptr;
}

Now there's almost twice the amount code, however the move part of it pays of by alowing pretty cheap moving; but on the other hand the normal assignment can't benefit from copy elision anymore.
At this point I'm really confused though, and not seeing anymore what's right and wrong, so I'm hoping to get some input here..

UPDATE So it seems there are two camps:

  • one saying to skip the move assignment operator and continue doing what C++03 taught us, ie write a single assignment operator that passes the argument by value.
  • the other one saying to implement the move assignment operator (after all, it's C++11 now) and have the copy assignment operator take its argument by reference.

(ok and there's the 3rd camp telling me to use a vector, but that's sort of out of scope for this hypothetical class. Ok in real life I would use a vector, and there would be also other members, but since the move constructor/assignment are not automatically generated (yet?) the question would still hold)

Unfortunately I cannot test both implementations in a real world scenario since this project has just started and the way the data will actually flow is not known yet. So I simply implemented both of them, added counters for allocation etc and ran a couple of iterations of approx. this code, where T is one of the implementations:

template< class T >
T make() { return T( narraySize ); }

template< class T >
void assign( T& r ) { r = make< T >(); }

template< class T >
void Test()
{
  T a;
  T b;
  for( size_t i = 0 ; i < numIter ; ++i )
  {
    assign( a );
    assign( b );
    T d( a );
    T e( b );
    T f( make< T >() );
    T g( make< T >() + make< T >() );
  }
}

Either this code is not good enough to test what I'm after, or the compiler is just too smart: doesn't matter what I use for arraySize and numIter, the results for both camps are pretty much identical: same number of allocations, very slight variations in timing but no reproducable significant difference.

So unless someone can point to a better way to test this (given that the actual usage scnearios are not known yet), I'll have to conclude that it doesn't matter and hence is left to the taste of the developper. In which case I'd pick #2.

Best Answer

You've missed a significant optimization in your copy assignment operator. And subsequently the situation has gotten confused.

  AnObject& operator = ( const AnObject& rh )
  {
    if (this != &rh)
    {
      if (n != rh.n)
      {
         delete [] a;
         n = 0;
         a = new int [ rh.n ];
         n = rh.n;
      }
      std::copy(rh.a, rh.a+n, a);
    }
    return *this;
  }

Unless you really never think you'll be assigning AnObjects of the same size, this is much better. Never throw away resources if you can recycle them.

Some might complain that the AnObject's copy assignment operator now has only basic exception safety instead of strong exception safety. However consider this:

Your clients can always take a fast assignment operator and give it strong exception safety. But they can't take a slow assignment operator and make it faster.

template <class T>
T&
strong_assign(T& x, T y)
{
    swap(x, y);
    return x;
}

Your move constructor is fine, but your move assignment operator has a memory leak. It should be:

  AnObject& operator = ( AnObject&& rh )
  {
    delete [] a;
    n = rh.n;
    a = rh.a;
    rh.n = 0;
    rh.a = nullptr;
    return *this;
  }

...

Data a = MakeData();
Data c = 5 * a + ( 1 + MakeMoreData() ) / 3;

q2: Using a combination of copy elision / RVO / move semantics the compiler should be able to this this with a minimum of copying, no?

You may need to overload your operators to take advantage of resources in rvalues:

Data operator+(Data&& x, const Data& y)
{
   // recycle resources in x!
   x += y;
   return std::move(x);
}

Ultimately resources ought to be created exactly once for each Data you care about. There should be no needless new/delete just for the purpose of moving things around.