I just finished listening to the Software Engineering radio podcast interview with Scott Meyers regarding C++0x. Most of the new features made sense to me, and I am actually excited about C++0x now, with the exception of one. I still don't get move semantics… What is it exactly?
C++ – move semantics
cc++-faqc++11move-semantics
Related Solutions
A pointer can be re-assigned:
int x = 5; int y = 6; int *p; p = &x; p = &y; *p = 10; assert(x == 5); assert(y == 10);
A reference cannot be re-bound, and must be bound at initialization:
int x = 5; int y = 6; int &q; // error int &r = x;
A pointer variable has its own identity: a distinct, visible memory address that can be taken with the unary
&
operator and a certain amount of space that can be measured with thesizeof
operator. Using those operators on a reference returns a value corresponding to whatever the reference is bound to; the reference’s own address and size are invisible. Since the reference assumes the identity of the original variable in this way, it is convenient to think of a reference as another name for the same variable.int x = 0; int &r = x; int *p = &x; int *p2 = &r; assert(p == p2); // &x == &r assert(&p != &p2);
You can have arbitrarily nested pointers to pointers offering extra levels of indirection. References only offer one level of indirection.
int x = 0; int y = 0; int *p = &x; int *q = &y; int **pp = &p; **pp = 2; pp = &q; // *pp is now q **pp = 4; assert(y == 4); assert(x == 2);
A pointer can be assigned
nullptr
, whereas a reference must be bound to an existing object. If you try hard enough, you can bind a reference tonullptr
, but this is undefined and will not behave consistently./* the code below is undefined; your compiler may optimise it * differently, emit warnings, or outright refuse to compile it */ int &r = *static_cast<int *>(nullptr); // prints "null" under GCC 10 std::cout << (&r != nullptr ? "not null" : "null") << std::endl; bool f(int &r) { return &r != nullptr; } // prints "not null" under GCC 10 std::cout << (f(*static_cast<int *>(nullptr)) ? "not null" : "null") << std::endl;
You can, however, have a reference to a pointer whose value is
nullptr
.Pointers can iterate over an array; you can use
++
to go to the next item that a pointer is pointing to, and+ 4
to go to the 5th element. This is no matter what size the object is that the pointer points to.A pointer needs to be dereferenced with
*
to access the memory location it points to, whereas a reference can be used directly. A pointer to a class/struct uses->
to access its members whereas a reference uses a.
.References cannot be put into an array, whereas pointers can be (Mentioned by user @litb)
Const references can be bound to temporaries. Pointers cannot (not without some indirection):
const int &x = int(12); // legal C++ int *y = &int(12); // illegal to take the address of a temporary.
This makes
const &
more convenient to use in argument lists and so forth.
UPDATE
This answer is rather old, and so describes what was 'good' at the time, which was smart pointers provided by the Boost library. Since C++11, the standard library has provided sufficient smart pointers types, and so you should favour the use of std::unique_ptr
, std::shared_ptr
and std::weak_ptr
.
There was also std::auto_ptr
. It was very much like a scoped pointer, except that it also had the "special" dangerous ability to be copied — which also unexpectedly transfers ownership.
It was deprecated in C++11 and removed in C++17, so you shouldn't use it.
std::auto_ptr<MyObject> p1 (new MyObject());
std::auto_ptr<MyObject> p2 = p1; // Copy and transfer ownership.
// p1 gets set to empty!
p2->DoSomething(); // Works.
p1->DoSomething(); // Oh oh. Hopefully raises some NULL pointer exception.
OLD ANSWER
A smart pointer is a class that wraps a 'raw' (or 'bare') C++ pointer, to manage the lifetime of the object being pointed to. There is no single smart pointer type, but all of them try to abstract a raw pointer in a practical way.
Smart pointers should be preferred over raw pointers. If you feel you need to use pointers (first consider if you really do), you would normally want to use a smart pointer as this can alleviate many of the problems with raw pointers, mainly forgetting to delete the object and leaking memory.
With raw pointers, the programmer has to explicitly destroy the object when it is no longer useful.
// Need to create the object to achieve some goal
MyObject* ptr = new MyObject();
ptr->DoSomething(); // Use the object in some way
delete ptr; // Destroy the object. Done with it.
// Wait, what if DoSomething() raises an exception...?
A smart pointer by comparison defines a policy as to when the object is destroyed. You still have to create the object, but you no longer have to worry about destroying it.
SomeSmartPtr<MyObject> ptr(new MyObject());
ptr->DoSomething(); // Use the object in some way.
// Destruction of the object happens, depending
// on the policy the smart pointer class uses.
// Destruction would happen even if DoSomething()
// raises an exception
The simplest policy in use involves the scope of the smart pointer wrapper object, such as implemented by boost::scoped_ptr
or std::unique_ptr
.
void f()
{
{
std::unique_ptr<MyObject> ptr(new MyObject());
ptr->DoSomethingUseful();
} // ptr goes out of scope --
// the MyObject is automatically destroyed.
// ptr->Oops(); // Compile error: "ptr" not defined
// since it is no longer in scope.
}
Note that std::unique_ptr
instances cannot be copied. This prevents the pointer from being deleted multiple times (incorrectly). You can, however, pass references to it around to other functions you call.
std::unique_ptr
s are useful when you want to tie the lifetime of the object to a particular block of code, or if you embedded it as member data inside another object, the lifetime of that other object. The object exists until the containing block of code is exited, or until the containing object is itself destroyed.
A more complex smart pointer policy involves reference counting the pointer. This does allow the pointer to be copied. When the last "reference" to the object is destroyed, the object is deleted. This policy is implemented by boost::shared_ptr
and std::shared_ptr
.
void f()
{
typedef std::shared_ptr<MyObject> MyObjectPtr; // nice short alias
MyObjectPtr p1; // Empty
{
MyObjectPtr p2(new MyObject());
// There is now one "reference" to the created object
p1 = p2; // Copy the pointer.
// There are now two references to the object.
} // p2 is destroyed, leaving one reference to the object.
} // p1 is destroyed, leaving a reference count of zero.
// The object is deleted.
Reference counted pointers are very useful when the lifetime of your object is much more complicated, and is not tied directly to a particular section of code or to another object.
There is one drawback to reference counted pointers — the possibility of creating a dangling reference:
// Create the smart pointer on the heap
MyObjectPtr* pp = new MyObjectPtr(new MyObject())
// Hmm, we forgot to destroy the smart pointer,
// because of that, the object is never destroyed!
Another possibility is creating circular references:
struct Owner {
std::shared_ptr<Owner> other;
};
std::shared_ptr<Owner> p1 (new Owner());
std::shared_ptr<Owner> p2 (new Owner());
p1->other = p2; // p1 references p2
p2->other = p1; // p2 references p1
// Oops, the reference count of of p1 and p2 never goes to zero!
// The objects are never destroyed!
To work around this problem, both Boost and C++11 have defined a weak_ptr
to define a weak (uncounted) reference to a shared_ptr
.
Related Topic
- C++ – What does the explicit keyword mean
- C++ – the “–>” operator in C/C++
- C++ – the copy-and-swap idiom
- C++ – std::move(), and when should it be used
- C++ – What are rvalues, lvalues, xvalues, glvalues, and prvalues
- C++ – The Rule of Three
- C++ – What are the basic rules and idioms for operator overloading
- C++ – a lambda expression in C++11
Best Answer
I find it easiest to understand move semantics with example code. Let's start with a very simple string class which only holds a pointer to a heap-allocated block of memory:
Since we chose to manage the memory ourselves, we need to follow the rule of three. I am going to defer writing the assignment operator and only implement the destructor and the copy constructor for now:
The copy constructor defines what it means to copy string objects. The parameter
const string& that
binds to all expressions of type string which allows you to make copies in the following examples:Now comes the key insight into move semantics. Note that only in the first line where we copy
x
is this deep copy really necessary, because we might want to inspectx
later and would be very surprised ifx
had changed somehow. Did you notice how I just saidx
three times (four times if you include this sentence) and meant the exact same object every time? We call expressions such asx
"lvalues".The arguments in lines 2 and 3 are not lvalues, but rvalues, because the underlying string objects have no names, so the client has no way to inspect them again at a later point in time. rvalues denote temporary objects which are destroyed at the next semicolon (to be more precise: at the end of the full-expression that lexically contains the rvalue). This is important because during the initialization of
b
andc
, we could do whatever we wanted with the source string, and the client couldn't tell a difference!C++0x introduces a new mechanism called "rvalue reference" which, among other things, allows us to detect rvalue arguments via function overloading. All we have to do is write a constructor with an rvalue reference parameter. Inside that constructor we can do anything we want with the source, as long as we leave it in some valid state:
What have we done here? Instead of deeply copying the heap data, we have just copied the pointer and then set the original pointer to null (to prevent 'delete[]' from source object's destructor from releasing our 'just stolen data'). In effect, we have "stolen" the data that originally belonged to the source string. Again, the key insight is that under no circumstance could the client detect that the source had been modified. Since we don't really do a copy here, we call this constructor a "move constructor". Its job is to move resources from one object to another instead of copying them.
Congratulations, you now understand the basics of move semantics! Let's continue by implementing the assignment operator. If you're unfamiliar with the copy and swap idiom, learn it and come back, because it's an awesome C++ idiom related to exception safety.
Huh, that's it? "Where's the rvalue reference?" you might ask. "We don't need it here!" is my answer :)
Note that we pass the parameter
that
by value, sothat
has to be initialized just like any other string object. Exactly how isthat
going to be initialized? In the olden days of C++98, the answer would have been "by the copy constructor". In C++0x, the compiler chooses between the copy constructor and the move constructor based on whether the argument to the assignment operator is an lvalue or an rvalue.So if you say
a = b
, the copy constructor will initializethat
(because the expressionb
is an lvalue), and the assignment operator swaps the contents with a freshly created, deep copy. That is the very definition of the copy and swap idiom -- make a copy, swap the contents with the copy, and then get rid of the copy by leaving the scope. Nothing new here.But if you say
a = x + y
, the move constructor will initializethat
(because the expressionx + y
is an rvalue), so there is no deep copy involved, only an efficient move.that
is still an independent object from the argument, but its construction was trivial, since the heap data didn't have to be copied, just moved. It wasn't necessary to copy it becausex + y
is an rvalue, and again, it is okay to move from string objects denoted by rvalues.To summarize, the copy constructor makes a deep copy, because the source must remain untouched. The move constructor, on the other hand, can just copy the pointer and then set the pointer in the source to null. It is okay to "nullify" the source object in this manner, because the client has no way of inspecting the object again.
I hope this example got the main point across. There is a lot more to rvalue references and move semantics which I intentionally left out to keep it simple. If you want more details please see my supplementary answer.