C++98 and C++03
This answer is for the older versions of the C++ standard. The C++11 and C++14 versions of the standard do not formally contain 'sequence points'; operations are 'sequenced before' or 'unsequenced' or 'indeterminately sequenced' instead. The net effect is essentially the same, but the terminology is different.
Disclaimer : Okay. This answer is a bit long. So have patience while reading it. If you already know these things, reading them again won't make you crazy.
Pre-requisites : An elementary knowledge of C++ Standard
What are Sequence Points?
The Standard says
At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations
shall be complete and no side effects of subsequent evaluations shall have taken place. (§1.9/7)
Side effects? What are side effects?
Evaluation of an expression produces something and if in addition there is a change in the state of the execution environment it is said that the expression (its evaluation) has some side effect(s).
For example:
int x = y++; //where y is also an int
In addition to the initialization operation the value of y
gets changed due to the side effect of ++
operator.
So far so good. Moving on to sequence points. An alternation definition of seq-points given by the comp.lang.c author Steve Summit
:
Sequence point is a point in time at which the dust has settled and all side effects which have been seen so far are guaranteed to be complete.
What are the common sequence points listed in the C++ Standard ?
Those are:
at the end of the evaluation of full expression (§1.9/16
) (A full-expression is an expression that is not a subexpression of another expression.)1
Example :
int a = 5; // ; is a sequence point here
in the evaluation of each of the following expressions after the evaluation of the first expression (§1.9/18
) 2
a && b (§5.14)
a || b (§5.15)
a ? b : c (§5.16)
a , b (§5.18)
(here a , b is a comma operator; in func(a,a++)
,
is not a comma operator, it's merely a separator between the arguments a
and a++
. Thus the behaviour is undefined in that case (if a
is considered to be a primitive type))
at a function call (whether or not the function is inline), after the evaluation of all function arguments (if any) which
takes place before execution of any expressions or statements in the function body (§1.9/17
).
1 : Note : the evaluation of a full-expression can include the evaluation of subexpressions that are not lexically
part of the full-expression. For example, subexpressions involved in evaluating default argument expressions (8.3.6) are considered to be created in the expression that calls the function, not the expression that defines the default argument
2 : The operators indicated are the built-in operators, as described in clause 5. When one of these operators is overloaded (clause 13) in a valid context, thus designating a user-defined operator function, the expression designates a function invocation and the operands form an argument list, without an implied sequence point between them.
What is Undefined Behaviour?
The Standard defines Undefined Behaviour in Section §1.3.12
as
behavior, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements 3.
Undefined behavior may also be expected when this
International Standard omits the description of any explicit definition of behavior.
3 : permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or with-
out the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
In short, undefined behaviour means anything can happen from daemons flying out of your nose to your girlfriend getting pregnant.
What is the relation between Undefined Behaviour and Sequence Points?
Before I get into that you must know the difference(s) between Undefined Behaviour, Unspecified Behaviour and Implementation Defined Behaviour.
You must also know that the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified
.
For example:
int x = 5, y = 6;
int z = x++ + y++; //it is unspecified whether x++ or y++ will be evaluated first.
Another example here.
Now the Standard in §5/4
says
- 1) Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.
What does it mean?
Informally it means that between two sequence points a variable must not be modified more than once.
In an expression statement, the next sequence point
is usually at the terminating semicolon, and the previous sequence point
is at the end of the previous statement. An expression may also contain intermediate sequence points
.
From the above sentence the following expressions invoke Undefined Behaviour:
i++ * ++i; // UB, i is modified more than once btw two SPs
i = ++i; // UB, same as above
++i = 2; // UB, same as above
i = ++i + 1; // UB, same as above
++++++i; // UB, parsed as (++(++(++i)))
i = (i, ++i, ++i); // UB, there's no SP between `++i` (right most) and assignment to `i` (`i` is modified more than once btw two SPs)
But the following expressions are fine:
i = (i, ++i, 1) + 1; // well defined (AFAIK)
i = (++i, i++, i); // well defined
int j = i;
j = (++i, i++, j*i); // well defined
- 2) Furthermore, the prior value shall be accessed only to determine the value to be stored.
What does it mean? It means if an object is written to within a full expression, any and all accesses to it within the same expression must be directly involved in the computation of the value to be written.
For example in i = i + 1
all the access of i
(in L.H.S and in R.H.S) are directly involved in computation of the value to be written. So it is fine.
This rule effectively constrains legal expressions to those in which the accesses demonstrably precede the modification.
Example 1:
std::printf("%d %d", i,++i); // invokes Undefined Behaviour because of Rule no 2
Example 2:
a[i] = i++ // or a[++i] = i or a[i++] = ++i etc
is disallowed because one of the accesses of i
(the one in a[i]
) has nothing to do with the value which ends up being stored in i (which happens over in i++
), and so there's no good way to define--either for our understanding or the compiler's--whether the access should take place before or after the incremented value is stored. So the behaviour is undefined.
Example 3 :
int x = i + i++ ;// Similar to above
Follow up answer for C++11 here.
Common operators to overload
Most of the work in overloading operators is boiler-plate code. That is little wonder, since operators are merely syntactic sugar, their actual work could be done by (and often is forwarded to) plain functions. But it is important that you get this boiler-plate code right. If you fail, either your operator’s code won’t compile or your users’ code won’t compile or your users’ code will behave surprisingly.
Assignment Operator
There's a lot to be said about assignment. However, most of it has already been said in GMan's famous Copy-And-Swap FAQ, so I'll skip most of it here, only listing the perfect assignment operator for reference:
X& X::operator=(X rhs)
{
swap(rhs);
return *this;
}
Bitshift Operators (used for Stream I/O)
The bitshift operators <<
and >>
, although still used in hardware interfacing for the bit-manipulation functions they inherit from C, have become more prevalent as overloaded stream input and output operators in most applications. For guidance overloading as bit-manipulation operators, see the section below on Binary Arithmetic Operators. For implementing your own custom format and parsing logic when your object is used with iostreams, continue.
The stream operators, among the most commonly overloaded operators, are binary infix operators for which the syntax specifies no restriction on whether they should be members or non-members.
Since they change their left argument (they alter the stream’s state), they should, according to the rules of thumb, be implemented as members of their left operand’s type. However, their left operands are streams from the standard library, and while most of the stream output and input operators defined by the standard library are indeed defined as members of the stream classes, when you implement output and input operations for your own types, you cannot change the standard library’s stream types. That’s why you need to implement these operators for your own types as non-member functions.
The canonical forms of the two are these:
std::ostream& operator<<(std::ostream& os, const T& obj)
{
// write obj to stream
return os;
}
std::istream& operator>>(std::istream& is, T& obj)
{
// read obj from stream
if( /* no valid object of T found in stream */ )
is.setstate(std::ios::failbit);
return is;
}
When implementing operator>>
, manually setting the stream’s state is only necessary when the reading itself succeeded, but the result is not what would be expected.
Function call operator
The function call operator, used to create function objects, also known as functors, must be defined as a member function, so it always has the implicit this
argument of member functions. Other than this, it can be overloaded to take any number of additional arguments, including zero.
Here's an example of the syntax:
class foo {
public:
// Overloaded call operator
int operator()(const std::string& y) {
// ...
}
};
Usage:
foo f;
int a = f("hello");
Throughout the C++ standard library, function objects are always copied. Your own function objects should therefore be cheap to copy. If a function object absolutely needs to use data which is expensive to copy, it is better to store that data elsewhere and have the function object refer to it.
Comparison operators
The binary infix comparison operators should, according to the rules of thumb, be implemented as non-member functions1. The unary prefix negation !
should (according to the same rules) be implemented as a member function. (but it is usually not a good idea to overload it.)
The standard library’s algorithms (e.g. std::sort()
) and types (e.g. std::map
) will always only expect operator<
to be present. However, the users of your type will expect all the other operators to be present, too, so if you define operator<
, be sure to follow the third fundamental rule of operator overloading and also define all the other boolean comparison operators. The canonical way to implement them is this:
inline bool operator==(const X& lhs, const X& rhs){ /* do actual comparison */ }
inline bool operator!=(const X& lhs, const X& rhs){return !operator==(lhs,rhs);}
inline bool operator< (const X& lhs, const X& rhs){ /* do actual comparison */ }
inline bool operator> (const X& lhs, const X& rhs){return operator< (rhs,lhs);}
inline bool operator<=(const X& lhs, const X& rhs){return !operator> (lhs,rhs);}
inline bool operator>=(const X& lhs, const X& rhs){return !operator< (lhs,rhs);}
The important thing to note here is that only two of these operators actually do anything, the others are just forwarding their arguments to either of these two to do the actual work.
The syntax for overloading the remaining binary boolean operators (||
, &&
) follows the rules of the comparison operators. However, it is very unlikely that you would find a reasonable use case for these2.
1 As with all rules of thumb, sometimes there might be reasons to break this one, too. If so, do not forget that the left-hand operand of the binary comparison operators, which for member functions will be *this
, needs to be const
, too. So a comparison operator implemented as a member function would have to have this signature:
bool operator<(const X& rhs) const { /* do actual comparison with *this */ }
(Note the const
at the end.)
2 It should be noted that the built-in version of ||
and &&
use shortcut semantics. While the user defined ones (because they are syntactic sugar for method calls) do not use shortcut semantics. User will expect these operators to have shortcut semantics, and their code may depend on it, Therefore it is highly advised NEVER to define them.
Arithmetic Operators
Unary arithmetic operators
The unary increment and decrement operators come in both prefix and postfix flavor. To tell one from the other, the postfix variants take an additional dummy int argument. If you overload increment or decrement, be sure to always implement both prefix and postfix versions.
Here is the canonical implementation of increment, decrement follows the same rules:
class X {
X& operator++()
{
// do actual increment
return *this;
}
X operator++(int)
{
X tmp(*this);
operator++();
return tmp;
}
};
Note that the postfix variant is implemented in terms of prefix. Also note that postfix does an extra copy.2
Overloading unary minus and plus is not very common and probably best avoided. If needed, they should probably be overloaded as member functions.
2 Also note that the postfix variant does more work and is therefore less efficient to use than the prefix variant. This is a good reason to generally prefer prefix increment over postfix increment. While compilers can usually optimize away the additional work of postfix increment for built-in types, they might not be able to do the same for user-defined types (which could be something as innocently looking as a list iterator). Once you got used to do i++
, it becomes very hard to remember to do ++i
instead when i
is not of a built-in type (plus you'd have to change code when changing a type), so it is better to make a habit of always using prefix increment, unless postfix is explicitly needed.
Binary arithmetic operators
For the binary arithmetic operators, do not forget to obey the third basic rule operator overloading: If you provide +
, also provide +=
, if you provide -
, do not omit -=
, etc. Andrew Koenig is said to have been the first to observe that the compound assignment operators can be used as a base for their non-compound counterparts. That is, operator +
is implemented in terms of +=
, -
is implemented in terms of -=
etc.
According to our rules of thumb, +
and its companions should be non-members, while their compound assignment counterparts (+=
etc.), changing their left argument, should be a member. Here is the exemplary code for +=
and +
; the other binary arithmetic operators should be implemented in the same way:
class X {
X& operator+=(const X& rhs)
{
// actual addition of rhs to *this
return *this;
}
};
inline X operator+(X lhs, const X& rhs)
{
lhs += rhs;
return lhs;
}
operator+=
returns its result per reference, while operator+
returns a copy of its result. Of course, returning a reference is usually more efficient than returning a copy, but in the case of operator+
, there is no way around the copying. When you write a + b
, you expect the result to be a new value, which is why operator+
has to return a new value.3
Also note that operator+
takes its left operand by copy rather than by const reference. The reason for this is the same as the reason giving for operator=
taking its argument per copy.
The bit manipulation operators ~
&
|
^
<<
>>
should be implemented in the same way as the arithmetic operators. However, (except for overloading <<
and >>
for output and input) there are very few reasonable use cases for overloading these.
3 Again, the lesson to be taken from this is that a += b
is, in general, more efficient than a + b
and should be preferred if possible.
Array Subscripting
The array subscript operator is a binary operator which must be implemented as a class member. It is used for container-like types that allow access to their data elements by a key.
The canonical form of providing these is this:
class X {
value_type& operator[](index_type idx);
const value_type& operator[](index_type idx) const;
// ...
};
Unless you do not want users of your class to be able to change data elements returned by operator[]
(in which case you can omit the non-const variant), you should always provide both variants of the operator.
If value_type is known to refer to a built-in type, the const variant of the operator should better return a copy instead of a const reference:
class X {
value_type& operator[](index_type idx);
value_type operator[](index_type idx) const;
// ...
};
Operators for Pointer-like Types
For defining your own iterators or smart pointers, you have to overload the unary prefix dereference operator *
and the binary infix pointer member access operator ->
:
class my_ptr {
value_type& operator*();
const value_type& operator*() const;
value_type* operator->();
const value_type* operator->() const;
};
Note that these, too, will almost always need both a const and a non-const version.
For the ->
operator, if value_type
is of class
(or struct
or union
) type, another operator->()
is called recursively, until an operator->()
returns a value of non-class type.
The unary address-of operator should never be overloaded.
For operator->*()
see this question. It's rarely used and thus rarely ever overloaded. In fact, even iterators do not overload it.
Continue to Conversion Operators
Best Answer
In the following text, I will distinguish between scoped objects, whose time of destruction is statically determined by their enclosing scope (functions, blocks, classes, expressions), and dynamic objects, whose exact time of destruction is generally not known until runtime.
While the destruction semantics of class objects are determined by destructors, the destruction of a scalar object is always a no-op. Specifically, destructing a pointer variable does not destroy the pointee.
Scoped objects
automatic objects
Automatic objects (commonly referred to as "local variables") are destructed, in reverse order of their definition, when control flow leaves the scope of their definition:
If an exception is thrown during the execution of a function, all previously constructed automatic objects are destructed before the exception is propagated to the caller. This process is called stack unwinding. During stack unwinding, no further exceptions may leave the destructors of the aforementioned previously constructed automatic objects. Otherwise, the function
std::terminate
is called.This leads to one of the most important guidelines in C++:
non-local static objects
Static objects defined at namespace scope (commonly referred to as "global variables") and static data members are destructed, in reverse order of their definition, after the execution of
main
:Note that the relative order of construction (and destruction) of static objects defined in different translation units is undefined.
If an exception leaves the destructor of a static object, the function
std::terminate
is called.local static objects
Static objects defined inside functions are constructed when (and if) control flow passes through their definition for the first time.1 They are destructed in reverse order after the execution of
main
:If an exception leaves the destructor of a static object, the function
std::terminate
is called.1: This is an extremely simplified model. The initialization details of static objects are actually much more complicated.
base class subobjects and member subobjects
When control flow leaves the destructor body of an object, its member subobjects (also known as its "data members") are destructed in reverse order of their definition. After that, its base class subobjects are destructed in reverse order of the base-specifier-list:
If an exception is thrown during the construction of one of
Foo
's subobjects, then all its previously constructed subobjects will be destructed before the exception is propagated. TheFoo
destructor, on the other hand, will not be executed, since theFoo
object was never fully constructed.Note that the destructor body is not responsible for destructing the data members themselves. You only need to write a destructor if a data member is a handle to a resource that needs to be released when the object is destructed (such as a file, a socket, a database connection, a mutex, or heap memory).
array elements
Array elements are destructed in descending order. If an exception is thrown during the construction of the n-th element, the elements n-1 to 0 are destructed before the exception is propagated.
temporary objects
A temporary object is constructed when a prvalue expression of class type is evaluated. The most prominent example of a prvalue expression is the call of a function that returns an object by value, such as
T operator+(const T&, const T&)
. Under normal circumstances, the temporary object is destructed when the full-expression that lexically contains the prvalue is completely evaluated:The above function call
some_function(a + " " + b)
is a full-expression because it is not part of a larger expression (instead, it is part of an expression-statement). Hence, all temporary objects that are constructed during the evaluation of the subexpressions will be destructed at the semicolon. There are two such temporary objects: the first is constructed during the first addition, and the second is constructed during the second addition. The second temporary object will be destructed before the first.If an exception is thrown during the second addition, the first temporary object will be destructed properly before propagating the exception.
If a local reference is initialized with a prvalue expression, the lifetime of the temporary object is extended to the scope of the local reference, so you won't get a dangling reference:
If a prvalue expression of non-class type is evaluated, the result is a value, not a temporary object. However, a temporary object will be constructed if the prvalue is used to initialize a reference:
Dynamic objects and arrays
In the following section, destroy X means "first destruct X and then release the underlying memory". Similarly, create X means "first allocate enough memory and then construct X there".
dynamic objects
A dynamic object created via
p = new Foo
is destroyed viadelete p
. If you forget todelete p
, you have a resource leak. You should never attempt to do one of the following, since they all lead to undefined behavior:delete[]
(note the square brackets),free
or any other meansIf an exception is thrown during the construction of a dynamic object, the underlying memory is released before the exception is propagated. (The destructor will not be executed prior to memory release, because the object was never fully constructed.)
dynamic arrays
A dynamic array created via
p = new Foo[n]
is destroyed viadelete[] p
(note the square brackets). If you forget todelete[] p
, you have a resource leak. You should never attempt to do one of the following, since they all lead to undefined behavior:delete
,free
or any other meansIf an exception is thrown during the construction of the n-th element, the elements n-1 to 0 are destructed in descending order, the underlying memory is released, and the exception is propagated.
(You should generally prefer
std::vector<Foo>
overFoo*
for dynamic arrays. It makes writing correct and robust code much easier.)reference-counting smart pointers
A dynamic object managed by several
std::shared_ptr<Foo>
objects is destroyed during the destruction of the laststd::shared_ptr<Foo>
object involved in sharing that dynamic object.(You should generally prefer
std::shared_ptr<Foo>
overFoo*
for shared objects. It makes writing correct and robust code much easier.)