C++ – How much do forward declarations affect compile time


I am very interested in studies or empirical data comparing the compilation times of two C++ projects that are identical except that one uses forward declarations wherever possible and the other uses none.

How drastically can forward declarations change compilation time as compared to full includes?

#include "myClass.h"

vs.

class myClass;

Are there any studies that examine this?

I realize that this is a vague question that greatly depends on the project. I don't expect a hard number for an answer. Rather, I'm hoping someone may be able to direct me to a study about this.

The project I'm specifically worried about has about 1200 files. Each cpp on average has 5 headers included. Each header has on average 5 headers included. This recurses about 4 levels deep. It would seem that for each cpp compiled, around 300 headers must be opened and parsed, some many times. (There are many duplicates in the include tree.) There are guards, but the files are still opened. Each cpp is separately compiled with gcc, so there's no header caching.

To be sure no one misunderstands, I certainly advocate using forward declarations where possible. My employer, however, has banned them. I'm trying to argue against that position.

Thank you for any information.

Best Answer

Forward declarations can make for neater, more understandable code, which surely HAS to be the goal of any decision.

Couple that with the fact that, when it comes to classes, it's quite possible for two classes to rely upon each other, which makes it hard to avoid forward declarations without causing a nightmare.
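As a minimal sketch of that mutual dependency (the names Parent and Child are hypothetical, not from the question), each class can hold a pointer to the other as long as one of them is forward declared first:

```cpp
#include <cassert>

class Child;  // forward declaration: Parent only needs to know the name exists

class Parent {
public:
    void adopt(Child* c) { child_ = c; }      // storing a pointer works with an incomplete type
    Child* child() const { return child_; }
private:
    Child* child_ = nullptr;
};

class Child {
public:
    explicit Child(Parent* p) : parent_(p) {}
    Parent* parent() const { return parent_; }
private:
    Parent* parent_;
};

// Small self-check showing the two classes referring to each other.
void demoMutualReference() {
    Parent p;
    Child c(&p);
    p.adopt(&c);
    assert(p.child() == &c);
    assert(c.parent() == &p);
}
```

Without the forward declaration, Parent could not mention Child at all, and including each header from the other would not help because one of the two classes must still be seen first.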

Equally, forward-declaring classes in a header means that you only need to include the relevant headers in the CPPs that actually USE those classes. That actually DECREASES compile time.
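A rough sketch of that layout, collapsed into one listing so it compiles on its own. The file names in the comments and the classes Widget and WidgetUser are assumptions made up for illustration:

```cpp
#include <cassert>

// --- widget_user.h (hypothetical header) ---
class Widget;                      // forward declaration instead of #include "widget.h"

class WidgetUser {
public:
    explicit WidgetUser(Widget* w) : widget_(w) {}
    int widgetId() const;          // defined in the .cpp, where Widget is complete
private:
    Widget* widget_;               // a pointer member only needs the declaration
};

// --- widget.h (hypothetical header, included only by the .cpp files that need it) ---
class Widget {
public:
    explicit Widget(int id) : id_(id) {}
    int id() const { return id_; }
private:
    int id_;
};

// --- widget_user.cpp (hypothetical source file) ---
int WidgetUser::widgetId() const {
    assert(widget_ != nullptr);
    return widget_->id();          // member access requires the full definition
}
```

Any file that includes only widget_user.h never parses widget.h, and it does not need rebuilding when widget.h changes.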

Edit: Given your comment above, I would point out that it is ALWAYS slower to include a header file than to forward declare. Any time you include a header you are necessitating a load from disk, often only to find out that the header guards mean that nothing happens. That wastes immense amounts of time, and it is really a VERY stupid rule to be bringing in.

Edit 2: Hard data is pretty hard to obtain. Anecdotally, I once worked on a project that wasn't strict about its header includes, and the build time was roughly 45 minutes on a P3 500 MHz with 512 MB of RAM (this was a while back). After spending 2 weeks cutting down the include nightmare (by using forward declarations) I had managed to get the code to build in a little under 4 minutes. Subsequently, using forward declarations whenever possible became a rule.

Edit 3: It's also worth bearing in mind that there is a huge advantage from using forward declarations when it comes to making small modifications to your code. If headers are included all over the shop, then a modification to a header file can cause vast numbers of files to be rebuilt.

I also note lots of other people extolling the virtues of pre-compiled headers (PCHs). They have their place and they can really help but they really shouldn't be used as an alternative to proper forward declaration. Otherwise modifications to header files can cause issues with recompilation of lots of files (as mentioned above) as well as triggering a PCH rebuild. PCHs can provide a big win for things like libraries that are pre-built but they are no reason not to use proper forward declarations.

Related Solutions

C++ – Templates: Use forward declarations to reduce compile time

You can't forward declare "parts" of classes like that. Even if you could, you'd still need to instantiate the code somewhere so you could link against it. There are ways to handle it: you could make yourself a little library with instantiations of common containers (e.g. vector) and link them in. Then you'd only ever need to compile e.g. vector<int> once. To implement this you'll need to use something like -fno-implicit-templates (at least assuming you are sticking with g++) and explicitly instantiate the template in your lib with template class std::vector<int>;

So, a real working example. Here I have 2 files, a.cpp and b.cpp

a.cpp:

#include <vector> // still need to know the interface
#include <cstdlib>

int main(int argc, char **argv) {
  std::vector<int>* vec = new std::vector<int>();
  vec->push_back(3);
  delete vec;
  return EXIT_SUCCESS;
}

So now I can compile a.cpp with -fno-implicit-templates:

g++ -fno-implicit-templates -c a.cpp

This will give me a.o. If I then try to link a.o I get:

g++ a.o
/usr/bin/ld: Undefined symbols:
std::vector<int, std::allocator<int> >::_M_insert_aux(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, int const&)
void std::_Destroy<int*, std::allocator<int> >(int*, int*, std::allocator<int>)
collect2: ld returned 1 exit status

No good. So we turn to b.cpp:

#include <vector>
template class std::vector<int>;
template void std::_Destroy(int*,int*, std::allocator<int>);
template void std::__uninitialized_fill_n_a(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, unsigned long, int const&, std::allocator<int>);
template void std::__uninitialized_fill_n_a(int*, unsigned long, int const&, std::allocator<int>);
template void std::fill(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, int const&);
template __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > > std::fill_n(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, unsigned long, int const&);
template int* std::fill_n(int*, unsigned long, int const&);
template void std::_Destroy(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, std::allocator<int>);

Now you're saying to yourself, where did all these extra template things come from? I see the template class std::vector<int> and that's fine, but what about the rest of it? Well, the short answer is that the implementations of these things are by necessity a little messy, and when you manually instantiate them, by extension some of this messiness leaks out. You're probably wondering how I even figured out what I needed to instantiate. Well, I used the linker errors ;).

So now we compile b.cpp

g++ -fno-implicit-templates -c b.cpp

And we get b.o. Linking a.o and b.o we can get

g++ a.o b.o

Hooray, no linker errors.

So, to get into some details about your updated question, if this is a home brewed class it doesn't necessarily have to be this messy. For instance, you can separate the interface from the implementation, e.g. say we have c.h, c.cpp, in addition to a.cpp and b.cpp

c.h

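template<typename T>
class MyExample {
  T m_t; // private by default
public:  // the constructor and accessors must be public for a.cpp to call them
  MyExample(const T& t);
  T get();
  void set(const T& t);
};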

c.cpp

template<typename T>
MyExample<T>::MyExample(const T& t) : m_t(t) {}
template<typename T>
T MyExample<T>::get() { return m_t; }
template<typename T>
void MyExample<T>::set(const T& t) { m_t = t; }

a.cpp

 #include "c.h" // only need interface
 #include <cstdlib> // EXIT_SUCCESS
 #include <iostream>
 int main() {
   MyExample<int> x(10);
   std::cout << x.get() << std::endl;
   x.set( 9 );
   std::cout << x.get() << std::endl;
   return EXIT_SUCCESS;
 }

b.cpp, the "library":

 #include "c.h" // need interface
 #include "c.cpp" // need implementation to actually instantiate it
 template class MyExample<int>;

Now you compile b.cpp to b.o once. When a.cpp changes you just need to recompile that and link in b.o.
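If a C++11 compiler is available, a standard alternative to the g++-specific -fno-implicit-templates flag is an explicit instantiation declaration (extern template). This is a different technique from the one used above, shown only as a sketch: it reuses the MyExample layout, collapsed into one listing, and the comments mark which file each part would plausibly live in. The function useExample is a hypothetical stand-in for a.cpp.

```cpp
#include <cassert>

// --- c.h ---
template <typename T>
class MyExample {
public:
    explicit MyExample(const T& t);
    T get() const;
    void set(const T& t);
private:
    T m_t;
};

// Tells every file that includes c.h not to instantiate MyExample<int> itself.
extern template class MyExample<int>;

// --- c.cpp / b.cpp, the "library" ---
template <typename T>
MyExample<T>::MyExample(const T& t) : m_t(t) {}
template <typename T>
T MyExample<T>::get() const { return m_t; }
template <typename T>
void MyExample<T>::set(const T& t) { m_t = t; }

// The single explicit instantiation definition that the linker will find.
template class MyExample<int>;

// --- a.cpp (hypothetical user code) ---
int useExample() {
    MyExample<int> x(10);
    assert(x.get() == 10);
    x.set(9);
    return x.get();
}
```

With this arrangement no special compiler flag is needed, and only the types named in extern template declarations are affected.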

C++ – Undefined behavior and sequence points

C++98 and C++03

This answer is for the older versions of the C++ standard. The C++11 and C++14 versions of the standard do not formally contain 'sequence points'; operations are 'sequenced before' or 'unsequenced' or 'indeterminately sequenced' instead. The net effect is essentially the same, but the terminology is different.

Disclaimer : Okay. This answer is a bit long. So have patience while reading it. If you already know these things, reading them again won't make you crazy.

Pre-requisites : An elementary knowledge of the C++ Standard

What are Sequence Points?

The Standard says

At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (§1.9/7)

Side effects? What are side effects?

Evaluation of an expression produces something and if in addition there is a change in the state of the execution environment it is said that the expression (its evaluation) has some side effect(s).

For example:

int x = y++; //where y is also an int

In addition to the initialization operation the value of y gets changed due to the side effect of ++ operator.

So far so good. Moving on to sequence points. An alternative definition of sequence points, given by the comp.lang.c author Steve Summit:

Sequence point is a point in time at which the dust has settled and all side effects which have been seen so far are guaranteed to be complete.

What are the common sequence points listed in the C++ Standard ?

Those are:

at the end of the evaluation of full expression (§1.9/16) (A full-expression is an expression that is not a subexpression of another expression.)¹

Example :
```
int a = 5; // ; is a sequence point here
```
in the evaluation of each of the following expressions after the evaluation of the first expression (§1.9/18) ²
- a && b (§5.14)
- a || b (§5.15)
- a ? b : c (§5.16)
- a , b (§5.18) (here a , b is a comma operator; in func(a,a++) the , is not a comma operator but merely a separator between the arguments a and a++, so the behaviour is undefined in that case if a is a primitive type)

at a function call (whether or not the function is inline), after the evaluation of all function arguments (if any), which takes place before execution of any expressions or statements in the function body (§1.9/17).

¹ Note: the evaluation of a full-expression can include the evaluation of subexpressions that are not lexically part of the full-expression. For example, subexpressions involved in evaluating default argument expressions (8.3.6) are considered to be created in the expression that calls the function, not the expression that defines the default argument.

² The operators indicated are the built-in operators, as described in clause 5. When one of these operators is overloaded (clause 13) in a valid context, thus designating a user-defined operator function, the expression designates a function invocation and the operands form an argument list, without an implied sequence point between them.
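The sequence points introduced by the built-in &&, comma and ?: operators can be illustrated with a short sketch. The function names are made up for this example; each expression modifies i in the left operand and reads it afterwards, which is well defined only because of the sequence point in between:

```cpp
#include <cassert>

// && : i++ is complete before the right operand reads i.
bool incrementThenCompare(int& i, int expected) {
    return i++ >= 0 && i == expected;
}

// comma operator: ++i is complete before i * 2 is evaluated.
int incrementThenDouble(int& i) {
    return (++i, i * 2);
}

// ?: : i++ is complete before either branch reads i.
int incrementThenSelect(int& i) {
    return i++ > 0 ? i : -i;
}

void demoSequencePoints() {
    int a = 1;
    assert(incrementThenCompare(a, 2));    // a is 2 when the right operand runs
    int b = 3;
    assert(incrementThenDouble(b) == 8);   // b is 4 when it is doubled
    int c = 5;
    assert(incrementThenSelect(c) == 6);   // c is 6 when the branch reads it
}
```

As footnote 2 says, these guarantees apply to the built-in operators only; under the C++98/C++03 rules an overloaded operator&& or operator, is an ordinary function call with no sequence point between its operands.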

What is Undefined Behaviour?

The Standard defines Undefined Behaviour in Section §1.3.12 as

behavior, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements ³.

Undefined behavior may also be expected when this International Standard omits the description of any explicit definition of behavior.

³ Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

In short, undefined behaviour means anything can happen from daemons flying out of your nose to your girlfriend getting pregnant.

What is the relation between Undefined Behaviour and Sequence Points?

Before I get into that you must know the difference(s) between Undefined Behaviour, Unspecified Behaviour and Implementation Defined Behaviour.

You must also know that the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified.

For example:

int x = 5, y = 6;

int z = x++ + y++; //it is unspecified whether x++ or y++ will be evaluated first.
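Unspecified order is easier to observe when the operands are function calls. In the sketch below (callLog, first and second are hypothetical names), the sum is always the same, but which call runs first is up to the compiler:

```cpp
#include <cassert>
#include <string>

std::string callLog;  // records the order in which the operands were evaluated

int first()  { callLog += 'f'; return 5; }
int second() { callLog += 's'; return 6; }

int sumInUnspecifiedOrder() {
    callLog.clear();
    return first() + second();  // either call may run first
}

void demoUnspecifiedOrder() {
    assert(sumInUnspecifiedOrder() == 11);       // the value is the same either way
    assert(callLog == "fs" || callLog == "sf");  // both orders are permitted
}
```

This is unspecified rather than undefined behaviour: the program is valid and the result is one of a fixed set of possibilities.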

Another example here.

Now the Standard in §5/4 says

1) Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.

What does it mean?

Informally it means that between two sequence points a variable must not be modified more than once. In an expression statement, the next sequence point is usually at the terminating semicolon, and the previous sequence point is at the end of the previous statement. An expression may also contain intermediate sequence points.

From the above sentence the following expressions invoke Undefined Behaviour:

i++ * ++i;   // UB, i is modified more than once between two SPs
i = ++i;     // UB, same as above
++i = 2;     // UB, same as above
i = ++i + 1; // UB, same as above
++++++i;     // UB, parsed as (++(++(++i)))

i = (i, ++i, ++i); // UB, there's no SP between `++i` (right most) and assignment to `i` (`i` is modified more than once between two SPs)

But the following expressions are fine:

i = (i, ++i, 1) + 1; // well defined (AFAIK)
i = (++i, i++, i);   // well defined 
int j = i;
j = (++i, i++, j*i); // well defined

2) Furthermore, the prior value shall be accessed only to determine the value to be stored.

What does it mean? It means if an object is written to within a full expression, any and all accesses to it within the same expression must be directly involved in the computation of the value to be written.

For example, in i = i + 1 all accesses of i (on the left-hand side and on the right-hand side) are directly involved in the computation of the value to be written. So it is fine.

This rule effectively constrains legal expressions to those in which the accesses demonstrably precede the modification.

Example 1:

std::printf("%d %d", i,++i); // invokes Undefined Behaviour because of Rule no 2

Example 2:

a[i] = i++; // or a[++i] = i; or a[i++] = ++i; etc.

is disallowed because one of the accesses of i (the one in a[i]) has nothing to do with the value which ends up being stored in i (which happens over in i++), and so there's no good way to define--either for our understanding or the compiler's--whether the access should take place before or after the incremented value is stored. So the behaviour is undefined.

Example 3 :

int x = i + i++; // similar to above
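The usual fix for all three examples is to move the modification of i into its own statement, so that a sequence point separates it from the other accesses. The sketch below shows one plausible intended meaning for each expression; the function names are hypothetical, and a different ordering may be what the original author wanted:

```cpp
#include <cassert>
#include <cstdio>

// Instead of: std::printf("%d %d", i, ++i);
void printThenNext(int& i) {
    int before = i;   // read the old value first
    ++i;              // modify i in a separate statement
    std::printf("%d %d\n", before, i);
}

// Instead of: a[i] = i++;
void storeThenAdvance(int* a, int& i) {
    a[i] = i;         // uses the old value as both index and stored value
    ++i;
}

// Instead of: int x = i + i++;
int doubleThenAdvance(int& i) {
    int x = i + i;    // both reads see the old value
    ++i;
    return x;
}

void demoRewrites() {
    int a[3] = {0, 0, 0};
    int i = 1;
    storeThenAdvance(a, i);
    assert(a[1] == 1 && i == 2);
    assert(doubleThenAdvance(i) == 4 && i == 3);
}
```

Each rewrite makes the chosen order explicit, so the result no longer depends on the compiler.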

Follow up answer for C++11 here.