C++ – Why and How Virtual Functions Are Slower

abstract classcfunctionspointers

Can anyone explain in detail, how exactly the virtual table works and what pointers are associated when virtual functions are called.

If they are actually slower, can you show the time that the virtual function takes to execute is more than normal class methods? It is easy to lose track of how/what is happening without seeing some code.

Best Answer

Virtual methods are commonly implemented via so-called virtual method tables (vtable for short), in which function pointers are stored. This adds indirection to the actual call (gotta fetch the address of the function to call from the vtable, then call it -- as opposed to just calling it right ahead). Of course, this takes some time and some more code.

However, it is not necessarily the primary cause of slowness. The real problem is that the compiler (generally/usually) cannot know which function will be called. So it can't inline it or perform any other such optimizations. This alone might add a dozen pointless instructions (preparing registers, calling, then restoring state afterwards), and might inhibit other, seemingly unrelated optimizations. Moreover, if you branch like crazy by calling many different implementations, you suffer the same hits you'd suffer from branching like crazy via other means: The cache and branch predictor won't help you, the branches will take longer than a perfectly predictable branch.

Big but: These performance hits are usually too tiny to matter. They're worth considering if you want to create a high-performance code and consider adding a virtual function that would be called at alarming frequency. However, also keep in mind that replacing virtual function calls with other means of branching (if .. else, switch, function pointers, etc.) won't solve the fundamental issue -- it may very well be slower. The problem (if it exists at all) isn't virtual functions but (unnecessary) indirection.

Edit: The difference in the call instructions is described in other answers. Basically, the code for a static ("normal") call is:

  • Copy some registers on the stack, to allow the called function to use those registers.
  • Copy the arguments into predefined locations, so that the called function can find them regardless from where it is called.
  • Push the return address.
  • Branch/jump to the function's code, which is a compile-time address and hence hardcoded in the binary by the compiler/linker.
  • Get the return value from a predefined location and restore registers we want to use.

A virtual call does exactly the same thing, except that the function address is not known at compile time. Instead, a couple of instructions ...

  • Get the vtable pointer, which points to an array of function pointers (function addresses), one for each virtual function, from the object.
  • Get the right function address from the vtable into a register (the index where the correct function address is stored is decided at compile-time).
  • Jump to the address in that register, rather than jumping to a hardcoded address.

As for branches: A branch is anything which jumps to another instruction instead of just letting the next instruction execute. This includes if, switch, parts of various loops, function calls, etc. and sometimes the compiler implements things that don't seem to branch in a way that actually needs a branch under the hood. See Why is processing a sorted array faster than an unsorted array? for why this may be slow, what CPUs do to counter this slowdown, and how this isn't a cure-all.