C++ – Do any compilers do this optimization for virtual calls

c++ optimization virtual-functions

This just came to mind, and I'm not really sure how to search for it.

Let's say you have the following classes:

class A
{
public:
    virtual void Foo() = 0;

    virtual void ManyFoo(int N) 
    {
        for (int i = 0; i < N; ++i) Foo();
    } 
};

class B : public A
{
public:
    virtual void Foo()
    {
        // Do something
    }
};

Do any compilers create a version of ManyFoo() for B that inlines the call to B::Foo()?

If not, does making B a final class enable this optimization?

Edit:
I was specifically wondering whether this was done anywhere for virtual calls (so, aside from when the whole call to ManyFoo() is inlined itself).

Best Answer

I believe the term you're looking for is "devirtualization".

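Anyway, did you try it? If we put a slightly extended version of that example (const member functions, B marked final, and an opaque external call inside Foo()) into Compiler Explorer: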

extern void extCall ();

class A
{
public:
    virtual void Foo() const = 0;

    virtual void ManyFoo(int N) const
    {
        for (int i = 0; i < N; ++i) Foo();
    } 
};

class B final : public A
{
public:
    virtual void Foo() const
    {
        extCall ();
    }
};

void b_value_foo (B b) {
    b.ManyFoo (6);
}

void b_ref_foo (B const & b) {
    b.ManyFoo (6);
}

void b_indirect_foo (B b) {
    b_ref_foo (b);
}

...GCC is able to produce the following with -Os:

b_value_foo(B):
        push    rax
        call    extCall()
        call    extCall()
        call    extCall()
        call    extCall()
        call    extCall()
        pop     rdx
        jmp     extCall()
b_ref_foo(B const&):
        mov     rax, QWORD PTR [rdi]
        mov     esi, 6
        mov     rax, QWORD PTR [rax+8]
        jmp     rax
b_indirect_foo(B):
        jmp     b_ref_foo(B const&)

GCC inlines through the virtual call when it is 100% sure of the concrete type of the object b (n.b. if we change -Os to -O2, it also fully inlines b_indirect_foo). It can't be sure of the concrete type of an object that it only sees through a reference it can't trace back to an instance, and it doesn't seem to trust the final annotation on the class to overrule this (probably because that would be very ABI-fragile; I personally wouldn't want it to).

It does trust final annotations on member functions, but the structure of your example precludes that: the call to Foo() sits inside A::ManyFoo(), where this is only known to be an A, so marking B::Foo() as final tells the compiler nothing at that call site.
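To illustrate the point about final member functions, here is a minimal sketch of a restructured variant in which the loop lives in B itself, so the static type at the call site is B. The ManyFoo() override in B, the g_calls counter, and the helpers count_calls_through_reference and demo are all hypothetical names added for illustration; they are not part of the original example. The language permits a direct call to B::Foo() here, but whether a particular compiler actually emits one still has to be checked in the generated assembly.

```cpp
#include <cassert>

// Hypothetical counter standing in for the opaque extCall() of the example above.
static int g_calls = 0;

class A
{
public:
    virtual ~A() = default;
    virtual void Foo() const = 0;

    virtual void ManyFoo(int N) const
    {
        // Here `this` is only known to be an A, so Foo() stays a virtual call.
        for (int i = 0; i < N; ++i) Foo();
    }
};

class B : public A
{
public:
    // final on the member function: no class derived from B may override Foo().
    void Foo() const final
    {
        ++g_calls;
    }

    // Assumption: B gets its own copy of the loop. Inside it `this` is a
    // B const*, and Foo() is final in B, so the call can only reach B::Foo().
    // The compiler is therefore allowed to call (and inline) it directly.
    void ManyFoo(int N) const final
    {
        for (int i = 0; i < N; ++i) Foo();
    }
};

// Takes B by reference, like b_ref_foo above. Because ManyFoo() is final in B,
// this call is also a candidate for devirtualization.
int count_calls_through_reference(B const & b, int n)
{
    g_calls = 0;
    b.ManyFoo(n);
    return g_calls;
}

void demo()
{
    B b;
    assert(count_calls_through_reference(b, 6) == 6);
}
```

The trade-off is duplicated loop code in every derived class that wants the direct call, which is essentially the per-class specialization of ManyFoo() that the question asks the compiler to generate on its own.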

GCC has had this optimization for several versions. Clang and MSVC don't seem to do it in this case (though both advertise the feature), so how far devirtualization goes clearly varies a lot between examples and compilers.