C# and C++ – Implementing Pure Abstract Classes and Interfaces

abstract classcinterfacesoptimization

Although this isn't mandatory in the C++ standard, it seems the way GCC for example, implements parent classes, including pure abstract ones, is by including a pointer to the v-table for that abstract class in every instantiation of the class in question.

Naturally this bloats the size of every instance of this class by a pointer for every parent class it has.

But I've noticed that many C# classes and structs have lots of parent interfaces, which are basically pure abstract classes. I would be surprised if every instance of say Decimal, was bloated with 6 pointers to all it's various interfaces.

So if C# does do interfaces differently, how does it do them, at least in a typical implementation (I understand the standard itself may not define such an implementation)? And do any C++ implementations have a way of avoiding object size bloat when adding pure virtual parents to classes?

Best Answer

In C# and Java implementations, the objects typically have a single pointer to its class. This is possible because they are single-inheritance languages. The class structure then contains the vtable for the single-inheritance hierarchy. But calling interface methods has all the problems of multiple inheritance as well. This is typically solved by putting additional vtables for all implemented interfaces into the class structure. This saves space compared to typical virtual inheritance implementations in C++, but makes interface method dispatch more complicated – which can be partially compensated by caching.

E.g. in the OpenJDK JVM, each class contains an array of vtables for all implemented interfaces (an interface vtable is called an itable). When an interface method is called, this array is searched linearly for the itable of that interface, then the method can be dispatched through that itable. Caching is used so that each call site remembers the result of the method dispatch, so this search only has to be repeated when the concrete object type changes. Pseudocode for method dispatch:

// Dispatch SomeInterface.method
Method const* resolve_method(
    Object const* instance, Klass const* interface, uint itable_slot) {

  Klass const* klass = instance->klass;

  for (Itable const* itable : klass->itables()) {
    if (itable->klass() == interface)
      return itable[itable_slot];
  }

  throw ...;  // class does not implement required interface
}

(Compare the real code in the OpenJDK HotSpot interpreter or x86 compiler.)

C# (or more precisely, the CLR) uses a related approach. However, here the itables don't contain pointers to the methods, but are slot maps: they point to entries in the main vtable of the class. As with Java, having to search for the correct itable is only the worst case scenario, and it is expected that caching at the call site can avoid this search nearly always. The CLR uses a technique called Virtual Stub Dispatch in order to patch the JIT-compiled machine code with different caching strategies. Pseudocode:

Method const* resolve_method(
    Object const* instance, Klass const* interface, uint interface_slot) {

  Klass const* klass = instance->klass;

  // Walk all base classes to find slot map
  for (Klass const* base = klass; base != nullptr; base = base->base()) {
    // I think the CLR actually uses hash tables instead of a linear search
    for (SlotMap const* slot_map : base->slot_maps()) {
      if (slot_map->klass() == interface) {
        uint vtable_slot = slot_map[interface_slot];
        return klass->vtable[vtable_slot];
      }
    }
  }

  throw ...;  // class does not implement required interface
}

The main difference to the OpenJDK-pseudocode is that in OpenJDK each class has an array of all directly or indirectly implemented interfaces, whereas the CLR only keeps an array of slot maps for interfaces that were directly implemented in that class. We therefore need to walk the inheritance hierarchy upwards until a slot map is found. For deep inheritance hierarchies this results in space savings. These are particularly relevant in CLR due to the way how generics are implemented: for a generic specialization, the class structure is copied and methods in the main vtable may be replaced by specializations. The slot maps continue to point at the correct vtable entries and can therefore be shared between all generic specializations of a class.

As an ending note, there are more possibilities to implement interface dispatch. Instead of placing the vtable/itable pointer in the object or in the class structure, we can use fat pointers to the object, that are basically a (Object*, VTable*) pair. The drawback is that this doubles the size of pointers and that upcasts (from a concrete type to an interface type) are not free. But it's more flexible, has less indirection, and also means that interfaces can be implemented externally from a class. Related approaches are used by Go interfaces, Rust traits, and Haskell typeclasses.

References and further reading:

Interface dispatch. An expanded version of this answer, complete with diagrams and in-depth discussion.
Wikipedia: Inline caching. Discusses caching approaches that can be used to avoid expensive method lookup. Typically not needed for vtable-based dispatch, but very desirable for more expensive dispatch mechanisms like the above interface dispatch strategies.
OpenJDK Wiki (2013): Interface Calls. Discusses itables.
Pobar, Neward (2009): SSCLI 2.0 Internals. Chapter 5 of the book discusses slot maps in great detail. Was never published but made available by the authors on their blogs. The PDF link has since moved. This book probably no longer reflects the current state of the CLR.
CoreCLR (2006): Virtual Stub Dispatch. In: Book Of The Runtime. Discusses slot maps and caching to avoid expensive lookups.
Kennedy, Syme (2001): Design and Implementation of Generics for the .NET Common Language Runtime. (PDF link). Discusses various approaches to implement generics. Generics interact with method dispatch because methods might be specialized so vtables might have to be rewritten.

Related Solutions

C# Design Patterns – Using Abstract Base Class with Interfaces as Behaviors

[1] Add virtual "getters" & "setters" to your properties, I had to hack another control library, because I need this feature:

abstract class Control
{
    // internal fields for properties
    protected int _Width;
    protected int _Height;
    protected int _BackColor;
    protected int _X;
    protected int _Y;
    protected bool Visible;

    protected int BorderWidth;
    protected int BorderColor;


    // getters & setters virtual !!!

    public virtual int getWidth(...) { ... }
    public virtual void setWidth(...) { ... }

    public virtual int getHeight(...) { ... }
    public virtual void setHeight(...) { ... }

    public virtual int getBackColor(...) { ... }
    public virtual void setBackColor(...) { ... }

    public virtual int getX(...) { ... }
    public virtual void setX(...) { ... }

    public virtual int getY(...) { ... }
    public virtual void setY(...) { ... }

    public virtual int getBorderWidth(...) { ... }
    public virtual void setBorderWidth(...) { ... }

    public virtual int getBorderColor(...) { ... }
    public virtual void setBorderColor(...) { ... }

    public virtual bool getVisible(...) { ... }
    public virtual void setVisible(...) { ... }

    // properties WITH virtual getters & setters

    public int Width { get getWidth(); set setWidth(value); }

    public int Height { get getHeight(); set setHeight(value); }

    public int BackColor { get getBackColor(); set setBackColor(value); }

    public int X { get getX(); set setX(value); }

    public int Y { get getY(); set setY(value); }

    public int BorderWidth { get getBorderWidth(); set setBorderWidth(value); }

    public int BorderColor { get getBorderColor(); set setBorderColor(value); }

    public bool Visible { get getVisible(); set setVisible(value); }

    // other methods

    public Rectangle ClipRectange { get; protected set; }   
    abstract public void Draw();
} // class Control

/* concrete */ class MyControl: Control
{
    public override bool getVisible(...) { ... }
    public override void setVisible(...) { ... }
} // class MyControl: Control

I know this suggestion is more "verbose" or complex, but, its very useful in real world.

[2] Add an "IsEnabled" property, do not confuse with "IsReadOnly":

abstract class Control
{
    // internal fields for properties
    protected bool _IsEnabled;

    public virtual bool getIsEnabled(...) { ... }
    public virtual void setIsEnabled(...) { ... }

    public bool IsEnabled{ get getIsEnabled(); set setIsEnabled(value); }
} // class Control

Means you may have to show your control, but don't show any information.

[3] Add a "IsReadOnly" property, do not confuse with "IsEnabled":

abstract class Control
{
    // internal fields for properties
    protected bool _IsReadOnly;

    public virtual bool getIsReadOnly(...) { ... }
    public virtual void setIsReadOnly(...) { ... }

    public bool IsReadOnly{ get getIsReadOnly(); set setIsReadOnly(value); }
} // class Control

That means the control can display information, but cannot be changed by the user.

C++ vs Java – Usage Rationale for Abstract Classes and Interfaces

OOP has composition and substitution.

C++ has multiple inheritance, template specialisation, embedding and value/move/pointer semantics.

Java has single inheritance and interfaces, embedding and reference semantics.

The common way the OOP school uses these languages is to employ inheritance for object substitution and embedding for composition. But you also need a common ancestor and a way to runtime-cast (in C++ is called dynamic_cast, in Java is just asking an interface from another).

Java does all this by its own java.lang.Object rooted hierachy. C++ doesn't have a predefined common root, so you should at least define it, to come to a same "picture" (but this is limiting some C++ possibilities...).

After that, the possibility to have compile-time polymorphism (think to CRTP) and value semantic can offer also other alternatives to the way the concept of "OOP object" can be ported into a C++ program.

You can even imagine the heresy to use embedding and implicit conversion to manage substitution and private inheritance to manage composition, in fact inverting the traditional school paradigm. (Of course, this way is 20 years younger than the other, so don't expect a wide community support in doing that)

Or you can imagine a virtual common base to all classes, form interface (no implementation) to final classes (fully implemented) going through partially implemented interfaces an even interface clusters, using "dominance" as dispatching from interface to implementations through a "multi stacked-parallelogram" inheritance scheme.

Comparing OOP to java to C++ assuming there is just one and only OOP way is limiting the capabilities of both the languages.

Forcing C++ to strictly adhere to Java coding idioms is denaturating C++ as forcing Java to behave as a C++-like language is denaturating Java.

Is not a matter of "sensibility" but of different "aggregation mechanisms" the two languages have and different way to combine them that makes some idiom more profitable in one language than the other and vice versa.

Best Answer

Related Solutions

C# Design Patterns – Using Abstract Base Class with Interfaces as Behaviors

C++ vs Java – Usage Rationale for Abstract Classes and Interfaces

Related Topic