Architecture – Why does the .NET framework have no concept of classes as first-class types


It's well known to those familiar with the history that C# and the .NET framework started out as essentially "Delphi rewritten to feel like Java," architected by the chief developer behind Delphi, Anders Hejlsberg. Things have diverged quite a bit since then, but early on the similarities were so obvious that there was even some serious speculation that .NET was actually originally Borland's product.

But I've been looking at some .NET stuff lately, and one of the most interesting and useful features from Delphi seems to be missing entirely: the concept of classes as a first-class data type. For those not familiar with it, the type TClass represents a reference to a class, similar to the Type type in .NET. But where .NET uses Type for reflection, Delphi uses TClass as a very important built-in part of the language. It allows for various useful idioms that simply don't and can't exist without it, such as class subtype variables and virtual class methods.

Every OO language has virtual methods, in which different classes implement the same fundamental concept of a method in different ways, and then the right method gets called at runtime based on the actual type of the object instance it's called on. Delphi extends this concept to classes. A TClass reference can be declared as a specific class subtype: class of TMyClass means that the variable can accept any class reference that inherits from TMyClass, but nothing outside the hierarchy. If that class has class-scope virtual methods attached to it, they can be called without an instance, and dispatch is based on the actual class the reference holds. Applying this pattern to constructors makes a Factory implementation trivial, for example.
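For readers who don't know Delphi, here is a rough analogy in Python, where classes are also ordinary values. This is only a sketch of the idea, not Delphi code: the names Shape, Circle, Square, create and build are made up for illustration, and the issubclass check stands in for what Delphi's class of TMyClass enforces at compile time.

```python
class Shape:
    """Base class; stands in for TMyClass in the text."""

    def __init__(self, name):
        self.name = name

    @classmethod
    def describe(cls):
        # Class-scope method: dispatched on the class itself, no instance needed.
        return "generic shape"

    @classmethod
    def create(cls, name):
        # Plays the role of a virtual constructor: `cls` is the actual class
        # held by the reference, so the right subclass gets built.
        return cls(name)


class Circle(Shape):
    @classmethod
    def describe(cls):
        return "circle"


class Square(Shape):
    @classmethod
    def describe(cls):
        return "square"


def build(shape_class, name):
    """Factory taking a class reference (roughly `class of Shape`)."""
    if not issubclass(shape_class, Shape):
        # Delphi would reject this at compile time; here it is a runtime check.
        raise TypeError("expected Shape or a subclass")
    return shape_class.create(name)


shape_class = Circle            # a variable holding a class, not an instance
kind = shape_class.describe()   # resolved against Circle, no object involved
obj = build(Square, "s1")       # builds a Square through the class reference
```

The factory never names a concrete subclass; it just calls through whatever class reference it was handed.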

There doesn't seem to be anything equivalent in .NET. Given how useful class references (and especially virtual constructors and other virtual class methods!) are, has anyone said anything about why they were left out?

Specific Examples

Form Deserialization

The Delphi VCL saves forms in DFM format, a DSL for describing a component hierarchy. When the form reader parses DFM data, it runs across objects that are described like this:

object Name: ClassName
   property = value
   property = value
   ...
   object SubObjectName: ClassName
      ...
   end
end

The interesting thing here is the ClassName part. Each component class registers its TClass with the component streaming system at initialization time (think static constructors, only slightly different: guaranteed to happen immediately on startup). This registers each class in a string->TClass hashmap with the class name as the key.

Each component descends from TComponent, which has a virtual constructor that takes a single argument, Owner: TComponent. Any component can override this constructor to provide for its own initialization. When the DFM reader reads a class name, it looks up the name in the aforementioned hashmap and retrieves the corresponding class reference (or raises an exception if it's not there), then calls the virtual TComponent constructor on it. That call is known to be good because the registration function only accepts class references that descend from TComponent, and you end up with an object of the proper type.
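The registration-and-lookup scheme just described can be sketched in Python, again only as an analogy. Component, Button, register_class and create_component are hypothetical names, not the VCL's actual API, and the explicit register_class calls stand in for Delphi's startup-time registration.

```python
class Component:
    """Stand-in for TComponent: every component is built from an owner."""

    def __init__(self, owner=None):
        self.owner = owner
        self.children = []
        if owner is not None:
            owner.children.append(self)


class Button(Component):
    def __init__(self, owner=None):
        super().__init__(owner)
        self.caption = ""   # subclass-specific initialization


_registry = {}  # class name -> class reference


def register_class(component_class):
    # Only Component descendants are accepted, so the reader can rely on
    # the (owner) constructor signature later.
    if not (isinstance(component_class, type)
            and issubclass(component_class, Component)):
        raise TypeError("only Component subclasses can be registered")
    _registry[component_class.__name__] = component_class


def create_component(class_name, owner=None):
    # Look the name up, then construct through the class reference.
    try:
        component_class = _registry[class_name]
    except KeyError:
        raise LookupError(f"class {class_name!r} is not registered") from None
    return component_class(owner)


register_class(Component)
register_class(Button)

form = create_component("Component")
button = create_component("Button", form)
```

The reader code only ever sees a name and an owner; which constructor runs is decided by the class reference found in the registry.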

Lacking this, the WinForms equivalent is… well… a big mess to put it bluntly, requiring any new .NET language to completely re-implement its own form (de)serialization. This is a bit shocking when you think about it; since the whole point of having a CLR is to let multiple languages use the same basic infrastructure, a DFM-style system would have made perfect sense.

Extensibility

An image manager class I wrote can be provided with a data source (such as a path to your image files) and then load new image objects automatically if you attempt to retrieve a name that's not in the collection but is available in the data source. It has a class variable typed as class of the base image class, representing the class of any new objects to be created. It comes with a default, but at some points, when creating new images with special purposes, the images should be set up in different ways (creating one without an alpha channel, retrieving special metadata from a PNG file to specify sprite size, etc.).

This could be done by writing extensive amounts of configuration code and passing in special options to all of the methods that might end up creating a new object… or you could just make a subclass of the base image class that overrides a virtual method where the aspect in question gets configured, and then use a try/finally block to temporarily replace the "default class" property as needed and then restore it. Doing it with class reference variables is far simpler, and is not something that could be done with generics instead.
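A minimal Python sketch of that pattern follows. It is not my actual image manager: Image, OpaqueImage, ImageManager and default_class are invented names, and loading from a data source is left out so that only the class-reference swap is shown.

```python
class Image:
    def __init__(self, name):
        self.name = name
        self.has_alpha = True
        self.configure()

    def configure(self):
        """Hook for subclasses; the default does nothing extra."""


class OpaqueImage(Image):
    def configure(self):
        # Special-purpose setup: this kind of image has no alpha channel.
        self.has_alpha = False


class ImageManager:
    default_class = Image   # class-level variable holding a class reference

    def __init__(self):
        self._images = {}

    def get(self, name):
        # Create missing images using whatever class is currently the default.
        if name not in self._images:
            self._images[name] = self.default_class(name)
        return self._images[name]


manager = ImageManager()
normal = manager.get("hero")

# Temporarily replace the default class, then restore it no matter what.
previous = ImageManager.default_class
ImageManager.default_class = OpaqueImage
try:
    opaque = manager.get("background")
finally:
    ImageManager.default_class = previous

later = manager.get("enemy")   # back to the plain Image class
```

None of the manager's methods needed an extra option; the only thing that changed was which class the variable pointed to.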

Best Answer

.NET (the CLR) was the third generation of Microsoft's Component Object Model (COM), and in the early days it was referred to as the "COM+ runtime". Microsoft Visual Basic and the COM/ActiveX controls market had much more influence on the specific CLR architectural compatibility choices than Borland Delphi did. (Admittedly, the fact that Delphi had adopted ActiveX controls certainly helped grow the ActiveX ecosystem, but COM/ActiveX existed before Delphi.)

COM's architecture was worked out in C (not C++) and focused on interfaces rather than classes. Furthermore, it supported object composition, meaning that a COM object could in fact be made up of several different objects, with the IUnknown interface linking them together. But IUnknown had no role in COM object creation, which was designed to be as language-independent as possible. Object creation was generally handled by IClassFactory, and reflection by ITypeLib and related interfaces such as ITypeInfo. This separation of concerns was independent of implementation language, and the features of each core COM interface were kept minimal and orthogonal.

So as a result of the popularity of COM and ActiveX controls, the .NET architecture was built to support the roles played by COM's IUnknown, IClassFactory, and ITypeLib. In COM, these interfaces were not necessarily on the same object, so lumping them together into a single class-reference concept didn't necessarily make sense.
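To make the contrast with Delphi's class references concrete, here is a loose Python sketch of that separation. It is not COM: ClassFactory, Greeter, GreeterFactory and the "Demo.Greeter" key are hypothetical, and the dictionary merely stands in for the way COM locates a class factory by identifier.

```python
from abc import ABC, abstractmethod


class ClassFactory(ABC):
    """Loosely modelled on the role of IClassFactory: it only creates objects."""

    @abstractmethod
    def create_instance(self):
        ...


class Greeter:
    """The product. It knows nothing about how it gets created."""

    def greet(self):
        return "hello"


class GreeterFactory(ClassFactory):
    """A separate object whose only job is to produce Greeter instances."""

    def create_instance(self):
        return Greeter()


# Hypothetical lookup table keyed by an identifier, standing in for
# locating a class factory by its registered ID.
factories = {"Demo.Greeter": GreeterFactory()}

greeter = factories["Demo.Greeter"].create_instance()
```

Creation lives on the factory object and behavior on the product, so there is no single "class" value that carries both, which is roughly the design the CLR inherited.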