Stack vs Heap – Instance Methods Allocation


Do methods (and their variables) that belong to an object instance go on the stack or the heap?

Example:

class Program
{
    static void Main()
    {
        Myclass Myobj = new Myclass();
        Myobj.Doit();
    }
}

class Myclass
{
    public void Doit()
    {
        int myint = 5;
    }
}

I use C# primarily, but I assume the answer is language agnostic.

Best Answer

The CLR standard does not require a stack or a heap, so let's get that out of the way first. But C# implemented on paper isn't very useful, so I describe here the implementations we can actually run code with in practice, such as Microsoft's C# or Mono's. Regardless, methods and local variables have a conceptual relationship with classes and object instances that you have to understand, because it isn't specific to C#; it is the same for computer languages in general.

Instance methods are stored in the same way that static methods are.

Methods are part of the code (in the CLR, bytecode): they are compiled instructions, low-level CLR opcodes that make up the assembly. So in the C# model they aren't part of a heap; the heap is for data (see the caveat at the bottom). In languages that aren't object oriented, there are plain functions, which are just code that receives arguments. In the CLR, a plain function is a static method (or class method). The only difference between a function and a method is syntactic sugar that makes the method appear to be owned by an object. The object is really just the first, implicit argument (this), and in most languages it is written before the method call, i.e. obj.Foo(). That first argument must still be loaded by an instruction; in CLR MSIL, ldarg.0 loads it in an instance method, whereas in a static method ldarg.0 is simply the first declared argument. So obj.Foo(arg) is equivalent to Foo(obj, arg).
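The equivalence between obj.Foo(arg) and Foo(obj, arg) can be sketched in C# itself. This is only an illustration: Counter, Add, AddTo and Demo are hypothetical names that do not come from the question. The last call uses an open instance delegate, which makes the hidden "this" argument explicit.

```csharp
using System;

class Counter
{
    private int total;

    // Instance method: 'this' is the hidden first argument (ldarg.0 in IL).
    public int Add(int amount)
    {
        total += amount;
        return total;
    }

    // The same logic as a plain function: the object is passed explicitly.
    public static int AddTo(Counter self, int amount)
    {
        self.total += amount;
        return self.total;
    }
}

static class Demo
{
    public static int[] Run()
    {
        var c = new Counter();
        int viaInstance = c.Add(2);           // c is passed implicitly as 'this'
        int viaStatic = Counter.AddTo(c, 3);  // c is passed explicitly

        // Open instance delegate: the instance method Add is exposed as a
        // function whose first parameter is the object it operates on.
        var open = (Func<Counter, int, int>)Delegate.CreateDelegate(
            typeof(Func<Counter, int, int>),
            typeof(Counter).GetMethod("Add"));
        int viaOpen = open(c, 4);

        return new[] { viaInstance, viaStatic, viaOpen };
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "2, 5, 9"
    }
}
```

All three calls update the same object, which is why the running total keeps growing regardless of how the object reaches the method.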

When you compile a class, the compiler emits a set of instructions for each method and packs them into a code segment.

The relationship of the method to the object is actually similar to the relationship of class to object. The method is part of the class (i.e. part of the type, not part of the object). The method is akin to static data, but the "method data" happens to be code. Like static fields, methods exist prior to, and independently of, any object instance, and are just part of the object's type. I might have 1,000,000 instances of a string, but there is a single copy of string::Concat(string) somewhere.
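A rough way to observe this from C# is through reflection. The sketch below uses a hypothetical Widget class (not from the question) and compares method handles. That is evidence of one shared method per type rather than a direct look at memory layout, so treat it as an illustration only.

```csharp
using System;
using System.Reflection;

class Widget
{
    public int Id;                              // per-instance data, stored in each object
    public int DoubleId() { return Id * 2; }    // code, described once by the type
}

static class SharedCodeDemo
{
    public static bool MethodExistsWithoutInstance()
    {
        // No Widget is created in this method, yet the method
        // can be looked up through the type alone.
        MethodInfo m = typeof(Widget).GetMethod("DoubleId");
        return m != null;
    }

    public static bool InstancesShareOneMethod()
    {
        var a = new Widget { Id = 1 };
        var b = new Widget { Id = 2 };
        MethodInfo ma = a.GetType().GetMethod("DoubleId");
        MethodInfo mb = b.GetType().GetMethod("DoubleId");

        // The field data differs per object, but both objects
        // resolve to the same method handle.
        return a.Id != b.Id && ma.MethodHandle.Equals(mb.MethodHandle);
    }

    static void Main()
    {
        Console.WriteLine(MethodExistsWithoutInstance()); // prints "True"
        Console.WriteLine(InstancesShareOneMethod());     // prints "True"
    }
}
```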

As for a method's local variables, they do not exist until the method is called. At the beginning of a CLR method, the call frame reserves space for all local variables in that method. They are known ahead of time, but are simply formal notation until the method runs; then they become real data addresses. A declaration in the method (.locals in MSIL) tells the CLR how much space to allocate. Local variable values are conceptually and practically on the stack, yet mapped to registers. The variables are "local" to the scope of the method, and go away when the method returns (though the objects they refer to may not). There are instructions for dealing with them in the CLR. Let's look at a rough sample of MSIL (IL assembly for the CLR) for your Doit() method.

.method void MyClass::Doit()
{
    .locals init([0] int32 myint) // declares the locals for the method
    ldc.i4.5                      // push the constant 5 onto the evaluation stack
    stloc 'myint'                 // pop it into myint, i.e. myint = 5
    ret
}

The JIT compiler aggressively maps locals to real CPU registers, so they live on a stack mainly in the abstract model the CLR uses when interpreting or verifying bytecode. In practice, locals live in registers and spill over to the hardware stack when the registers run out.
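The lifetime rules for locals can be sketched in C# without knowing where the JIT actually puts them. In this illustration (LocalsDemo, SumTo and Make are hypothetical names), each recursive call gets its own copy of its locals, and an object created inside a method outlives the local variable that referred to it.

```csharp
using System;
using System.Text;

static class LocalsDemo
{
    // Each call has its own 'depth' and 'below'. A recursive call does not
    // overwrite the caller's locals, because every call frame gets its own.
    public static int SumTo(int depth)
    {
        if (depth == 0) return 0;
        int below = SumTo(depth - 1);
        return depth + below;
    }

    // 'myint' and 'sb' are locals and go away when Make returns. The
    // StringBuilder object itself survives, because the caller receives
    // a reference to it.
    public static StringBuilder Make()
    {
        int myint = 5;
        StringBuilder sb = new StringBuilder();
        sb.Append(myint);
        return sb;
    }

    static void Main()
    {
        Console.WriteLine(SumTo(4)); // prints "10"
        Console.WriteLine(Make());   // prints "5"
    }
}
```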

Finally, for any computer language, there are 3 primary groups of syntax.

  1. Type declarations and definitions (formal ideas, compiler enforced, and metadata)
  2. Algorithmic (code, methods, statements, expressions)
  3. Data and variables (data)

Caveat: Runtime systems are commonly implemented in C or C++. A CLR assembly and its methods are loaded into the runtime heap of the host language used to write the CLR. But conceptually that is a different heap than the "heap" you are accessing within the CLR.