C# – Does using “new” on a struct allocate it on the heap or stack

cmemory-managementnet

When you create an instance of a class with the new operator, memory gets allocated on the heap. When you create an instance of a struct with the new operator where does the memory get allocated, on the heap or on the stack ?

Best Answer

Okay, let's see if I can make this any clearer.

Firstly, Ash is right: the question is not about where value type variables are allocated. That's a different question - and one to which the answer isn't just "on the stack". It's more complicated than that (and made even more complicated by C# 2). I have an article on the topic and will expand on it if requested, but let's deal with just the new operator.

Secondly, all of this really depends on what level you're talking about. I'm looking at what the compiler does with the source code, in terms of the IL it creates. It's more than possible that the JIT compiler will do clever things in terms of optimising away quite a lot of "logical" allocation.

Thirdly, I'm ignoring generics, mostly because I don't actually know the answer, and partly because it would complicate things too much.

Finally, all of this is just with the current implementation. The C# spec doesn't specify much of this - it's effectively an implementation detail. There are those who believe that managed code developers really shouldn't care. I'm not sure I'd go that far, but it's worth imagining a world where in fact all local variables live on the heap - which would still conform with the spec.

There are two different situations with the new operator on value types: you can either call a parameterless constructor (e.g. new Guid()) or a parameterful constructor (e.g. new Guid(someString)). These generate significantly different IL. To understand why, you need to compare the C# and CLI specs: according to C#, all value types have a parameterless constructor. According to the CLI spec, no value types have parameterless constructors. (Fetch the constructors of a value type with reflection some time - you won't find a parameterless one.)

It makes sense for C# to treat the "initialize a value with zeroes" as a constructor, because it keeps the language consistent - you can think of new(...) as always calling a constructor. It makes sense for the CLI to think of it differently, as there's no real code to call - and certainly no type-specific code.

It also makes a difference what you're going to do with the value after you've initialized it. The IL used for

Guid localVariable = new Guid(someString);

is different to the IL used for:

myInstanceOrStaticVariable = new Guid(someString);

In addition, if the value is used as an intermediate value, e.g. an argument to a method call, things are slightly different again. To show all these differences, here's a short test program. It doesn't show the difference between static variables and instance variables: the IL would differ between stfld and stsfld, but that's all.

using System;

public class Test
{
    static Guid field;

    static void Main() {}
    static void MethodTakingGuid(Guid guid) {}


    static void ParameterisedCtorAssignToField()
    {
        field = new Guid("");
    }

    static void ParameterisedCtorAssignToLocal()
    {
        Guid local = new Guid("");
        // Force the value to be used
        local.ToString();
    }

    static void ParameterisedCtorCallMethod()
    {
        MethodTakingGuid(new Guid(""));
    }

    static void ParameterlessCtorAssignToField()
    {
        field = new Guid();
    }

    static void ParameterlessCtorAssignToLocal()
    {
        Guid local = new Guid();
        // Force the value to be used
        local.ToString();
    }

    static void ParameterlessCtorCallMethod()
    {
        MethodTakingGuid(new Guid());
    }
}

Here's the IL for the class, excluding irrelevant bits (such as nops):

.class public auto ansi beforefieldinit Test extends [mscorlib]System.Object    
{
    // Removed Test's constructor, Main, and MethodTakingGuid.

    .method private hidebysig static void ParameterisedCtorAssignToField() cil managed
    {
        .maxstack 8
        L_0001: ldstr ""
        L_0006: newobj instance void [mscorlib]System.Guid::.ctor(string)
        L_000b: stsfld valuetype [mscorlib]System.Guid Test::field
        L_0010: ret     
    }

    .method private hidebysig static void ParameterisedCtorAssignToLocal() cil managed
    {
        .maxstack 2
        .locals init ([0] valuetype [mscorlib]System.Guid guid)    
        L_0001: ldloca.s guid    
        L_0003: ldstr ""    
        L_0008: call instance void [mscorlib]System.Guid::.ctor(string)    
        // Removed ToString() call
        L_001c: ret
    }

    .method private hidebysig static void ParameterisedCtorCallMethod() cil  managed    
    {   
        .maxstack 8
        L_0001: ldstr ""
        L_0006: newobj instance void [mscorlib]System.Guid::.ctor(string)
        L_000b: call void Test::MethodTakingGuid(valuetype [mscorlib]System.Guid)
        L_0011: ret     
    }

    .method private hidebysig static void ParameterlessCtorAssignToField() cil managed
    {
        .maxstack 8
        L_0001: ldsflda valuetype [mscorlib]System.Guid Test::field
        L_0006: initobj [mscorlib]System.Guid
        L_000c: ret 
    }

    .method private hidebysig static void ParameterlessCtorAssignToLocal() cil managed
    {
        .maxstack 1
        .locals init ([0] valuetype [mscorlib]System.Guid guid)
        L_0001: ldloca.s guid
        L_0003: initobj [mscorlib]System.Guid
        // Removed ToString() call
        L_0017: ret 
    }

    .method private hidebysig static void ParameterlessCtorCallMethod() cil managed
    {
        .maxstack 1
        .locals init ([0] valuetype [mscorlib]System.Guid guid)    
        L_0001: ldloca.s guid
        L_0003: initobj [mscorlib]System.Guid
        L_0009: ldloc.0 
        L_000a: call void Test::MethodTakingGuid(valuetype [mscorlib]System.Guid)
        L_0010: ret 
    }

    .field private static valuetype [mscorlib]System.Guid field
}

As you can see, there are lots of different instructions used for calling the constructor:

newobj: Allocates the value on the stack, calls a parameterised constructor. Used for intermediate values, e.g. for assignment to a field or use as a method argument.
call instance: Uses an already-allocated storage location (whether on the stack or not). This is used in the code above for assigning to a local variable. If the same local variable is assigned a value several times using several new calls, it just initializes the data over the top of the old value - it doesn't allocate more stack space each time.
initobj: Uses an already-allocated storage location and just wipes the data. This is used for all our parameterless constructor calls, including those which assign to a local variable. For the method call, an intermediate local variable is effectively introduced, and its value wiped by initobj.

I hope this shows how complicated the topic is, while shining a bit of light on it at the same time. In some conceptual senses, every call to new allocates space on the stack - but as we've seen, that isn't what really happens even at the IL level. I'd like to highlight one particular case. Take this method:

void HowManyStackAllocations()
{
    Guid guid = new Guid();
    // [...] Use guid
    guid = new Guid(someBytes);
    // [...] Use guid
    guid = new Guid(someString);
    // [...] Use guid
}

That "logically" has 4 stack allocations - one for the variable, and one for each of the three new calls - but in fact (for that specific code) the stack is only allocated once, and then the same storage location is reused.

EDIT: Just to be clear, this is only true in some cases... in particular, the value of guid won't be visible if the Guid constructor throws an exception, which is why the C# compiler is able to reuse the same stack slot. See Eric Lippert's blog post on value type construction for more details and a case where it doesn't apply.

I've learned a lot in writing this answer - please ask for clarification if any of it is unclear!

Related Solutions

C# – How to create a new object instance from a Type

The Activator class within the root System namespace is pretty powerful.

There are a lot of overloads for passing parameters to the constructor and such. Check out the documentation at:

http://msdn.microsoft.com/en-us/library/system.activator.createinstance.aspx

or (new path)

https://docs.microsoft.com/en-us/dotnet/api/system.activator.createinstance

Here are some simple examples:

ObjectType instance = (ObjectType)Activator.CreateInstance(objectType);

ObjectType instance = (ObjectType)Activator.CreateInstance("MyAssembly","MyNamespace.ObjectType");

What and where are the stack and heap

The stack is the memory set aside as scratch space for a thread of execution. When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called. The stack is always reserved in a LIFO (last in first out) order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.

The heap is memory set aside for dynamic allocation. Unlike the stack, there's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or freed at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.

Each thread gets a stack, while there's typically only one heap for the application (although it isn't uncommon to have multiple heaps for different types of allocation).

To answer your questions directly:

To what extent are they controlled by the OS or language runtime?

The OS allocates the stack for each system-level thread when the thread is created. Typically the OS is called by the language runtime to allocate the heap for the application.

What is their scope?

The stack is attached to a thread, so when the thread exits the stack is reclaimed. The heap is typically allocated at application startup by the runtime, and is reclaimed when the application (technically process) exits.

What determines the size of each of them?

The size of the stack is set when a thread is created. The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system).

What makes one faster?

The stack is faster because the access pattern makes it trivial to allocate and deallocate memory from it (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or deallocation. Also, each byte in the stack tends to be reused very frequently which means it tends to be mapped to the processor's cache, making it very fast. Another performance hit for the heap is that the heap, being mostly a global resource, typically has to be multi-threading safe, i.e. each allocation and deallocation needs to be - typically - synchronized with "all" other heap accesses in the program.

A clear demonstration:
_{Image source: vikashazrati.wordpress.com}

Best Answer

Related Solutions

C# – How to create a new object instance from a Type

What and where are the stack and heap

Related Topic