.NET 3.5 Windows Forms Application: x86 vs x64 load times on 64-bit Vista

Tags: 64-bit, .net, windows, windows-vista, winforms

We are developing a WinForms application and are in the process of optimizing its start-up time.

The app runs on 64-bit Vista machines. In our testing we found what seems like a counterintuitive result: all else being equal, the build that targets 32-bit loads in half the time of the build that targets 64-bit. Can anyone shed some light on why?

Thanks.

[Edit]
We deploy the app via ClickOnce which, from our research, starts apps in a unique sandbox. It therefore always cold-starts, so looking to improve performance here was fruitless.

Our main problem was the existence of 32-bit DLLs in the project. Once we targeted the project at x86 (even though it runs on x64), the load times were cut in half.
[/Edit]

Best Answer

.NET 3.5 SP1 gets its improved startup perf by no longer verifying the strong name of assemblies that come from trusted locations. A bit controversial in my book but somewhat defensible.

I checked whether the 64-bit version of the CLR also bypasses that time-consuming step: I signed a DLL, put it in the GAC, then patched a byte. There were no complaints when loading the assembly, so the 64-bit CLR skips the verification too. It is therefore not the SP1 startup perf improvement that explains the difference.

Other factors in the startup time are:

- Loading the CLR from disk (coldstart only)
- Groveling for the dependent assemblies
- JIT compiling the startup code

Coldstart could well be a factor: you probably don't have other processes running that have the 64-bit version of the CLR loaded. That is easy to eliminate by keeping a dummy 64-bit .NET app running while you do the test.

Groveling for assemblies could take longer for the same reason. It is unlikely that the 64-bit ngen-ed images of the .NET assemblies are in the file system cache. Again, this is easy to eliminate by giving the dummy app a dependency on the same assemblies.
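A dummy app of the kind described above could look roughly like the following. This is only a sketch under assumptions: the name `WarmupHost` is made up, and the two touched types stand in for whatever assemblies your real application depends on (the project needs references to System.Windows.Forms and System.Drawing). Build it once for x86 and once for x64, and keep the matching build running while you time the real application.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical warm-up host: keeps the CLR and a few framework
// assemblies loaded in a running process while the real app is timed.
static class WarmupHost
{
    static void Main()
    {
        // Touch one type per assembly the real app depends on, so the
        // loader has to locate and map those assemblies in this process.
        Type[] touched = { typeof(Form), typeof(System.Drawing.Bitmap) };

        foreach (Type t in touched)
        {
            Console.WriteLine("Loaded {0}", t.Assembly.FullName);
        }

        // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit one,
        // which confirms which CLR this host is actually keeping warm.
        Console.WriteLine("Running as a {0}-bit process", IntPtr.Size * 8);

        Console.WriteLine("Press Enter to exit.");
        Console.ReadLine();
    }
}
```

If the gap between the x86 and x64 builds shrinks noticeably while the matching host is running, coldstart and assembly probing are the likely cause rather than the JITter.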

The 64-bit JITter is a tougher nut to crack. A guess would be that MSFT didn't spend as much time making that one performant as it did on the 32-bit JITter, but nothing backs that up. It is difficult to measure too: you'd have to load an assembly with Assembly.Load, then time Activator.CreateInstance() where the class constructor calls as much code as possible.
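The measurement described above might be sketched as follows. Treat it as an illustration rather than a finished benchmark: `JitTiming` is a made-up name, and the assembly display name and type name are passed on the command line because the source does not say which assembly to test. The target type is assumed to have a public parameterless constructor that calls into a lot of code, and the assembly should not have an ngen-ed image, otherwise there is little left to JIT.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

// Hypothetical harness: times Assembly.Load and the first
// Activator.CreateInstance call for a type in that assembly.
static class JitTiming
{
    static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.Error.WriteLine("usage: JitTiming <assembly display name> <full type name>");
            return;
        }

        Stopwatch load = Stopwatch.StartNew();
        Assembly asm = Assembly.Load(args[0]);
        load.Stop();

        // Throws if the type is missing instead of returning null.
        Type type = asm.GetType(args[1], true);

        // The first call pays for JIT compiling the constructor and
        // everything it calls; later calls in this process would not.
        Stopwatch create = Stopwatch.StartNew();
        object instance = Activator.CreateInstance(type);
        create.Stop();

        Console.WriteLine("{0}-bit process", IntPtr.Size * 8);
        Console.WriteLine("Assembly.Load:  {0} ms", load.Elapsed.TotalMilliseconds);
        Console.WriteLine("CreateInstance: {0} ms", create.Elapsed.TotalMilliseconds);

        GC.KeepAlive(instance);
    }
}
```

Run it in a fresh process for every sample, once built for x86 and once for x64, since JIT-compiled code stays in the process and a second call would measure almost nothing. The CreateInstance figure also includes whatever work the constructor itself does, so it is an upper bound on JIT time rather than a clean measurement of it.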