Windows – IIS7.5 App Pool recycles – .Net OutOfMemoryException

asp.net, iis, iis-7.5, windows

We have a strange situation with .NET OutOfMemoryExceptions in IIS on Windows 2008 R2 being thrown on random pages of the application.

We have about 1000 separate sites that are the same .Net application (different codebase folders and app pools per site).
64-bit Windows running .NET 2.0; the application is compiled with the 'AnyCPU' flag.

Since the exact same code works on the old server and never throws OutOfMemoryExceptions, we are holding off on spending major time profiling the application, examining dumps, and performing code optimizations that would help avoid large object heap fragmentation. In other words, we are hoping to get some hints on possible server configuration issues that may be the culprit, rather than digging into the code base and optimizing it.

Config 1 – Rackspace CloudSites (shared hosting, we can only FTP to it, no access to IIS settings):

1 IIS server, we don't have control over managing it but were told each app pool has a 250MB recycle limit. Of our 1000 sites, many sites (20-50ish) apparently share the same app pool.
We never get OutOfMemoryExceptions here, and the application has been running on this setup for years.

Config 2 – Rackspace Dedicated Server (full control):

Monster server with 128GB RAM, dedicated, and each site has its own app pool. All app pools have the same settings (350MB recycle limit).
Not sure if this is of consequence, but the page file size on this server is 4GB (no idea what Config 1 has – does this need to be increased/addressed?).

Both configs are load balanced across 2 or 3 web servers, but that really doesn't matter here per se, since we are seeing sites with NO TRAFFIC getting killed with OutOfMemoryExceptions.

Best Answer

This post escalated while I wrote it; here are the take-away bullets.

TL;DR

  1. Increase the size of your page file (shooting from the hip I'd say to at least 40GB, much more if you can afford the disk capacity and I/O, but read the article at the bottom)
  2. Increase the frequentHitThreshold and frequentHitTimePeriod values (review the Web Service Cache performance counters and adjust accordingly)
  3. Lower the maxResponseSize value to 85KB or below to avoid cache entries in the large object heap
  4. Lower the memory limit for application pool recycling – 350MB per pool across 1000 pools doesn't make much sense
  5. Consider grouping applications with the same or similar codebase in application pools
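Items 2–4 translate into configuration along these lines. This is a sketch only: the attribute names come from the IIS 7.5 applicationHost.config schema, but the specific values and the pool name are illustrative assumptions, not tested recommendations. Note that the recycling limits are expressed in kilobytes.

```xml
<!-- applicationHost.config - illustrative values, tune against your own counters -->
<system.webServer>
  <!-- 2. Require more hits within a longer window before IIS caches a response -->
  <serverRuntime frequentHitThreshold="10" frequentHitTimePeriod="00:00:30" />
  <!-- 3. 87040 bytes = 85KB, so cached responses stay off the large object heap -->
  <caching maxResponseSize="87040" />
</system.webServer>

<system.applicationHost>
  <applicationPools>
    <add name="ExamplePool"> <!-- hypothetical pool name -->
      <recycling>
        <!-- 4. privateMemory is in KB: 262144 KB = 256MB -->
        <periodicRestart privateMemory="262144" />
      </recycling>
    </add>
  </applicationPools>
</system.applicationHost>
```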

Original Answer

Not sure if this is of consequence but the page file size on this sever is 4GB (no idea what Config 1 has - does this need to be increased/addressed?)

This, here ^ right up here ^ look at it.

I'm willing to bet that this is exactly the reason why your applications throw an OutOfMemoryException in response to seemingly random and benign requests, but to understand why, let's get one thing clear:

OutOfMemory does not mean that your server is out of memory!

I know it sounds like a bad joke, but it really isn't. It's not the Operating System complaining about memory depletion - it's the process.
If that last statement made no sense to you, please read on.

Memory Management 101

When a process allocates memory from the OS, it comes in a sequence of fragments called pages, 4 kilobytes apiece, that the process can treat as its own (this is commonly referred to as a Virtual Address Space).

Since an object (such as a string, an XML document, an image, or whatever you need to keep in memory) can exceed the page size of 4KB, the process will need to allocate multiple successive pages from time to time.
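As a back-of-the-envelope sketch (assuming the 4KB page size described above), the number of successive pages an allocation spans is just a ceiling division:

```python
PAGE_SIZE = 4 * 1024  # 4KB pages, as described above

def pages_needed(object_size_bytes):
    """Number of successive 4KB pages an allocation of this size spans."""
    return -(-object_size_bytes // PAGE_SIZE)  # ceiling division

print(pages_needed(4096))        # a 4KB object fits in exactly one page -> 1
print(pages_needed(4097))        # one byte more needs a second contiguous page -> 2
print(pages_needed(256 * 1024))  # a 256KB cacheable file spans 64 pages -> 64
```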

Over time, however, the memory space becomes fragmented, even under the .NET CLR. The Garbage Collector will try to do what it can to help your application make better use of the address space by rearranging the pages in the working set during collections (practically the same as a disk defragmentation), but objects on the large object heap, for example, will be left untouched, since the LOH is not compacted.

How IIS 7.x plays a role

As recently explained in this answer, IIS will also try to store as many cacheable output objects (such as static files up to 256KB) as it can, in the same process that serves your application - in addition to the suggestion in that answer, you can also try to tune the caching frequency thresholds using the <serverRuntime> configuration element.

In any case, IIS 7.5 - in its default configuration - cares deeply about allocating enough memory for its worker processes, and even with "NO TRAFFIC", it's not uncommon to see a worker process claim the first 100MB when it spins up, even with a slightly smaller application codebase on disk.

What does this have to do with the pagefile?

It doesn't take graduate-level math to see that 100MB * 1000 processes is not that far from the 128GB of RAM the OS has to offer. Even though IIS tries to allocate as much memory for its worker processes as needed, it stops at some point to leave some room for the operating system, at around 85% of total memory installed, regardless of how many mega- or gigabytes of RAM that might be (this is not a fact I've seen documentation for, but drawn from first-hand experience with a large range of IIS installations with different hardware specs).
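To put rough numbers on that (keeping in mind that both the ~100MB baseline and the 85% ceiling are the observations above, not documented figures):

```python
ram_gb = 128
pools = 1000
baseline_mb_per_worker = 100  # the ~100MB a worker claims at spin-up

worker_demand_gb = pools * baseline_mb_per_worker / 1024
iis_ceiling_gb = ram_gb * 0.85  # observed point where IIS leaves room for the OS

print(round(worker_demand_gb, 1))  # ~97.7GB just for idle workers
print(round(iis_ceiling_gb, 1))    # ~108.8GB available before IIS backs off
```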

At this point, the operating system can help free memory by allocating pages from the pagefile instead - pages stored in a file on a physical disk. Since disk capacity is often plentiful, allocating large chunks of disk storage is not that big a deal, but if it needs to page memory for 1000 processes and is only allowed to do so on 4GB of space, it doesn't take long before the processes are unable to allocate longer sequences of non-fragmented memory, and poof: the process throws an OutOfMemoryException. It simply means that it was not able to find enough adjacent pages for something in its accessible virtual address space.

It doesn't even have to be large objects. They just have to be larger than the largest run of contiguous pages you have available at runtime. Theoretically, you could get an OutOfMemory exception for trying to append a single character to a string that's currently more than 2KB in size.
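A toy model makes the point: total free memory can be ample while no run of adjacent free pages is long enough. (This simplistic bitmap allocator is purely illustrative; it is not how Windows manages virtual address space.)

```python
def find_contiguous(page_map, pages_needed):
    """Return the start index of a free run of pages_needed pages, or None.
    page_map is a list of booleans: True = page in use, False = free."""
    run_start, run_len = 0, 0
    for i, in_use in enumerate(page_map):
        if in_use:
            run_start, run_len = i + 1, 0
        else:
            run_len += 1
            if run_len == pages_needed:
                return run_start
    return None

# A badly fragmented address space: every other page is in use.
fragmented = [i % 2 == 0 for i in range(1000)]
print(fragmented.count(False))         # 500 free pages = 2MB free in total
print(find_contiguous(fragmented, 2))  # yet even a 2-page (8KB) request fails: None
```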

What should the Pagefile size be set to then?

Microsoft's answer to this question has always been: "it depends", but at least 1 x RAM + 257MB (this is the amount of storage the system requires to be able to write a complete memory dump).

A rule of thumb seems to be around 1.5-2 x RAM but, again, it depends, and a number of articles have been published on how to determine the right minimum and maximum page file size on a given system. I've included the most relevant one at the bottom.
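For the 128GB server in Config 2, those guidelines work out roughly as follows (illustrative arithmetic only; read the linked article before committing to a number):

```python
ram_mb = 128 * 1024

crash_dump_min_mb = ram_mb + 257  # 1 x RAM + 257MB, to allow a complete memory dump
rule_of_thumb_mb = (int(ram_mb * 1.5), ram_mb * 2)  # the 1.5-2 x RAM heuristic

print(crash_dump_min_mb)  # 131329 MB, i.e. just over 128GB
print(rule_of_thumb_mb[0] // 1024, "to", rule_of_thumb_mb[1] // 1024, "GB")  # 192 to 256 GB
```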

Be sure to monitor the disk containing the pagefile as well; if Disk Queue Length counters start spiking, you might want to move it to a dedicated disk, or spread it out over multiple disks.

How to determine the appropriate page file size for 64-bit versions of Windows