How to calculate ulimit -n (file descriptors) for a dedicated squid server

max-file-descriptors squid

I have a production squid server that was having issues serving content and reporting that it was out of file descriptors. I increased the limit from 1024 (the default) to 4096, which seemed to resolve the errors in the log. However, I was still seeing response code 0 and 0 bytes received for some calls that were not cached, which leads me to believe that during peak volume (a boot storm) my file descriptor count is still too low.

I have read some posts already, and the limit can apparently be set as high as 24k, 40k, or even 70k. Since this is a dedicated squid box, I am not worried about other processes or users competing for file descriptors system-wide, but I'd really like to know the best practice for a rough calculation of how many file descriptors I should configure for ulimit -n.

In my configuration, I have a maximum of 3000 client-side TCP connections, a maximum of 3000 server-side TCP connections, and a few log files that are configured by default in the squid config (cache.log, squid.log). Is it as simple as saying that I should set my ulimit -n to 3000 + 3000 + 2 + (some overhead amount)? For lack of documentation on the matter I'll probably set it to 24k just to never have to deal with it, but I would prefer a best-practice formula to follow, just as with apache2 you can calculate the memory needed for the number of requests you want to handle simultaneously.

Edit: I forgot to mention that I am not writing these cached files to disk; they stay in memory. The only site loaded through this proxy consists of a few hundred files (<5 MB total), which is why I omitted the disk read/write file descriptors.

Best Answer

In the worst-case scenario, each request to the squid server requires three file descriptors:

  1. One for the client-side connection.
  2. One for the server-side connection, if the object is not cached.
  3. One for the file on disk, either to read a hit or to store a miss.

Then there are overheads: log files, inter-process communication (e.g., helpers), and idle connections. So as a rough estimate, you need three file descriptors for each incoming TCP connection, plus an allowance for those overheads.
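As a minimal sketch of that arithmetic, the snippet below applies the estimate to the 3000 client connections from the question. The function name `estimate_fd_limit` and the overhead allowance of 256 are assumptions made for illustration, not values taken from squid. Pick an overhead that covers your own log files, helpers, and idle connections.

```python
def estimate_fd_limit(max_client_connections, fds_per_connection=3, overhead=256):
    """Rough worst-case file descriptor estimate for a proxy.

    fds_per_connection: 3 in the worst case (client socket, server socket,
        cache file on disk).
    overhead: hypothetical allowance for log files, helper IPC, and idle
        connections; 256 is an illustrative guess, not a squid default.
    """
    return max_client_connections * fds_per_connection + overhead


# Worst case for the 3000 client connections in the question.
print(estimate_fd_limit(3000))

# If the cache is memory-only, the disk descriptor may not be needed,
# which would leave two descriptors per connection (an assumption).
print(estimate_fd_limit(3000, fds_per_connection=2))
```

With these assumed numbers the worst case comes to 9256 descriptors, so rounding up to something like 16384 would leave comfortable headroom on a dedicated box.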