Linux – Why (or how) does the number of open file descriptors in use by root exceed ulimit -n

glassfish, linux, max-file-descriptors, root

Our server recently ran out of file descriptors, and I have some questions about that. ulimit -n is supposed to give me the maximum number of open file descriptors. That number is 1024. I checked the number of open file descriptors by running lsof -u root | wc -l and got 2500 fds. That is a lot more than 1024, so I guessed that meant the number 1024 is per process, not per user, as I had thought. Well, I ran lsof -p$PidOfGlassfish | wc -l and got 1300. This is the part I don't get. If ulimit -n is not the maximum number of file descriptors per user or per process, then what is it good for? Does it not apply to the root user? And if so, how could I then get the error messages about running out of file descriptors?
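One way to see which limit actually applies to a given process is to read it from /proc (a sketch; /proc/&lt;pid&gt;/limits exists on kernels 2.6.24 and later, and $PidOfGlassfish is the PID variable from the question):

```shell
# Show the shell's own soft limit, then the limit in effect for a
# specific process -- the two can differ if the process was started
# under different limits.
ulimit -n
grep 'Max open files' /proc/$PidOfGlassfish/limits
```

If the two numbers disagree, the per-process value from /proc is the one the kernel enforces for that process.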

EDIT: The only way I can make sense of ulimit -n is if it applies to the number of open files (as stated in the bash manual) rather than the number of file handles (different processes can open the same file). If this is the case, then simply listing the number of open files (grepping on '/', thus excluding memory-mapped files) is not sufficient:

lsof -u root |grep /|sort  -k9  |wc -l #prints '1738'

To actually see the number of open files, I would need to filter on the name column and print only the unique entries. Thus the following is probably more correct:

lsof -u root |grep /|sort  -k9 -u |wc -l #prints '604'

The command above expects output from lsof in the following format:

java      32008 root  mem       REG                8,2 11942368      72721 /usr/lib64/locale/locale-archive
vmtoolsd   4764 root  mem       REG                8,2    18624     106432 /usr/lib64/open-vm-tools/plugins/vmsvc/libguestInfo.so
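The same unique-name count can be written without depending on a fixed column position, by taking the last field instead (a sketch; like the sort -k9 version, it breaks on file names containing spaces):

```shell
# Count distinct open file paths for root's processes: keep only the
# last field (NAME), restrict it to absolute paths, de-duplicate, count.
lsof -u root | awk '$NF ~ /^\// { print $NF }' | sort -u | wc -l
```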

This at least gives me a number less than 1024 (the number reported by ulimit -n), so this seems like a step in the right direction. "Unfortunately" I am not currently experiencing any problems with running out of file descriptors, so I will have a hard time validating this.

Best Answer

I tested this on Linux version 2.6.18-164.el5 (Red Hat 4.1.2-46). I could see that the ulimit is applied per process.

The parameter is set at user level, but applied for each process.

E.g.: 1024 was the limit. Multiple processes were started, and the files open by each one were counted using

ls -l /proc/$pid/fd/ | wc -l

There were no errors when the sum of the files opened by multiple processes crossed 1024. I also verified the unique file count by combining the results for the different processes and counting the unique files. The errors started appearing only when the count for a single process crossed 1024 (java.net.SocketException: Too many open files in the process logs).
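The per-process behaviour described above is easy to reproduce in bash (a sketch: it lowers the soft limit in a throwaway subshell, then opens descriptors on /dev/null until the kernel refuses; the {fd} redirection syntax requires bash):

```shell
# Lower the soft limit for this subshell only, then keep allocating
# new file descriptors until open() fails with EMFILE.
(
  ulimit -Sn 16        # affects only this subshell and its children
  n=0
  while exec {fd}</dev/null 2>/dev/null; do
    n=$((n + 1))
  done
  echo "opened $n extra descriptors before hitting the limit"
)
```

The parent shell and any other processes on the machine are unaffected, which illustrates why the sum across processes can freely exceed ulimit -n while a single process cannot.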