Hope you can help me with the following problem.
We are running a CrushFTP service on a CentOS release 6.6 (Final) system, but nearly every week the service crashes.
So I took a look at the logs and found these lines:
cat /var/log/messages
Jun 28 05:06:23 crushftp kernel: Out of memory: Kill process 1491 (java) score 883 or sacrifice child
Jun 28 05:06:23 crushftp kernel: Killed process 1491, UID 0, (java) total-vm:9620220kB, anon-rss:3245824kB, file-rss:128kB
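To see every OOM event rather than just the most recent one, it helps to search both the syslog file and the kernel ring buffer (on CentOS 6, rsyslog writes kernel messages to /var/log/messages; `dmesg` only covers the current boot):

```shell
# Search rotated syslog files and the ring buffer for past OOM kills
grep -i "out of memory" /var/log/messages*
dmesg | grep -iE "out of memory|killed process"
```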
CrushFTP is a Java application and the only service running on the machine. The log shows the kernel killing the process, but I don't understand why. So I searched a bit and found this setting:
cat /proc/sys/vm/overcommit_memory
0
If I understand it correctly, that value should be fine, and if the process needs more RAM it should be able to get it.
When I run top, the java process is the one with the highest RAM usage:
top - 11:13:58 up 1 day, 4 min, 1 user, load average: 0.93, 0.94, 0.91
Tasks: 97 total, 1 running, 96 sleeping, 0 stopped, 0 zombie
Cpu(s): 11.2%us, 19.7%sy, 0.0%ni, 68.6%id, 0.0%wa, 0.0%hi, 0.5%si, 0.0%st
Mem: 3924136k total, 2736996k used, 1187140k free, 149380k buffers
Swap: 4128764k total, 0k used, 4128764k free, 814480k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1486 root 20 0 3633m 1.5g 13m S 20.3 39.8 191:24.36 java
The machine has about 4 GB of RAM, and swap is the same size.
[root@atcrushftp ~]# cat /proc/meminfo
MemTotal: 3924136 kB
MemFree: 1159964 kB
Buffers: 149400 kB
Cached: 814476 kB
SwapCached: 0 kB
Active: 1956028 kB
Inactive: 619664 kB
Active(anon): 1611452 kB
Inactive(anon): 528 kB
Active(file): 344576 kB
Inactive(file): 619136 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 4128764 kB
SwapFree: 4128764 kB
Dirty: 36 kB
Writeback: 4 kB
AnonPages: 1597696 kB
Mapped: 34108 kB
Shmem: 164 kB
Slab: 136024 kB
SReclaimable: 74432 kB
SUnreclaim: 61592 kB
KernelStack: 1384 kB
PageTables: 5948 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 6090832 kB
Committed_AS: 746432 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 285216 kB
VmallocChunk: 34359441520 kB
HardwareCorrupted: 0 kB
AnonHugePages: 1501184 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 18432 kB
DirectMap2M: 4175872 kB
I asked support, but they say it's not CrushFTP's fault and that the system is simply running out of memory.
Now my question is: how can I find out which process is using up the last of the free memory?
Best Answer
It's been a while since I had to read an OOM-killer log, but as I recall, this means that java was using about 9 GB of VM when the OOM-killer shot it in the head. Given that you have 4 GB of core and 4 GB of swap, that seems like a reasonable thing to do. You then write that "if the process needs more RAM it should be able to get it", which I don't understand.
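For reference, the figures in the OOM line are in kB: total-vm is the process's entire virtual address space, and anon-rss is the anonymous memory actually resident. Converting the numbers from the log above shows the mismatch:

```shell
# total-vm and anon-rss from the OOM log line are in kB; convert to GB
echo "total-vm: $(( 9620220 / 1024 / 1024 )) GB of address space"   # ~9 GB
echo "anon-rss: $(( 3245824 / 1024 / 1024 )) GB actually resident"  # ~3 GB
```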
Firstly, setting that value to 0 doesn't turn off overcommitment. As Red Hat's documentation describes it, 0 means the kernel performs heuristic overcommit handling: it estimates the memory available and refuses only obviously unserviceable requests. Setting it to 2 does what you seem to want: the kernel denies any allocation that would push total committed address space past swap plus a configurable percentage of physical RAM (vm.overcommit_ratio, 50 by default).

But even turning off overcommit doesn't guarantee that a process can always get more RAM: only infinite VM guarantees that. As long as core+swap is finite, it can be used up - and if you have a process that's consumed all the free VM at the moment the kernel needs a bit more, then the OOM-killer will wake up, and, well, that looks like what happened.
My recommendations are:
Don't run java as root. Ideally, don't run it at all, but if you must, not as root; running it as root gives it a weighting in the OOM-killer's eyes which may result in something important getting killed instead.

Find the memory leak in whatever's using java.
If you really believe you don't have a memory leak, then you don't have enough core; pony up for a bigger server. Give it more swap, as well.
Monitor your java's VM footprint better; shoot it in the head if it gets all swollen.
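A crude sketch of that monitoring, using standard procps `ps` options (the 3 GB threshold is an arbitrary example, not anything CrushFTP-specific):

```shell
# List the biggest memory consumers, largest resident set (kB) first
ps -eo pid,user,rss,vsz,comm --sort=-rss | head -n 5

# Example watchdog: warn when any java process exceeds ~3 GB resident
LIMIT_KB=3145728   # arbitrary threshold; tune to your box
ps -C java -o pid=,rss= | while read pid rss; do
    [ "$rss" -gt "$LIMIT_KB" ] && echo "java pid $pid is at ${rss} kB - over limit"
done
```

You could run something like this from cron and restart (or kill) the service when it trips, rather than waiting for the kernel to do it for you.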