MongoDB getting OOM killed

mongodb oom-killer

We are running a MongoDB replica set on three machines. All three machines have around 16GB of RAM but only 255MB of swap. Swappiness is left at its default value of 60. The machines are running CentOS 6.4. The databases are much larger than 16GB, but that's OK for us. The actual working set is much smaller.

The problem we are facing is that the primary eats up all available memory and then gets OOM-killed. I know that this is how MongoDB manages memory.

After the server gets OOM-killed, someone has to restart it manually.

Is there any way to prevent MongoDB from getting OOM-killed? Adjust the swappiness? Increase the swap space? I think those settings will only extend the grace period before mongod gets killed.

Best Answer

The OOM killer is not a way anyone manages memory; it is the Linux kernel's way of handling a fatal failure, in a last hope of avoiding a system lockup!

What you should do is:

Make sure you have enough swap. If you are sure, still add more.
Implement resource limits! At LEAST for applications you expect will use a lot of memory (and even more so for those you don't expect to, since those usually end up being the problematic ones). See the ulimit -v (or limit addressspace) command in your shell and put it before the application startup in its init script. You should also limit other things (like the number of processes with -u, etc.). That way, the application will get an ENOMEM error when there is not enough memory, instead of the kernel handing out memory that does not exist and afterwards going berserk killing everything around!
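Tell the kernel not to overcommit memory. You could make sure the heuristic mode is in effect (0 is the kernel default and refuses only obvious overcommits):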

echo "0" > /proc/sys/vm/overcommit_memory

or, even better, switch to strict accounting (depending on your amount of swap space):

echo "2" > /proc/sys/vm/overcommit_memory; echo "80" > /proc/sys/vm/overcommit_ratio

See Turning off overcommit for more info on that.

That would instruct the kernel to be more careful when giving applications memory it doesn't really have (the similarity to the global economic crisis is striking).
As a last-ditch resort, if everything on your system except MongoDB is expendable (but please fix the points above first!), you can lower the chances of it being killed (or even make sure it won't be killed, even if the alternative is a hung machine with nothing working) by tuning /proc/$pid/oom_score_adj (or the older /proc/$pid/oom_adj); the resulting score can be read from /proc/$pid/oom_score.

echo "-1000" > /proc/`pidof mongod`/oom_score_adj

See Taming the OOM killer for more info on that subject.
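The resource-limit advice above can be sketched as a small wrapper for an init script. This is only an illustration: the function name run_with_vmem_limit, the 4 GiB figure and the mongod paths in the comment are assumptions, not something taken from the answer.

```shell
# Hypothetical helper: run a command with a cap on its virtual address space.
# $1 is the limit in KiB (the unit ulimit -v uses); the rest is the command.
# The function body is a subshell, so the limit never leaks into the caller.
run_with_vmem_limit() (
    limit_kb="$1"
    shift
    # Give up if the limit cannot be applied (e.g. above the hard limit).
    ulimit -v "$limit_kb" || exit 125
    exec "$@"
)

# Example call from an init script (paths are assumptions):
# run_with_vmem_limit 4194304 /usr/bin/mongod --config /etc/mongod.conf
```

With the limit in place, an allocation beyond the cap fails with ENOMEM inside the process instead of being granted and later "repaid" by the OOM killer. Keep in mind that a mongod which memory-maps its data files has those mappings counted against the address-space limit, so a value smaller than the data set will probably stop it from starting; treat the number as something to tune, not a recommendation.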

Related Solutions

Python – EC2 servers are slower than local machine

There are a number of reasons why EC2 machines can be slow. Disks are not attached directly to the instance. Instead, EBS volumes are large network disks, and whatever you write to them is sent across the network. Latency is usually quite low but, of course, compared with something directly attached to your machine it will appear slow.

It is a virtual machine. No matter what you do, it has to compete with other machines for CPU cycles. Run top if you are using Linux and check the CPU steal percentage (the st value). A non-zero number indicates that there is competition for the CPU. In any case, virtual CPUs are not as fast as physical CPUs for comparable processors.
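If you would rather script the steal check than watch top, the same number can be derived from /proc/stat. The sketch below is an assumption-laden illustration (the function name steal_percent is made up here); it relies on the documented layout of the aggregate "cpu" line, where steal is the eighth counter.

```shell
# Print the steal share (integer percent) between two aggregate "cpu" lines
# taken from /proc/stat at different times.
# Counters on that line: user nice system idle iowait irq softirq steal ...
steal_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Sum user..steal (fields 2-9); guest time is already part of user.
            for (i = 2; i <= 9; i++) total += $i
            t[NR] = total
            s[NR] = $9
        }
        END {
            d = t[2] - t[1]
            if (d > 0) printf "%d\n", (s[2] - s[1]) * 100 / d
            else print 0
        }'
}

# Possible use on a live Linux system:
# a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
# steal_percent "$a" "$b"
```

A value that stays at a few percent or more over several samples suggests the hypervisor is regularly handing your CPU time to other guests.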

Another personal observation is that luck plays an important role on EC2 (yes!). At times you get older hardware which is just not as fast. At other times you get AMD Opteron processors, which are usually not as fast as the Intel-based ones. I am not suggesting that AMD processors are bad, but in this case the Intel ones seem to work faster. Maybe they are of a newer generation.

Having maintained Mongo on EC2, I totally understand your pain. I would suggest trying to keep as much data in memory as possible. In general, EC2 is not really designed for vertical scaling. It is more beneficial to have a lot of smaller instances dividing the work than to have one huge instance doing everything alone.

Linux – Troubleshooting oom-killer using atop: Is it fixed or not

Looking at your OOM-killer output, your system certainly does not have 500 MB of free RAM and empty swap:

Jun 15 10:21:26 mail kernel: [142707.434172] Node 0 DMA free:11692kB min:20kB low:24kB high:28kB active:0kB inactive:0kB present:10772kB pages_scanned:0 all_unreclaimable? yes
Jun 15 10:21:26 mail kernel: [142707.434177] Node 0 DMA32 free:10396kB min:6056kB low:7568kB high:9084kB active:152kB inactive:2812380kB present:3072160kB pages_scanned:64 all_unreclaimable? no
Jun 15 10:21:26 mail kernel: [142707.434182] Node 0 Normal free:1600kB min:2036kB low:2544kB high:3052kB active:475840kB inactive:462064kB present:1034240kB pages_scanned:148770 all_unreclaimable? no
[...]
Jun 15 10:21:26 mail kernel: [142707.434205] Free swap  = 0kB
Jun 15 10:21:26 mail kernel: [142707.434206] Total swap = 2097144kB

Note that the free memory in the "Normal" zone is below the "min" limit, meaning that userland processes cannot allocate memory from it anymore.

Your DMA and DMA32 zones do have some memory available, but the OOM killer is triggered because the request for memory was made against the "HIGHMEM" (or "Normal") zone (the lower nibble of gfp_mask is 2h).
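The free-versus-min comparison described above can be automated when there are many such reports to go through. The following is a rough sketch, assuming the log lines keep the "Node N ZONE free:XkB min:YkB" shape shown in the excerpt; the function name zones_below_min is invented for this example.

```shell
# Read OOM-killer log lines on stdin and print the name of every memory zone
# whose free memory is below its "min" watermark.
zones_below_min() {
    # Reduce each zone line to "ZONE free_kb min_kb", then compare numerically.
    sed -n 's/.*Node [0-9]* \([A-Za-z0-9]*\) free:\([0-9]*\)kB min:\([0-9]*\)kB.*/\1 \2 \3/p' |
        awk '$2 < $3 { print $1 }'
}

# Possible use (the log path is an assumption):
# grep 'Node 0' /var/log/messages | zones_below_min
```

Fed the three zone lines from the excerpt, it reports only the Normal zone, which matches the reading given above.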

It is quite possible that the memory usage spikes fast enough to fit into the interval between two queries of your monitoring system, so you would not see a spike; the system just becomes unusable.

Disabling overcommit by setting vm.overcommit_memory = 2 and/or tuning vm.overcommit_ratio will only help in the sense that you would not get OOM-killer invocations any more. The memory shortage will persist, and processes requesting memory in a "memory full" condition might simply terminate abnormally.
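To get a feeling for what strict accounting would allow on a given box, the commit limit can be estimated by hand: with vm.overcommit_memory = 2 it is swap plus overcommit_ratio percent of RAM (huge pages are ignored in this sketch). The function name commit_limit_kb is made up, and the sample figures are assumptions read off the log above: the zones' "present" values added together as an approximation of RAM, the reported total swap, and the default ratio of 50.

```shell
# Estimate CommitLimit in kB under vm.overcommit_memory=2.
# Arguments: RAM in kB, swap in kB, vm.overcommit_ratio in percent.
# The body runs in a subshell so the helper variables stay local.
commit_limit_kb() (
    ram_kb="$1"
    swap_kb="$2"
    ratio="$3"
    echo $(( swap_kb + ram_kb * ratio / 100 ))
)

# About 4 GB of RAM (10772 + 3072160 + 1034240 kB), 2 GB of swap, ratio 50:
commit_limit_kb 4117172 2097144 50   # prints 4155730
```

On a running system the kernel's own figures are in /proc/meminfo as CommitLimit and Committed_AS; once Committed_AS reaches the limit, further allocations fail, which is the "terminate abnormally" case mentioned above.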

To really get at the situation, find out what is consuming all your memory; Apache workers are likely candidates. Try enabling vm.oom_dump_tasks to get more information from the OOM killer about processes and memory usage at the time of the killing decision. Also take a look at this question, which your description resembles quite a bit.
