Docker – Kubernetes Pod OOMKilled Issue

docker, kubernetes, memory, oom

The scenario: we run some websites based on an nginx image in a Kubernetes cluster. Originally the cluster had nodes with 2 cores and 4 GB RAM each, and the pods were configured with cpu: 40m and memory: 100MiB. Later we upgraded the cluster to nodes with 4 cores and 8 GB RAM each, but then every pod kept getting OOMKilled. After we increased the memory on every pod to around 300MiB, everything seems to be working fine.

My question is: why does this happen, and how do I solve it? P.S. If we revert to nodes with 2 cores and 4 GB RAM each, the pods work just fine with the lower 100MiB setting.

Best Answer

First of all, a pod should not need more memory/CPU just because the node has more resources. Without your specs it is hard to point out what might be wrong config-wise, but I would like to explain the concept to make it clearer.

You mentioned your pod configuration but did not specify whether those values are limits or requests.

  • Requests are what the container is guaranteed to get. If a container requests a resource, Kubernetes will only schedule it on a node that can give it that resource. Requests do not cause OOM kills; if they cannot be satisfied, the pod simply does not get scheduled.

  • Limits, on the other hand, make sure a container never goes above a certain value. Exceeding a memory limit is what triggers an OOM kill (see the sketch right after this list).
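
Here is a minimal sketch of how the two settings behave for memory; the pod and container names are made up, and the numbers simply mirror the ones from the question:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-site              # hypothetical name
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:
        memory: "100Mi"         # scheduling guarantee: if no node has 100Mi free, the pod stays Pending
        cpu: "40m"
      limits:
        memory: "100Mi"         # hard ceiling: if the container tries to use more, it gets OOMKilled
        cpu: "500m"             # a CPU limit only throttles the container; it never triggers an OOM kill

You can check which case you are hitting with kubectl describe pod, which shows both the configured requests/limits and the last termination reason (OOMKilled).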

Requests and limits are on a per-container basis. While Pods usually contain a single container, it’s common to see Pods with multiple containers as well. Each container in the Pod gets its own individual limit and request, but because Pods are always scheduled as a group, you need to add the limits and requests for each container together to get an aggregate value for the Pod.

Below is an example of a pod with two containers, each specifying the same requests and limits:

apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: db
    image: mysql
    env:
    - name: MYSQL_ROOT_PASSWORD
      value: "password"
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  - name: wp
    image: wordpress
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"

If you want to calculate the requests and limits for the whole pod, you need to sum those values, giving a request of 0.5 CPU and 128 MiB of memory, and a limit of 1 CPU and 256 MiB of memory.
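
Spelled out, the tally for the example above is just addition (shown here as comments, not an API object):

# requests: cpu     250m + 250m   = 500m  (0.5 CPU)
#           memory   64Mi + 64Mi  = 128Mi
# limits:   cpu     500m + 500m   = 1000m (1 CPU)
#           memory  128Mi + 128Mi = 256Mi

The scheduler uses the summed requests when deciding whether the pod fits on a node; the per-container limits are what gets enforced at runtime on the node itself.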

If you want to find out more about that topic, check out the official Kubernetes documentation on managing resources for containers.

Please let me know if that helped.
