We have an API running which receives, once a day, multiple batches of large data that are inserted into a MongoDB. We use the cvallance/mongo-k8s-sidecar for the replica set configuration.
This works perfectly on a local MongoDB database. There is also no production traffic on the database which could cause race conditions or the like.
Now we deployed it to Google Container Engine. There the import works in general too, but from time to time we get timeout exceptions like this:
Cannot run replSetReconfig because the node is currently updating its configuration
or
MongoDB.Driver.MongoCommandException: Command insert failed: BSONObj size: 16793637 (0x1004025) is invalid. Size must be between 0 and 16793600(16MB) First element: insert: "LandingPageConnectionSet_Stage".
or
Error in workloop { MongoError: connection 0 to 127.0.0.1:27017 timed out
    at Function.MongoError.create (/opt/cvallance/mongo-k8s-sidecar/node_modules/mongodb-core/lib/error.js:29:11)
    at Socket.<anonymous> (/opt/cvallance/mongo-k8s-sidecar/node_modules/mongodb-core/lib/connection/connection.js:198:20)
    at Object.onceWrapper (events.js:254:19)
    at Socket.emit (events.js:159:13)
    at Socket._onTimeout (net.js:411:8)
    at ontimeout (timers.js:478:11)
    at tryOnTimeout (timers.js:302:5)
    at Timer.listOnTimeout (timers.js:262:5)
I can see that the CPU does not seem to be at its limits.
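One of the errors above is independent of Kubernetes: the `BSONObj size ... is invalid` message means a single insert command message exceeded MongoDB's 16 MB BSON limit, so the batch itself was too large. A minimal sketch of splitting a batch by estimated size before inserting (hedged: `split_batch` is a hypothetical helper, and `len(json.dumps(...))` is only a rough proxy for BSON size; with a real driver you would measure with its BSON encoder instead):

```python
import json

# 16 MB cap on a single BSON object; the insert error above shows the
# command message exceeded it. (Assumed constant here; real code should
# read maxBsonObjectSize from the server's handshake response.)
MAX_BATCH_BYTES = 16 * 1024 * 1024

def split_batch(docs, max_bytes=MAX_BATCH_BYTES):
    """Greedily split docs into chunks whose summed serialized size stays
    under max_bytes. A single document larger than max_bytes still gets
    its own (oversize) chunk and would need to be restructured."""
    chunks, current, current_size = [], [], 0
    for doc in docs:
        size = len(json.dumps(doc, default=str))
        if current and current_size + size > max_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(doc)
        current_size += size
    if current:
        chunks.append(current)
    return chunks

# ~50 MB of documents ends up in several chunks, each under the cap,
# so each chunk can be passed to a separate insert call.
docs = [{"_id": i, "payload": "x" * 5000} for i in range(10_000)]
chunks = split_batch(docs)
```

Each resulting chunk can then be sent as its own insert, keeping every command message under the server limit.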
Kubernetes configuration for mongodb
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
---
apiVersion: v1
kind: Service
metadata:
  name: mongo
  labels:
    name: mongo
spec:
  ports:
    - port: 27017
      targetPort: 27017
  clusterIP: None
  selector:
    role: mongo
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: "mongo"
  replicas: 3
  template:
    metadata:
      labels:
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: mongo
          image: mongo:3.6
          command:
            - mongod
            - "--replSet"
            - rs0
            - "--bind_ip"
            - 0.0.0.0
            - "--smallfiles"
            - "--noprealloc"
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: mongo-persistent-storage
              mountPath: /data/db
        - name: mongo-sidecar
          image: cvallance/mongo-k8s-sidecar
          env:
            - name: MONGO_SIDECAR_POD_LABELS
              value: "role=mongo,environment=test"
  volumeClaimTemplates:
    - metadata:
        name: mongo-persistent-storage
        annotations:
          volume.beta.kubernetes.io/storage-class: "fast"
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 32Gi
We also slightly changed the config by limiting the WiredTiger cache size and removing the --smallfiles option, so that part of the config looked like this:
- mongod
- "--replSet"
- rs0
- "--bind_ip"
- 0.0.0.0
- "--noprealloc"
- "--wiredTigerCacheSizeGB"
- "1.5"
Best Answer
In the Kubernetes dashboard, the status of the pods showed the following hints:
You could have retrieved the very same information through
kubectl describe pod [podname]
Notice that quoting the documentation: "If the kubelet is unable to reclaim sufficient resources on the node, kubelet begins evicting Pods."
Therefore I believed that the error was not with MongoDB itself, since it was working on premises without any issue. To double-check, we went through the kernel logs shown by the console serial output and we found:
We also noticed that there was no memory request field in the YAML file of the deployment. This is an issue since, even if there are three nodes with no workload, all the pods could be started on the very same node because they theoretically fit.
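One way to keep the replicas from landing on the same node is pod anti-affinity. A sketch of the fragment that could be added to the StatefulSet's pod spec (hedged: the labels match the manifest above, but the values are illustrative and untested against this cluster):

```yaml
# Inside spec.template.spec of the StatefulSet: require that no two pods
# with role=mongo are scheduled onto the same node (hostname).
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            role: mongo
        topologyKey: kubernetes.io/hostname
```

With this in place, a single node running out of memory can no longer take down the whole replica set at once.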
In order to mitigate this behaviour there are some possible solutions:
Scale the cluster vertically and introduce memory request values
Instruct the mongodb process to consume an amount of memory smaller than the requested one
Introducing a memory limit is essential if you have more containers running on the same node and you want to avoid them being killed by the kubelet. Consider that with a limit in place, a container will sometimes be killed even if there is still memory available on the node.
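Putting the two points together, the mongo container could be given a memory request and limit, with the WiredTiger cache capped below the request so mongod stays within its budget. A sketch (the 2Gi/3Gi values are illustrative assumptions, not tuned figures):

```yaml
# Inside spec.template.spec.containers of the StatefulSet.
- name: mongo
  image: mongo:3.6
  command:
    - mongod
    - "--replSet"
    - rs0
    - "--bind_ip"
    - 0.0.0.0
    - "--noprealloc"
    - "--wiredTigerCacheSizeGB"
    - "1.5"          # kept below the memory request
  resources:
    requests:
      memory: "2Gi"  # scheduler now spreads pods by real memory demand
    limits:
      memory: "3Gi"  # container is killed before it destabilizes the node
```

With requests set, the scheduler can no longer pack all three replicas onto one node just because they "theoretically fit".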