Unfortunately, there is no way to skip these steps entirely: Kubernetes has to perform a number of actions before it can restart the pods from a failed node. However, it is possible to improve the reaction time.
For example, you can reduce the value of node-monitor-grace-period (the default is 40 seconds).
This decreases the time between the actual failure of a node and the change of its status to NotReady.
You can find more details about these options in the kube-controller-manager reference documentation.
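For illustration, on a kubeadm-based cluster this flag can be set in the kube-controller-manager static pod manifest; the path below is the kubeadm default, and the 20-second value is only an example, not a recommendation:
# edit /etc/kubernetes/manifests/kube-controller-manager.yaml
# and add this flag to the kube-controller-manager command list:
    - --node-monitor-grace-period=20s
The kubelet watches this manifest and restarts kube-controller-manager automatically after the change.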
So far, I have found three problems:
docker version
In my first tries, I used docker.io from the default Ubuntu repositories (17.12.1-ce). In the tutorial at https://computingforgeeks.com/how-to-setup-3-node-kubernetes-cluster-on-ubuntu-18-04-with-weave-net-cni/, I discovered that they recommend something different:
apt-get --purge remove docker docker-engine docker.io
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
apt-get update
apt-get install docker-ce
This installs version 18.6.1, which also no longer triggers a warning in the kubeadm preflight check.
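To keep a later apt upgrade from moving Docker past a version that kubeadm accepts, the package can additionally be pinned; this is my own suggestion, not part of the tutorial:
apt-mark hold docker-ce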
cleanup
I used kubeadm reset and deleted some directories when resetting my VMs to an unconfigured state. After reading some bug reports, I decided to extend the list of directories to remove. This is what I do now:
kubeadm reset
rm -rf /var/lib/cni/ /var/lib/calico/ /var/lib/kubelet/ /var/lib/etcd/ /etc/kubernetes/ /etc/cni/
reboot
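Note that kubeadm reset does not flush iptables rules or IPVS tables; its own output points this out. If stale kube-proxy rules cause trouble after re-initialization, they can be cleared manually as well (adapt to your setup):
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X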
Calico setup
With the above changes, I was immediately able to init a fully working setup (all pods "Running" and curl working). I followed the "Variant with extra etcd" instructions.
All this worked until the first reboot; then I again got the
calico-kube-controllers-f4dcbf48b-qrqnc CreateContainerConfigError
Digging into this problem showed me:
$ kubectl -n kube-system describe pod/calico-kube-controllers-f4dcbf48b-dp6n9
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Failed 4m32s (x10 over 9m) kubelet, node1 Error: Couldn't find key etcd_endpoints in ConfigMap kube-system/calico-config
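To see which datastore variant the installed ConfigMap actually belongs to, it can be inspected directly; this is a diagnostic step I would add, not part of the original instructions:
kubectl -n kube-system get configmap calico-config -o yaml
The etcd-backed Calico manifests define an etcd_endpoints key in this ConfigMap, while the Kubernetes-datastore variant does not, which matches the error above.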
Then I realized that I had run two sets of installation instructions one after the other, although only one set was meant to be applied:
kubectl apply -f https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
curl https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml -O
cp -p calico.yaml calico.yaml_orig
sed -i 's/192.168.0.0/10.10.0.0/' calico.yaml
kubectl apply -f calico.yaml
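Since calico.yaml_orig was kept as a backup above, the sed change can be verified before applying, to make sure only the pod network CIDR was touched; this check is my own addition:
diff calico.yaml_orig calico.yaml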
Result
$ kubectl get pod,svc,nodes --all-namespaces -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default pod/www1 1/1 Running 2 71m 10.10.3.4 node1 <none>
default pod/www2 1/1 Running 2 71m 10.10.4.4 node2 <none>
kube-system pod/calico-node-45sjp 2/2 Running 4 74m 192.168.1.213 node1 <none>
kube-system pod/calico-node-bprml 2/2 Running 4 74m 192.168.1.211 master1 <none>
kube-system pod/calico-node-hqdsd 2/2 Running 4 74m 192.168.1.212 master2 <none>
kube-system pod/calico-node-p8fgq 2/2 Running 4 74m 192.168.1.214 node2 <none>
kube-system pod/coredns-576cbf47c7-f2l7l 1/1 Running 2 84m 10.10.2.7 master2 <none>
kube-system pod/coredns-576cbf47c7-frq5x 1/1 Running 2 84m 10.10.2.6 master2 <none>
kube-system pod/etcd-master1 1/1 Running 2 83m 192.168.1.211 master1 <none>
kube-system pod/kube-apiserver-master1 1/1 Running 2 83m 192.168.1.211 master1 <none>
kube-system pod/kube-controller-manager-master1 1/1 Running 2 83m 192.168.1.211 master1 <none>
kube-system pod/kube-proxy-9jmsk 1/1 Running 2 80m 192.168.1.213 node1 <none>
kube-system pod/kube-proxy-gtzvz 1/1 Running 2 80m 192.168.1.214 node2 <none>
kube-system pod/kube-proxy-str87 1/1 Running 2 84m 192.168.1.211 master1 <none>
kube-system pod/kube-proxy-tps6d 1/1 Running 2 80m 192.168.1.212 master2 <none>
kube-system pod/kube-scheduler-master1 1/1 Running 2 83m 192.168.1.211 master1 <none>
kube-system pod/kubernetes-dashboard-77fd78f978-9vdqz 1/1 Running 0 24m 10.10.3.5 node1 <none>
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 84m <none>
default service/www-np NodePort 10.107.205.119 <none> 8080:30333/TCP 71m service=testwww
kube-system service/calico-typha ClusterIP 10.99.187.161 <none> 5473/TCP 74m k8s-app=calico-typha
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 84m k8s-app=kube-dns
kube-system service/kubernetes-dashboard ClusterIP 10.96.168.213 <none> 443/TCP 24m k8s-app=kubernetes-dashboard
NAMESPACE NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
node/master1 Ready master 84m v1.12.1 192.168.1.211 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://18.6.1
node/master2 Ready <none> 80m v1.12.1 192.168.1.212 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://18.6.1
node/node1 Ready <none> 80m v1.12.1 192.168.1.213 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://18.6.1
node/node2 Ready <none> 80m v1.12.1 192.168.1.214 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://18.6.1
192.168.1.211 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
192.168.1.212 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
192.168.1.213 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
192.168.1.214 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Best Answer
You will need to stop the control-plane components on the master: kube-apiserver, kube-controller-manager, and kube-scheduler.
If you have federation, also stop federation-apiserver.
Run a backup (snapshot) of etcd, and stop etcd when done.
On each node, stop the kubelet.
etcd is as robust as Consul; what do you mean by "unstable"?! As for restoring: even though you have the etcd data, it is not valid immediately ... you should read up on backups in Kubernetes.
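For the snapshot step, a minimal sketch using etcdctl, assuming a kubeadm-style local etcd with the default certificate paths (the endpoint, target file, and paths are assumptions to adapt):
ETCDCTL_API=3 etcdctl snapshot save /var/backups/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key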