Kubernetes – How to Fix CoreDNS Pods Stuck in Pending State

kubeadm, kubernetes, openstack, weave

I'm trying to learn k8s, and since I happen to have access to an OpenStack cloud I figured I'd try to install k8s on it, following this wiki.
So far I have been able to initialize the cluster, install the Weave CNI, connect an external worker and install the OpenStack cloud controller manager. According to the wiki, I should now wait for all pods in the kube-system namespace to be running. I'm stuck with the coredns pods though: they won't move from the Pending state.
From the pod's describe output I can see that the problem is that the master node still has the taint below:
node-role.kubernetes.io/master:NoSchedule
When I check the status of the node, it seems fine:

ubuntu@master-node-01:~$ kubectl get nodes -o wide
NAME             STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
master-node-01   Ready    master   10h   v1.17.0   10.99.53.6    <none>        Ubuntu 18.04.5 LTS   4.15.0-143-generic   docker://20.10.2
worker-node-01   Ready    <none>   10h   v1.17.0   10.99.53.5    <none>        Ubuntu 18.04.5 LTS   4.15.0-143-generic   docker://20.10.2

All the pods (except for coredns ones) are running fine:

ubuntu@master-node-01:~$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE   IP           NODE             NOMINATED NODE   READINESS GATES
kube-system   coredns-6955765f44-g2jnm                   0/1     Pending   0          10h   <none>       <none>           <none>           <none>
kube-system   coredns-6955765f44-wj7xb                   0/1     Pending   0          10h   <none>       <none>           <none>           <none>
kube-system   etcd-master-node-01                        1/1     Running   0          11h   10.99.53.6   master-node-01   <none>           <none>
kube-system   kube-apiserver-master-node-01              1/1     Running   0          11h   10.99.53.6   master-node-01   <none>           <none>
kube-system   kube-controller-manager-master-node-01     1/1     Running   0          11h   10.99.53.6   master-node-01   <none>           <none>
kube-system   kube-proxy-8s8r9                           1/1     Running   0          10h   10.99.53.5   worker-node-01   <none>           <none>
kube-system   kube-proxy-vtgnz                           1/1     Running   0          10h   10.99.53.6   master-node-01   <none>           <none>
kube-system   kube-scheduler-master-node-01              1/1     Running   0          11h   10.99.53.6   master-node-01   <none>           <none>
kube-system   openstack-cloud-controller-manager-dtczj   1/1     Running   0          10h   10.99.53.6   master-node-01   <none>           <none>
kube-system   weave-net-2z5n7                            2/2     Running   2          10h   10.99.53.5   worker-node-01   <none>           <none>
kube-system   weave-net-tm9p4                            2/2     Running   1          10h   10.99.53.6   master-node-01   <none>           <none>

I can't find anything suspicious in the pods' logs.

The OpenStack cloud I'm using doesn't have Octavia installed (the wiki says it's needed for setting up the LB, but my problem doesn't seem to be related to that).

If anyone here can help me find a way to investigate (and eventually solve) this problem, it would be greatly appreciated.
Thanks.

Best Answer

It looks like a problem with taints. You can try to solve it in several ways:

  • remove the taint:
kubectl taint nodes $(hostname) node-role.kubernetes.io/master:NoSchedule-
  • edit the node configuration and comment out the taint part:
kubectl edit node <node_name>

The node is updated once you save the change and close the editor.

  • schedule on the master node without removing the taint, by adding a toleration to the pod template:
apiVersion: apps/v1
kind: Deployment
...
  spec:
...
    spec:
...
      tolerations:
        - key: "node-role.kubernetes.io/master"
          effect: "NoSchedule"
          operator: "Exists"
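  • remove the taint from the node that carries the master role label, instead of relying on the hostname:
kubectl taint nodes $(kubectl get nodes --selector=node-role.kubernetes.io/master | awk 'FNR==2{print $1}') node-role.kubernetes.io/master-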
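
Whichever option you pick, it can help to look at the node taints and the scheduler's messages before and after the change. The following is only a minimal sketch: it assumes kubectl is already configured for the cluster and that the CoreDNS pods carry kubeadm's default label k8s-app=kube-dns. The function names are made up for illustration.

```shell
#!/usr/bin/env bash

# Print one row per node with the keys and effects of its taints.
# A node without taints shows <none> in both columns.
list_node_taints() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,TAINT_KEYS:.spec.taints[*].key,EFFECTS:.spec.taints[*].effect'
}

# Describe the CoreDNS pods; the Events section at the end of each
# description is where the scheduler reports why a pod is still Pending.
# Assumes the default kubeadm label k8s-app=kube-dns.
coredns_events() {
  kubectl -n kube-system describe pods -l k8s-app=kube-dns
}
```

Calling list_node_taints shows which taints are still present on each node, and coredns_events shows the scheduling messages for the pending pods, so you can check that the taint named there is the one you changed.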