Ubuntu – Kubernetes OCI runtime exec failed – starting container process caused "exec: \"etcdctl\": executable file not found in $PATH": unknown

docker, etcd, kubernetes, Ubuntu

Background

I created a fresh Kubernetes cluster using kubeadm init --config /home/kube/kubeadmn-config.yaml --upload-certs and then joined the second control plane node by running the command below.

kubeadm join VIP:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <key> \
    --v=5

Question

Are etcdctl commands supposed to return a result, either when run directly or through the docker exec method shown below? I have the following packages installed: kubeadm, kubectl, kubelet, and docker.

Kubectl version: 1.20.1
OS: Ubuntu 18.04

Commands from the first node

Command

etcdctl cluster-health

Response

cluster may be unhealthy: failed to list members
Error:  client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:4001: connect: connection refused
; error #1: EOF

error #0: dial tcp 127.0.0.1:4001: connect: connection refused
error #1: EOF

Command

docker container ls | grep k8s_POD_etcd

Response

k8s_POD_etcd-<nodename>_kube-system_<docker container id>

Command

docker exec -it k8s_POD_etcd-<nodename>_kube-system_<docker container id> etcdctl --endpoints=https://<node ip>:2379 --key=/etc/kubernetes/pki/etcd/peer.key --cert=/etc/kubernetes/pki/etcd/peer.crt --cacert=/etc/kubernetes/pki/etcd/ca.crt member list

Response

OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"etcdctl\": executable file not found in $PATH": unknown

EDIT

I upgraded to the v3.2 etcdctl API.

Command

etcdctl endpoint status

Response

Failed to get the status of endpoint 127.0.0.1:2379 (context deadline exceeded)

Best Answer

The error mentioned by the OP occurs because the etcdctl executable does not exist in the container that was targeted.

Why? Because the OP used the wrong container. Look at the following command:

docker container ls | grep k8s_POD_etcd
be510c179ced   k8s.gcr.io/pause:3.2   "/pause"  2 days ago  Up 2 days   k8s_POD_etcd-minikube_kube-system_2315889f8b2b54f1b9d43feafe941d01_0

Notice the container is k8s.gcr.io/pause:3.2. It's not an etcd container.
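
A container's role can usually be read from its name alone. The sketch below assumes the kubelet's Docker naming scheme of k8s_<container>_<pod>_<namespace>_<pod uid>_<attempt>, where the pause (sandbox) container uses the literal container name POD. The function name describe_k8s_container is hypothetical and only for illustration.

```shell
# Split a kubelet-style Docker container name into its parts.
# Assumed layout: k8s_<container>_<pod>_<namespace>_<pod uid>_<attempt>
# The body runs in a subshell so the variables do not leak to the caller.
describe_k8s_container() (
  IFS=_ read -r prefix container pod namespace rest <<EOF
$1
EOF
  if [ "$prefix" != "k8s" ]; then
    echo "not a kubelet-managed container: $1"
    return 1
  fi
  if [ "$container" = "POD" ]; then
    # The pause container only holds the pod's namespaces; it has no etcdctl.
    echo "pause (sandbox) container for pod $pod in namespace $namespace"
  else
    echo "container $container of pod $pod in namespace $namespace"
  fi
)
```

Passing the name from the output above reports a pause container for pod etcd-minikube, which is why etcdctl is not found inside it.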

But why? What is this pause container? I won't answer that here because somebody has already answered it: what-are-the-pause-containers.

I will try to answer a better question: Where is the actual etcd container?

Let's look at the output of the same command with a slightly modified grep pattern; let's grep for etcd:

docker container ls | grep etcd
c989e7d1d25b   0369cf4303ff           "etcd --advertise-cl…"   2 days ago       Up 2 days k8s_etcd_etcd-minikube_kube-system_2315889f8b2b54f1b9d43feafe941d01_0
be510c179ced   k8s.gcr.io/pause:3.2   "/pause"                 2 days ago       Up 2 days k8s_POD_etcd-minikube_kube-system_2315889f8b2b54f1b9d43feafe941d01_0

Now we have two lines of output, one is the previously found pause container, and the second one is our etcd container with a name starting with k8s_etcd_etcd. Let's see if we can run docker exec on this container:

$ docker exec -it k8s_etcd_etcd-<nodename>_kube-system_<docker container id> etcdctl version
etcdctl version: 3.4.13
API version: 3.4

Yes, we can!
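
To avoid picking the pause container by accident, the lookup can be scripted. This is a minimal sketch, not a definitive tool: the helper names etcd_container_names and etcdctl_in_container are hypothetical, and it assumes the etcd container name starts with k8s_etcd_ as in the output above.

```shell
# Keep only real etcd containers (k8s_etcd_...) and drop pause
# containers (k8s_POD_...). Reads one container name per line on stdin.
etcd_container_names() {
  grep '^k8s_etcd_'
}

# Hypothetical helper: run etcdctl inside the first etcd container found.
# All arguments are passed through to etcdctl unchanged.
etcdctl_in_container() {
  name=$(docker container ls --format '{{.Names}}' | etcd_container_names | head -n 1)
  if [ -z "$name" ]; then
    echo "no etcd container found" >&2
    return 1
  fi
  docker exec "$name" etcdctl "$@"
}
```

With this in place, the original command from the question becomes etcdctl_in_container --endpoints=https://<node ip>:2379 --key=/etc/kubernetes/pki/etcd/peer.key --cert=/etc/kubernetes/pki/etcd/peer.crt --cacert=/etc/kubernetes/pki/etcd/ca.crt member list. This assumes the certificate directory is mounted into the etcd container at the same path, which is the usual kubeadm layout but worth verifying on your cluster. The "context deadline exceeded" error from a plain etcdctl endpoint status is likely the same kind of issue: a kubeadm-managed etcd normally serves HTTPS with client certificates, so a call without the TLS flags may simply time out.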


To summarize: it looks like you were looking at the wrong container from the very beginning.
