(Reposted from original post at: https://stackoverflow.com/questions/73012913/kubernetes-pull-from-image-private-network-fails-to-respect-etc-hosts-of-serv as this is a more appropriate place to ask the question)
I am running a small 3 node test kubernetes cluster (using kubeadm) running on Ubuntu Server 22.04, with Flannel as the network fabric. I also have a separate gitlab private server, with container registry set up and working.
The problem I am running into: I have a simple test deployment, and when I apply the deployment YAML, it fails to pull the image from the GitLab private server.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: platform-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: platform-service
  template:
    metadata:
      labels:
        app: platform-service
    spec:
      containers:
        - name: platform-service
          image: registry.example.com/demo/platform-service:latest
Ubuntu Server: /etc/hosts (the relevant line)
192.168.1.30 registry.example.com
The Error
Failed to pull image "registry.example.com/demo/platform-service:latest":
rpc error: code = Unknown desc = failed to pull and unpack image
"registry.example.com/demo/platform-service:latest": failed to resolve reference
"registry.example.com/demo/platform-service:latest": failed to do request: Head
"https://registry.example.com/v2/demo/platform-service/manifests/latest": dial tcp
xxx.xxx.xxx.xxx:443: i/o timeout
The 'xxx.xxx.xxx.xxx' is the public IP that the domain resolves to in external DNS. All of my internal machines are configured to use the internal address instead, and 'registry.example.com' stands in for my actual domain.
Also to note:
docker pull registry.example.com/demo/platform-service:latest
works perfectly fine from the command line of the server; it just does not work from the Kubernetes deployment YAML.
The problem
While the network and the hosts file on the server are configured correctly, the image pull fails because, when I apply the deployment, it does not use the IP configured in /etc/hosts but a public IP that points to a different server. The timeout happens because that public-facing server is not set up the same way.
When I run kubectl apply -f platform-service.yaml, why does it not respect the hosts file of the server, and is there a way to configure hosts inside Kubernetes?
(If this problem is not clear, I apologize, I am quite new, and still learning terminology, maybe why google is not helping me with this problem.)
The closest S/O I could find is:
(SO Answer #1): hostAliases (this applies to the pod itself, not to pulling the image). My cluster was also installed through apt/package manager rather than snap, and the rest of that answer suggests changing the distribution; I would rather keep my current setup than change it.
— Update(s):
- I have narrowed down the problem (I believe) to needing settings in containerd, but have not yet found how to set the hosts there to match the server's /etc/hosts file.
- I created a second Kubernetes cluster, using k3s instead of kubeadm (instructions found at https://computingforgeeks.com/install-kubernetes-on-ubuntu-using-k3s/), and am encountering the same problem.
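For the containerd route mentioned above, one option (a sketch I have not verified on this cluster) is containerd's per-registry host configuration: point config_path at a certs.d directory and give the registry a hosts.toml that targets the internal endpoint directly. The paths and IP below match this setup; yours may differ.

```toml
# /etc/containerd/config.toml (fragment) — enable per-registry host config
[plugins."io.containerd.grpc.v1.cri".registry]
  config_path = "/etc/containerd/certs.d"
```

```toml
# /etc/containerd/certs.d/registry.example.com/hosts.toml
server = "https://registry.example.com"

[host."https://192.168.1.30"]
  capabilities = ["pull", "resolve"]
  # skip_verify = true  # only if the internal endpoint's TLS cert does not match
```

After changing this, restart containerd (systemctl restart containerd) on each node.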
Update
Attempts to add the hosts to CoreDNS are not working either:
(https://stackoverflow.com/questions/65283827/how-to-change-host-name-resolve-like-host-file-in-coredns)
kubectl -n kube-system edit configmap/coredns
...
.:53 {
    errors
    health {
        lameduck 5s
    }
    ready
    hosts custom.hosts registry.example.com {
        192.168.1.30 registry.example.com
        fallthrough
    }
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
        max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}
...
I deleted the CoreDNS pods (so they are recreated), and the image pull on the deployment still fails with the external IP address instead of the internal one. (In hindsight this makes sense: CoreDNS serves DNS to pods, while image pulls are performed by the container runtime on the node, which uses the node's own resolver.)
Best Answer
After going through many different solutions and a lot of research and testing, the answer was actually very simple.
Solution in my case
The /etc/hosts file MUST contain the entry for the registry (and possibly an entry for the GitLab instance as well) on EVERY node of the cluster, including the master node.
Once I added it on each of the two worker nodes, the deployment attempted to pull the image and failed with a credentials error (which I was expecting to see once the hosts issue was resolved). From there I was able to add the credentials, and now the image pulls fine from the private registry instead of the public-facing one.
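To apply that on every node without duplicating lines, a small idempotent snippet helps. This is a sketch: the IP and hostname are the examples from this setup, and it defaults to a demo file (hosts.demo) so it can be tried safely; on a real node you would run it as root against /etc/hosts.

```shell
# Append the registry mapping to a hosts file only if it is not already there.
# Pass /etc/hosts as the first argument (as root) on real cluster nodes.
hosts_file="${1:-hosts.demo}"
entry="192.168.1.30 registry.example.com"
touch "$hosts_file"
grep -qF "$entry" "$hosts_file" || printf '%s\n' "$entry" >> "$hosts_file"
# Running it again is a no-op, so it is safe to rerun on every node.
grep -qF "$entry" "$hosts_file" || printf '%s\n' "$entry" >> "$hosts_file"
echo "occurrences: $(grep -cF "$entry" "$hosts_file")"
```

Running the snippet prints "occurrences: 1" no matter how many times it is repeated.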
Bonus: Fix for credentials error connecting to private registry (not part of the original question, but part of the setup process for connecting)
After fixing the /etc/hosts issue, you will probably need to set up 'regcred' credentials to access the private registry. The Kubernetes documentation provides the steps for that part:
https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
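For reference, the shape of that setup (the secret name regcred follows the docs; the username/password values are placeholders for your registry credentials): create a docker-registry secret, then reference it from the deployment's pod spec.

```yaml
# First create the secret (placeholders for the credentials):
#   kubectl create secret docker-registry regcred \
#     --docker-server=registry.example.com \
#     --docker-username=<user> --docker-password=<password>
# Then reference it in the pod template of the deployment:
spec:
  template:
    spec:
      imagePullSecrets:
        - name: regcred
      containers:
        - name: platform-service
          image: registry.example.com/demo/platform-service:latest
```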