Can’t reach cluster virtual IP from pods, but can from worker nodes

kubernetes

I'm having trouble with pods not being able to 'talk' to cluster IPs (virtual IPs fronting pods) in my Kubernetes cluster.

I've been following along with "Kubernetes the hard way" by Kelsey Hightower, but I've converted it all to run the infrastructure on AWS.

I have pretty much everything working, except that my pods are unable to talk to ClusterIP virtual IPs.

  • service-cluster-ip-range is: 10.32.0.0/24
  • Pod CIDR for worker nodes is: 10.200.0.0/16
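For context, these ranges come from the standard kubernetes-the-hard-way control plane flags; my AWS port passes the same values (the exact unit files may differ):

# kube-apiserver (excerpt)
--service-cluster-ip-range=10.32.0.0/24

# kube-controller-manager (excerpt)
--cluster-cidr=10.200.0.0/16
--service-cluster-ip-range=10.32.0.0/24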

I initially tried both CoreDNS and kube-dns, thinking it might be an issue at that level, but I've since narrowed it down to the fact that I cannot talk to service cluster IPs from pods, while I can talk to them from the worker nodes themselves.

I've verified that kube-proxy is working as expected. I'm running that in iptables mode and can see it writing out iptables rules on worker nodes correctly. I even tried switching to ipvs mode and in that mode it also wrote out rules correctly.
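For example, on a worker node the rules kube-proxy writes for the DNS cluster IP (10.32.0.10) show up with commands roughly like:

# list the Service NAT rules kube-proxy maintains
sudo iptables -t nat -L KUBE-SERVICES -n | grep 10.32.0.10

# or dump the whole NAT table and filter for the DNS cluster IP
sudo iptables-save -t nat | grep 10.32.0.10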

If I do an nslookup inside a test pod (e.g. busybox 1.28) and let it use its standard nameserver setting pointing at my CoreDNS installation, it fails to resolve google.com or the cluster-internal `kubernetes.default`. However, if I tell nslookup to use the pod IP address of the CoreDNS pod instead, it works just fine.

Example

This does not work:

kubectl exec -it busybox -- nslookup google.com               
Server:    10.32.0.10
Address 1: 10.32.0.10

nslookup: can't resolve 'google.com'
command terminated with exit code 1

This works (pointing nslookup to the CoreDNS pod IP address rather than the cluster IP):

kubectl exec -it busybox -- nslookup google.com 10.200.2.2                   
Server:    10.200.2.2
Address 1: 10.200.2.2 kube-dns-67d45fcb87-2h2dz

Name:      google.com
Address 1: 2607:f8b0:4004:810::200e iad23s63-in-x0e.1e100.net
Address 2: 172.217.164.142 iad30s24-in-f14.1e100.net

To clarify, I've tried this with both CoreDNS and kube-dns and get the same result in both cases. It seems like a networking issue higher up the stack.

My AWS EC2 instances have source/destination checking disabled. All my configuration and settings are forked from the official kubernetes-the-hard-way repo, but updated to run on AWS. The source with all my config / settings etc is here
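(For reference, source/destination checking was turned off on each instance with something along the lines of:)

# repeat for each controller / worker instance
aws ec2 modify-instance-attribute \
  --instance-id <worker-instance-id> \
  --no-source-dest-check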

Edit: for reference, here is the /etc/resolv.conf that my pods get from kube-dns / CoreDNS (it looks absolutely fine to me, though):

# cat /etc/resolv.conf
search kube-system.svc.cluster.local svc.cluster.local cluster.local ec2.internal
nameserver 10.32.0.10
options ndots:5
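For completeness, 10.32.0.10 is the cluster IP of the DNS Service itself; I checked the Service and its endpoints with something like this (assuming the Service is named kube-dns in kube-system, as in kubernetes-the-hard-way):

kubectl -n kube-system get svc kube-dns
kubectl -n kube-system get endpoints kube-dns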

I am able to ping the kube-dns pod IP directly from pods, but the cluster IP for kube-dns does not work for ping or anything else (the same goes for other services with cluster IPs). E.g.

me@mine ~/Git/kubernetes-the-hard-way/test kubectl get pods -n kube-system -o wide
NAME                                READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
hello-node1-55cc74b4b8-2hh4w        1/1     Running   2          3d1h   10.200.2.14   ip-10-240-0-22   <none>           <none>
hello-node2-66b5494599-cw8hx        1/1     Running   2          3d1h   10.200.2.12   ip-10-240-0-22   <none>           <none>
kube-dns-67d45fcb87-2h2dz           3/3     Running   6          3d1h   10.200.2.11   ip-10-240-0-22   <none>           <none>

 me@mine ~/Git/kubernetes-the-hard-way/test kubectl exec -it hello-node1-55cc74b4b8-2hh4w sh
Error from server (NotFound): pods "hello-node1-55cc74b4b8-2hh4w" not found
 me@mine ~/Git/kubernetes-the-hard-way/test kubectl -n kube-system exec -it hello-node1-55cc74b4b8-2hh4w sh
# ping 10.200.2.11
PING 10.200.2.11 (10.200.2.11) 56(84) bytes of data.
64 bytes from 10.200.2.11: icmp_seq=1 ttl=64 time=0.080 ms
64 bytes from 10.200.2.11: icmp_seq=2 ttl=64 time=0.044 ms
64 bytes from 10.200.2.11: icmp_seq=3 ttl=64 time=0.045 ms
^C
--- 10.200.2.11 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.044/0.056/0.080/0.017 ms

# ip route get 10.32.0.10
10.32.0.10 via 10.200.2.1 dev eth0  src 10.200.2.14
    cache
#
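Since cluster IPs are virtual and kube-proxy only translates traffic aimed at the Service's actual ports, ping isn't a conclusive test on its own, so I also probed port 53 directly with something like the commands below; the result is the same split (it works from the worker node but fails from inside a pod):

# from a worker node
nslookup kubernetes.default.svc.cluster.local 10.32.0.10

# from inside a pod
kubectl exec -it busybox -- nslookup kubernetes.default.svc.cluster.local 10.32.0.10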

Am I missing something obvious here?

Best Answer

Try adding the following to the kube-dns ConfigMap:

data:
  upstreamNameservers: |
    ["8.8.8.8", "8.8.4.4"]
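
Applied as a complete manifest, that would look something like this (assuming the stock kube-dns ConfigMap name and namespace):

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  upstreamNameservers: |
    ["8.8.8.8", "8.8.4.4"]

You may need to restart the kube-dns pods for the change to take effect.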