Windows – No DNS resolution and Internet access from Kubernetes POD on Windows

Tags: domain-name-system, kubernetes, networking, windows

I’ve built a cluster (v1.15.4) on Hyper-V with:

- Linux node (master)
- Linux node (worker)
- Windows node (worker)
- networking - flannel (host-gw)
- MAC spoofing enabled for the VMs

I deployed win-webserver.yaml from the guide below to test whether the Windows node is working.
https://kubernetes.io/docs/setup/production-environment/windows/user-guide-windows-containers/

kubectl get pods -o wide 
NAME                             READY   STATUS    RESTARTS   AGE   IP           NODE   NOMINATED NODE   READINESS GATES
win-webserver-7779dc4df7-58qs2   1/1     Running   0          18m   10.42.2.41   node01   <none>           <none>
win-webserver-7779dc4df7-mb4sf   1/1     Running   0          18m   10.42.2.43   node01   <none>           <none>
win-webserver-7779dc4df7-w5kjt   1/1     Running   0          18m   10.42.2.44   node01   <none>           <none>
win-webserver-7779dc4df7-wm245   1/1     Running   0          18m   10.42.2.45   node01   <none>           <none>

kubectl get svc
NAME            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes      ClusterIP   10.43.0.1      <none>        443/TCP        122m
win-webserver   NodePort    10.43.91.255   <none>        80:30378/TCP   72m

The deployment ran fine; however, containers running on the Windows node can’t resolve DNS names or reach the internet.

DNS request from a Windows pod:

DNS request timed out. 
timeout was 2 seconds. 
Server:  UnKnown     
Address:  10.43.0.10 

DNS request timed out. 
timeout was 2 seconds. 
DNS request timed out. 

I also can’t reach the Windows pods from Linux through the service:

curl http://10.43.91.255 --connect-timeout 30
curl: (28) Connection timed out after 30000 milliseconds

Flannel logs

kubectl logs -n kube-system pod/kube-flannel-8dqcc -c kube-flannel
I1015 12:48:40.437400       1 main.go:527] Using interface with name eth0 and address 192.168.x.x
I1015 12:48:40.437491       1 main.go:544] Defaulting external address to interface address (192.168.x.x)
I1015 12:48:40.538336       1 kube.go:126] Waiting 10m0s for node controller to sync
I1015 12:48:40.538380       1 kube.go:309] Starting kube subnet manager
I1015 12:48:41.538500       1 kube.go:133] Node controller sync successful
I1015 12:48:41.538539       1 main.go:244] Created subnet manager: Kubernetes Subnet Manager - node02
I1015 12:48:41.538551       1 main.go:247] Installing signal handlers
I1015 12:48:41.538676       1 main.go:386] Found network config - Backend type: vxlan
I1015 12:48:41.538751       1 vxlan.go:120] VXLAN config: VNI=4096 Port=4789 GBP=false DirectRouting=false
W1015 12:48:41.539007       1 device.go:84] "flannel.4096" already exists with incompatable configuration: vtep (external) interface: 2 vs 3; recreating device
I1015 12:48:41.632647       1 main.go:317] Wrote subnet file to /run/flannel/subnet.env
I1015 12:48:41.632666       1 main.go:321] Running backend.
I1015 12:48:41.632675       1 main.go:339] Waiting for all goroutines to exit
I1015 12:48:41.632695       1 vxlan_network.go:60] watching for new subnet leases
E1015 13:01:19.765370       1 vxlan_network.go:101] error decoding subnet lease JSON: invalid MAC address
E1015 13:11:30.468144       1 vxlan_network.go:101] error decoding subnet lease JSON: invalid MAC address

Any suggestions are appreciated.

Update

I managed to get my cluster working. The communication failure was caused by having two networks connected to each of the Kubernetes machines.

linux master - internet facing eth0 (192.168.6.2) eth1 (192.168.3.12)
linux worker - internet facing eth0 (192.168.6.3) eth1 (192.168.3.13)
windows node - internet facing Ethernet_LB (192.168.6.4) Ethernet_FW (192.168.3.14)  

The virtual switch on the Windows node was created on Ethernet_FW, which had no internet connection.
After running Wireshark and tcpdump I found that
– pods on the master send traffic to Windows pods via eth0
– pods on the Windows node send traffic to Linux pods via Ethernet_FW
– pods on the Windows node send traffic to the internet via Ethernet_FW
That caused the communication failure between pods.
I reconfigured the Windows node to create the virtual switch on Ethernet_LB, which has internet access and accepts packets from the master and the Linux worker.
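
The mismatch can be checked before touching the node. The following is only a rough sketch, not something from my actual setup: it takes the addresses listed above and reports which Windows adapter sits in the same subnet as the interface flannel uses on the Linux nodes (eth0, according to the flannel log). The /24 prefix length and the helper names network_of and matching_adapters are assumptions made for illustration.

```python
import ipaddress

# Addresses are copied from the post; the /24 prefix length is an assumption.
PREFIX = 24

# eth0 on the Linux master and worker (the interface flannel reports using).
linux_flannel_addrs = ["192.168.6.2", "192.168.6.3"]

# Adapters on the Windows node.
windows_adapters = {
    "Ethernet_LB": "192.168.6.4",
    "Ethernet_FW": "192.168.3.14",
}


def network_of(addr, prefix=PREFIX):
    """Return the IP network that contains addr for the given prefix length."""
    return ipaddress.ip_interface(f"{addr}/{prefix}").network


def matching_adapters(linux_addrs, adapters, prefix=PREFIX):
    """List the adapters whose subnet matches one of the Linux flannel addresses."""
    linux_nets = {network_of(a, prefix) for a in linux_addrs}
    return [
        name
        for name, addr in adapters.items()
        if network_of(addr, prefix) in linux_nets
    ]


print(matching_adapters(linux_flannel_addrs, windows_adapters))
```

With the addresses above, only Ethernet_LB shares a subnet with eth0 on the Linux nodes, which matches what the packet captures showed.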

For cluster creation I used this resource: https://kubernetes.io/docs/setup/production-environment/windows/user-guide-windows-nodes/
I had to put the correct interface name into the config file:

"InterfaceName" : "Ethernet_LB" 

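If the config file is JSON, the change can also be scripted. This is a minimal sketch under assumptions: the sample fragment below is hypothetical (the surrounding "Cni" key and the old value "Ethernet_FW" are placeholders, not copied from my file), and set_interface_name is an illustrative helper, not part of any Kubernetes tooling. It sets every "InterfaceName" key it finds, wherever it is nested.

```python
import json


def set_interface_name(node, name):
    """Recursively set every "InterfaceName" key; return how many were set."""
    changed = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "InterfaceName":
                node[key] = name
                changed += 1
            else:
                changed += set_interface_name(value, name)
    elif isinstance(node, list):
        for item in node:
            changed += set_interface_name(item, name)
    return changed


# Hypothetical config fragment; the real file layout may differ.
sample = '{"Cni": {"Name": "flannel", "InterfaceName": "Ethernet_FW"}}'

config = json.loads(sample)
count = set_interface_name(config, "Ethernet_LB")
print(count)
print(json.dumps(config, indent=2))
```

For a real file, the same function could be applied to the result of json.load and the document written back with json.dump; check the key's actual location in your own config first.
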
Best Answer

This is a community wiki answer posted for better visibility. Feel free to expand it.

As already confirmed by @dzup4uk, the communication failure was caused by two networks being connected to each of the Kubernetes machines.

linux master - internet facing eth0 (192.168.6.2) eth1 (192.168.3.12)
linux worker - internet facing eth0 (192.168.6.3) eth1 (192.168.3.13)
windows node - internet facing Ethernet_LB (192.168.6.4) Ethernet_FW (192.168.3.14)  

The virtual switch on the Windows node was created on Ethernet_FW, which had no internet connection. After running Wireshark and tcpdump I found that

  • pods on the master send traffic to Windows pods via eth0
  • pods on the Windows node send traffic to Linux pods via Ethernet_FW
  • pods on the Windows node send traffic to the internet via Ethernet_FW

That caused the communication failure between pods.
I reconfigured the Windows node to create the virtual switch on Ethernet_LB, which has internet access and accepts packets from the master and the Linux worker.

The correct interface name had to be put into the config file:

"InterfaceName" : "Ethernet_LB" 

The Kubernetes cluster v1.19.7 was built using Windows Server 2019 with the latest updates as the Windows worker nodes.

