I’ve built a cluster (v1.15.4) on Hyper-V with:
- Linux node (master)
- Linux node (worker)
- Windows node (worker)
- networking: Flannel (host-gw)
- MAC spoofing enabled for the VMs
I deployed win-webserver.yaml from the following guide to test whether the Windows node is working:
https://kubernetes.io/docs/setup/production-environment/windows/user-guide-windows-containers/
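With the manifest from that guide saved locally, the deploy-and-check steps look roughly like this (file name as used above):

```shell
# Deploy the sample Windows web server
kubectl apply -f win-webserver.yaml

# Verify the pods are Running and the NodePort service was created
kubectl get pods -o wide
kubectl get svc win-webserver
```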
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
win-webserver-7779dc4df7-58qs2 1/1 Running 0 18m 10.42.2.41 node01 <none> <none>
win-webserver-7779dc4df7-mb4sf 1/1 Running 0 18m 10.42.2.43 node01 <none> <none>
win-webserver-7779dc4df7-w5kjt 1/1 Running 0 18m 10.42.2.44 node01 <none> <none>
win-webserver-7779dc4df7-wm245 1/1 Running 0 18m 10.42.2.45 node01 <none> <none>
kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 122m
win-webserver NodePort 10.43.91.255 <none> 80:30378/TCP 72m
The deployment ran fine; however, containers running on Windows can’t resolve DNS or reach the internet.
A DNS request from a Windows pod:
DNS request timed out.
timeout was 2 seconds.
Server: UnKnown
Address: 10.43.0.10
DNS request timed out.
timeout was 2 seconds.
DNS request timed out.
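To reproduce the failing lookup from inside one of the Windows pods (pod name taken from the output above; 10.43.0.10 is the cluster DNS address the pod reported):

```shell
# Query the cluster DNS service from a Windows pod
kubectl exec -it win-webserver-7779dc4df7-58qs2 -- nslookup kubernetes.default.svc.cluster.local 10.43.0.10

# An external name separates "cluster DNS broken" from "no egress at all"
kubectl exec -it win-webserver-7779dc4df7-58qs2 -- nslookup kubernetes.io
```

Both time out here, which is consistent with the pod having no network egress at all rather than a CoreDNS-specific problem.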
I also can’t reach the pods from a Linux node through the service.
curl http://10.43.91.255 --connect-timeout 30
curl: (28) Connection timed out after 30000 milliseconds
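When a ClusterIP times out like this, two quick checks narrow the problem down: whether the service has endpoints at all, and whether the NodePort answers on a node address directly (the node IP below is a placeholder):

```shell
# Confirm the service is actually backed by the four Windows pods
kubectl get endpoints win-webserver

# Bypass the ClusterIP and hit the NodePort (30378 per `kubectl get svc`)
curl http://<windows-node-ip>:30378 --connect-timeout 10
```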
Flannel logs:
kubectl logs -n kube-system pod/kube-flannel-8dqcc -c kube-flannel
I1015 12:48:40.437400 1 main.go:527] Using interface with name eth0 and address 192.168.x.x
I1015 12:48:40.437491 1 main.go:544] Defaulting external address to interface address (192.168.x.x)
I1015 12:48:40.538336 1 kube.go:126] Waiting 10m0s for node controller to sync
I1015 12:48:40.538380 1 kube.go:309] Starting kube subnet manager
I1015 12:48:41.538500 1 kube.go:133] Node controller sync successful
I1015 12:48:41.538539 1 main.go:244] Created subnet manager: Kubernetes Subnet Manager - node02
I1015 12:48:41.538551 1 main.go:247] Installing signal handlers
I1015 12:48:41.538676 1 main.go:386] Found network config - Backend type: vxlan
I1015 12:48:41.538751 1 vxlan.go:120] VXLAN config: VNI=4096 Port=4789 GBP=false DirectRouting=false
W1015 12:48:41.539007 1 device.go:84] "flannel.4096" already exists with incompatable configuration: vtep (external) interface: 2 vs 3; recreating device
I1015 12:48:41.632647 1 main.go:317] Wrote subnet file to /run/flannel/subnet.env
I1015 12:48:41.632666 1 main.go:321] Running backend.
I1015 12:48:41.632675 1 main.go:339] Waiting for all goroutines to exit
I1015 12:48:41.632695 1 vxlan_network.go:60] watching for new subnet leases
E1015 13:01:19.765370 1 vxlan_network.go:101] error decoding subnet lease JSON: invalid MAC address
E1015 13:11:30.468144 1 vxlan_network.go:101] error decoding subnet lease JSON: invalid MAC address
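The repeated `invalid MAC address` error is worth chasing: with the vxlan backend, each node publishes its VTEP MAC via a node annotation, and a malformed value there makes other flannel daemons fail to decode the subnet lease. One way to inspect what each node advertises (node names taken from the output above):

```shell
# Show the flannel backend data (including the VTEP MAC) each node publishes
kubectl describe node node01 | grep flannel.alpha.coreos.com
kubectl describe node node02 | grep flannel.alpha.coreos.com
```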
Any suggestions appreciated.
Update
I managed to get my cluster working. The reason for the communication failure was that I had two networks connected to each of the Kubernetes machines:
linux master - internet facing eth0 (192.168.6.2) eth1 (192.168.3.12)
linux worker - internet facing eth0 (192.168.6.3) eth1 (192.168.3.13)
windows node - internet facing Ethernet_LB (192.168.6.4) Ethernet_FW (192.168.3.14)
The virtual switch on the Windows node was created on Ethernet_FW, which had no internet connection.
After running Wireshark and tcpdump I was able to find out that:
– pods on the master sent traffic to the Windows pods via eth0
– pods on the Windows node sent traffic to the Linux pods via Ethernet_FW
– pods on the Windows node sent traffic to the internet via Ethernet_FW
That asymmetry caused the communication failure between pods.
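The asymmetry can be confirmed on the Linux side with a capture per interface; the encapsulated pod traffic is easy to spot because flannel reports VXLAN on UDP port 4789 in its log:

```shell
# Run on the master: see which NIC the encapsulated pod traffic actually uses
tcpdump -ni eth0 udp port 4789
tcpdump -ni eth1 udp port 4789
```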
I reconfigured the Windows node to create the virtual switch on Ethernet_LB, which had internet access; this interface (Ethernet_LB) was the one accepting packets from the master and the Linux worker.
For cluster creation I used this resource: https://kubernetes.io/docs/setup/production-environment/windows/user-guide-windows-nodes/
I had to put the proper interface name into the config file:
"InterfaceName" : "Ethernet_LB"
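For context, this setting sits in the cluster configuration file from the linked guide; a partial sketch (the surrounding field names are my assumption about that guide’s layout and may differ between versions of the scripts):

```json
{
  "Cni" : {
    "Name" : "flannel",
    "Plugin" : { "Name" : "vxlan" },
    "InterfaceName" : "Ethernet_LB"
  }
}
```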
Best Answer
This is a community wiki answer posted for better visibility. Feel free to expand it.
As already confirmed by @dzup4uk, the reason for the communication failure was that there were two networks connected to each of the Kubernetes machines.
The virtual switch on Windows was created on Ethernet_FW, which had no internet connection. Running Wireshark and tcpdump showed that pods on the master sent traffic to the Windows pods via eth0, while pods on the Windows node sent traffic to the Linux pods (and to the internet) via Ethernet_FW.
That asymmetry caused the communication failure between pods.
The Windows node was reconfigured to create the virtual switch on Ethernet_LB, which had internet access; this interface (Ethernet_LB) was the one accepting packets from the master and the Linux worker.
It was then enough to put the proper interface name into the config file:
"InterfaceName" : "Ethernet_LB"
The Kubernetes cluster v1.19.7 was built using Windows Server 2019 with the latest updates as the Windows worker nodes.
For cluster creation this resource was used: https://kubernetes.io/docs/setup/production-environment/windows/user-guide-windows-nodes/
Other useful resources:
Windows Server release information
Windows 10 and Windows Server 2019 update history