VPN to a Kubernetes-cluster from a remote network

calico kubernetes

I need to build a VPN connection between a remote network and a Kubernetes-cluster, so that the applications hosted in this network can reach K8S-services via a secured tunnel.

So, I have a bunch of K8S-nodes in a self-hosted environment. I've added a separate server to this environment that works as a VPN gateway; it's connected to the same VLAN as the cluster nodes. The nodes have the following IP-addresses: 10.13.17.1/22, 10.13.17.2/22, 10.13.17.3/22 and so on. The VPN gateway has 10.13.16.253/22.

The Cluster IP CIDR is 10.233.0.0/18, the pod IP CIDR is 10.233.64.0/18.

The VPN-server supports an IPsec site-to-site connection with a remote network, 10.103.103.0/24. I use Calico as the networking manager, so I've set up my VPN server to keep BGP-sessions with all K8S-nodes. The VPN server's route table is full of prefixes announced by the Calico nodes (10.233.0.0/18 is present as well, of course), and the cluster nodes have 10.103.103.0/24 and some other networks in their route tables, so BGP seems to be working fine. So far so good…
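To keep the address plan straight while reading the captures, here is a minimal sketch that maps each address from this question to the network it belongs to. The labels and the `classify` helper are made up for illustration, and 10.13.16.0/22 is simply the network implied by the /22 interface addresses quoted above.

```python
import ipaddress

# Networks quoted in the question; the labels are only for illustration.
# 10.13.16.0/22 is derived from the /22 addresses of the nodes and the gateway.
NETWORKS = {
    "node VLAN": ipaddress.ip_network("10.13.16.0/22"),
    "service CIDR": ipaddress.ip_network("10.233.0.0/18"),
    "pod CIDR": ipaddress.ip_network("10.233.64.0/18"),
    "remote network": ipaddress.ip_network("10.103.103.0/24"),
}


def classify(addr: str) -> str:
    """Return the label of the first network that contains `addr`."""
    ip = ipaddress.ip_address(addr)
    for label, net in NETWORKS.items():
        if ip in net:
            return label
    return "unknown"


# Addresses that appear in the setup description and in the packet captures.
for addr in ("10.13.17.1", "10.13.16.253", "10.233.10.101",
             "10.233.103.49", "10.233.110.0", "10.103.103.1"):
    print(f"{addr:<15} {classify(addr)}")
```

Both the gateway and the nodes land in the same /22, the service address 10.233.10.101 is in the Cluster IP range, and both 10.233.103.49 and 10.233.110.0 fall inside the pod range.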

When I establish a connection to a service inside the cluster from the VPN-server, everything is good too. The client (10.13.16.253) sends a SYN-packet to the service (10.233.10.101:1337), the worker receives this packet, changes its destination IP-address to the IP-address of the pod (10.233.103.49:1337) and changes its source IP-address to some IP-address (10.233.110.0) that will help the worker to receive the reply and give it back to the connection initiator. Here's what happens on the worker that receives this SYN-packet.
The SYN-packet comes to a worker:

22:04:25.866546 IP 10.13.16.253.56297 > 10.233.10.101.1337: Flags [S], seq 3575679444, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 1385938010 ecr 0], length 0

The SYN-packet is SNATed and DNATed and then sent to the worker where the pod is running:

22:04:25.866656 IP 10.233.110.0.54430 > 10.233.103.49.1337: Flags [S], seq 3575679444, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 1385938010 ecr 0], length 0

The reply comes back:

22:04:25.867313 IP 10.233.103.49.1337 > 10.233.110.0.54430: Flags [S.], seq 2017844946, ack 3575679445, win 28960, options [mss 1460,sackOK,TS val 1201488363 ecr 1385938010,nop,wscale 7], length 0

The reply is de-SNATed and de-DNATed and sent to the connection initiator:

22:04:25.867533 IP 10.233.10.101.1337 > 10.13.16.253.56297: Flags [S.], seq 2017844946, ack 3575679445, win 28960, options [mss 1460,sackOK,TS val 1201488363 ecr 1385938010,nop,wscale 7], length 0

So, the connection is established and everyone is happy.

But when I try to connect to the same service from the external network (10.103.103.0/24), the worker that receives the SYN-packet does NOT change the source IP-address. It changes the destination IP-address only.
The SYN-packet comes to a worker:

21:56:05.794171 IP 10.103.103.1.52132 > 10.233.10.101.1337: Flags [S], seq 3759345254, win 29200, options [mss 1460,sackOK,TS val 195801472 ecr 0,nop,wscale 7], length 0

The SYN-packet is DNATed and re-sent to the worker where the pod is running:

21:56:05.794242 IP 10.103.103.1.52132 > 10.233.103.49.1337: Flags [S], seq 3759345254, win 29200, options [mss 1460,sackOK,TS val 195801472 ecr 0,nop,wscale 7], length 0

And nothing comes back in reply. 🙁

The destination IP-address is changed, and I can see these packets on the worker where the pod is running, but there are no replies to them:

21:56:05.794602 IP 10.103.103.1.52132 > 10.233.103.49.1337: Flags [S], seq 3759345254, win 29200, options [mss 1460,sackOK,TS val 195801472 ecr 0,nop,wscale 7], length 0
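The difference between the two flows can be read straight off the capture lines. The following is a rough sketch, not part of the original troubleshooting, that compares the packet before and after the worker rewrites it and names the translations it sees. The helper names `endpoints` and `nat_applied` are invented for this example, and the capture lines are shortened copies of the ones above.

```python
import re

# Matches the "src.port > dst.port:" part of a tcpdump IPv4 line.
FLOW_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+):"
)


def endpoints(line):
    """Return ((src_ip, src_port), (dst_ip, dst_port)) from a tcpdump line."""
    m = FLOW_RE.search(line)
    if m is None:
        raise ValueError("not a tcpdump IPv4 line")
    src_ip, src_port, dst_ip, dst_port = m.groups()
    return (src_ip, int(src_port)), (dst_ip, int(dst_port))


def nat_applied(before, after):
    """Name the translations that turn packet `before` into packet `after`."""
    src_b, dst_b = endpoints(before)
    src_a, dst_a = endpoints(after)
    kinds = []
    if src_b != src_a:
        kinds.append("SNAT")
    if dst_b != dst_a:
        kinds.append("DNAT")
    return kinds or ["none"]


# Shortened from the captures above (timestamps, flags and options dropped).
from_gateway = (
    "IP 10.13.16.253.56297 > 10.233.10.101.1337: Flags [S]",
    "IP 10.233.110.0.54430 > 10.233.103.49.1337: Flags [S]",
)
from_remote = (
    "IP 10.103.103.1.52132 > 10.233.10.101.1337: Flags [S]",
    "IP 10.103.103.1.52132 > 10.233.103.49.1337: Flags [S]",
)

print("from VPN gateway:", nat_applied(*from_gateway))
print("from remote host:", nat_applied(*from_remote))
```

For the gateway flow this reports both SNAT and DNAT, while for the remote host it reports DNAT only, which is exactly the asymmetry described here.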

The external network (10.103.103.0/24) is being advertised by the VPN server via BGP, so all the workers know that this network is accessible via 10.13.16.253. When I run the ping-test from a host in the external network (10.103.103.1) to the IP-address of the service (10.233.10.101), the test passes, VPN works fine and routing tables seem to be correct.

So, why does the network "trust" 10.13.16.253 but not 10.103.103.1? And why does the worker perform SNAT and DNAT for the packets from 10.13.16.253 but no SNAT for the packets from 10.103.103.1? Should I add some policies to allow this traffic?

Thanks in advance for any clues!

Best Answer

Ta-damn!

pfSense was breaking the SYN-packet's checksum:

13:53:32.286601 IP (tos 0x0, ttl 62, id 33830, offset 0, flags [DF], proto TCP (6), length 60)
    10.103.103.1.47390 > 10.233.10.101.1337: Flags [S], cksum 0x86e4 (incorrect -> 0x99db), seq 4230752647, win 29200, options [mss 1460,sackOK,TS val 598846881 ecr 0,nop,wscale 7], length 0
        0x0000:  4500 003c 8426 4000 3e06 31e0 0a67 6701  E..<.&@.>.1..gg.
        0x0010:  0ae9 0a65 b91e 0539 fc2c 2987 0000 0000  ...e...9.,).....
        0x0020:  a002 7210 86e4 0000 0204 05b4 0402 080a  ..r.............
        0x0030:  23b1 ada1 0000 0000 0103 0307            #...........

I've disabled the hardware checksum offload feature and now everything works smoothly.
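For anyone who wants to double-check a capture like this by hand, here is a small sketch that recomputes the TCP checksum from the hex dump above using the standard Internet checksum (RFC 1071) over the pseudo-header plus the TCP segment. The helper name `ones_complement_sum` is my own; the bytes are copied verbatim from the dump.

```python
import struct

# Hex dump of the captured SYN, copied from the tcpdump output above.
PACKET_HEX = (
    "4500 003c 8426 4000 3e06 31e0 0a67 6701 "
    "0ae9 0a65 b91e 0539 fc2c 2987 0000 0000 "
    "a002 7210 86e4 0000 0204 05b4 0402 080a "
    "23b1 ada1 0000 0000 0103 0307"
)


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of `data` (the core of RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


packet = bytes.fromhex(PACKET_HEX.replace(" ", ""))
ihl = (packet[0] & 0x0F) * 4  # IPv4 header length in bytes
segment = packet[ihl:]        # TCP header plus options (a SYN has no payload)

# Pseudo-header: source IP, destination IP, zero byte, protocol, TCP length.
pseudo = packet[12:20] + struct.pack("!BBH", 0, packet[9], len(segment))

stored = struct.unpack("!H", segment[16:18])[0]
zeroed = segment[:16] + b"\x00\x00" + segment[18:]  # checksum field set to 0
expected = ~ones_complement_sum(pseudo + zeroed) & 0xFFFF
pseudo_only = ones_complement_sum(pseudo)

print(f"stored in packet:  0x{stored:04x}")
print(f"correct checksum:  0x{expected:04x}")
print(f"pseudo-header sum: 0x{pseudo_only:04x}")
```

This prints 0x86e4 for the stored value and 0x99db for the correct one, matching what tcpdump reports. The stored value also equals the bare pseudo-header sum (0x86e4), which is consistent with a partial checksum that the stack left for the NIC to complete and that never got finished, though that interpretation is an inference from the numbers rather than something the capture proves.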

Lots of thanks to y'all for your time and attention!
