Iptables: combine SNAT with network remapping for OpenVPN

iptables; nat; openvpn; routing

[Apologies for the long prelude; question halfway down.]

I have a working OpenVPN setup whereby a VPN server pushes a route back to one client (hereafter called the “router”) which can then expose its own subnet to the machine running the server as well as to other machines running the VPN client. This is accomplished just by making the router use the SNAT target from iptables. So for example, say the VPN server and other undistinguished clients are on the 10.0.77.0/24 network, the VPN creates a tun0 interface covering 192.168.252.0/24, and the private subnet is 192.168.33.0/24. The OpenVPN server configuration says (among other things)

client-to-client
route 192.168.33.0 255.255.255.0
push "route 192.168.33.0 255.255.255.0"

When the Linux “router” machine, 192.168.33.10 let us say, connects to the VPN it gets pushed a route, so its table looks like

192.168.33.0    *               255.255.255.0   U     0      0        0 eth1
192.168.252.0   192.168.252.5   255.255.255.0   UG    0      0        0 tun0
192.168.252.5   *               255.255.255.255 UH    0      0        0 tun0

and then it is configured to run

sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A POSTROUTING -s 192.168.252.0/24 -j SNAT --to-source 192.168.33.10

It can also add some iptables rules to create a firewall, but the above suffices to let the machine running the OpenVPN server (or another client) connect to, say, 192.168.33.11: the packets are sent via tun0 to the router, which uses SNAT to set the source IP to its own 192.168.33.10 and then forward the packets to its sibling machine 192.168.33.11. Reply packets are sent to the router, which then forwards them back through the tunnel and all is well. So for example on 192.168.33.11 I can

nc -l 9999

and from the 10.0.77.13 (some other VPN client) I can

nc 192.168.33.11 9999

and make a connection. So far so good.

Note that no changes are being made to the physical router machine on either network; the “router” in quotes (a random machine running a VPN client) needs to use SNAT so that other machines on its network are able to send reply packets back through the VPN. For purposes of this question I am not “in control” of either network: I can only add some machines with their own routing rules and VPN clients or servers.

Now for the problem: let us say the two networks (neither on the public Internet) in fact overlap in their actual ranges. So in this example, the private subnet is not 192.168.33.0/24 but is also 10.0.77.0/24. And let us assume that renumbering either network is simply not an option. So besides using SNAT to allow the router to forward packets, I need to remap the router’s private network to a different IP range from the perspective of the OpenVPN server. Let us say I pick 10.0.78.0/24 as the virtual network range:

route 10.0.78.0 255.255.255.0
push "route 10.0.78.0 255.255.255.0"

and from the OpenVPN server machine I want connections made to 10.0.78.11 to go through the tunnel and be processed with SNAT as before, but I also want the destination address in the router’s subnet to be 10.0.77.11. How can I configure iptables to do this?

The NETMAP target looked like it was what I wanted, but I could not get it to work:

iptables -t nat -A PREROUTING -i tun0 -j NETMAP --to 10.0.77.0/24
iptables -t nat -A POSTROUTING -s 192.168.252.0/24 -j SNAT --to-source 10.0.77.10

From the OpenVPN server machine I can ping 10.0.78.10 (i.e. the remapped address of the router) when the NETMAP target is added, so it is doing something.

tcpdump -i tun0

run on the router during this ping shows ICMP echo request and ICMP echo reply as expected. But if I try to ping 10.0.78.11 (i.e. the remapped address of a sibling machine on the router’s subnet), I do not get replies, and tcpdump on the router shows requests but not replies; tcpdump on 10.0.77.11 (the sibling machine) shows no packets at all. Also running on the router:

iptables -t nat -L -v

shows packets being processed by NETMAP but none by SNAT. So it seems like the NETMAP target is somehow superseding the SNAT target?

Essentially what I want is that when the router receives a packet on tun0 from e.g. 192.168.252.5 (the OpenVPN server’s address on the tunnel) destined for 10.0.78.11, it is rewritten in two ways: the destination address is changed to 10.0.77.11, and the source address is changed to 10.0.77.10 (with a new source port being picked so that SNAT can keep track of which connection this is). Then when 10.0.77.11 sends a reply to the fake port on 10.0.77.10, the router should reverse the process, sending a packet back to 192.168.252.5 on tun0 with the faked source address of 10.0.78.11. Can NETMAP do this, or can any other target in iptables do this?
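To make the desired destination rewrite concrete, here is a minimal sketch of the address arithmetic a 1:1 network mapping such as NETMAP performs: the network bits are replaced and the host bits are kept. The helper names (ip_to_int, int_to_ip, netmap) are made up for illustration; this only models the arithmetic and does not touch iptables or explain why the rules above fail.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Convert a 32-bit integer back to dotted-quad form.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# netmap ADDR NEWNET/LEN: keep the host bits of ADDR, take the network bits from NEWNET.
netmap() (
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    len=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    int_to_ip $(( (net & mask) | (addr & ~mask & 0xFFFFFFFF) ))
)

netmap 10.0.78.11 10.0.77.0/24   # prints 10.0.77.11
```

The reverse direction is the same operation with the other prefix: netmap 10.0.77.11 10.0.78.0/24 gives back 10.0.78.11, which is the "faked" source address the reply should carry.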

Other things I tried without success: configuring the router machine for proxy ARP; adding the virtual network range to the routing table. But it feels like such things should not be necessary.

Update: I do not care about DNS in this context at all—only a handful of machines in the “private subnet” need to be contacted, so using IP addresses is acceptable.

Best Answer

This is a big mess and you would honestly be better served by renumbering than by creating a web of ugly NATs with conflicting addresses. The other solutions you will need to support this (e.g. split-horizon DNS with multiple zones, possibly with automatic updating) will be difficult and messy, and every subsequent person who has to deal with this network will curse your name forever and burn effigies of you and your team.

Nonetheless, it looks like the problem you are having is that the hosts on one side of the NAT (see? I can't even describe the sides of the NAT properly, because it is messy) don't have a path back to hosts on the other side. You have to add a NETMAP rule for the return path too (packets incoming from eth1 I presume).

But wait! That's done in the PREROUTING chain. So, the destination address will be set before the routing decision is made, which is pretty much the way it has to work... but your router has two routes for the "conflicting" network. So, it will prioritize by the usual means (longest matching prefix first; metric to break ties), causing some packets to be reflected back onto the network they came from instead of being routed across the NAT. They will be source-NATted too. Unfortunately, the source address isn't used in the routing decision, so you can't use it to tell the two networks apart. The input interface isn't used in routing either, and once the routing decision is made you can't rewrite the destination address again.
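The routing argument can be sketched with a toy lookup. This is only a model of the destination-based selection described above (longest prefix wins; here the first listed route stands in for "lowest metric" on ties), not the kernel's implementation. The route list and the helper names (ip_to_int, in_prefix, pick_route) are hypothetical.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# in_prefix ADDR NET/LEN: succeed if ADDR lies inside NET/LEN.
in_prefix() (
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    len=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
)

# pick_route DEST ROUTE...: each ROUTE is NET/LEN:DEV.
# Prints the device of the longest matching prefix, or "none".
pick_route() (
    dest=$1; shift
    best_len=-1; best_dev=none
    for route in "$@"; do
        prefix=${route%%:*}
        dev=${route#*:}
        len=${prefix#*/}
        # Strict -gt means the first route listed wins a tie.
        if in_prefix "$dest" "$prefix" && [ "$len" -gt "$best_len" ]; then
            best_len=$len; best_dev=$dev
        fi
    done
    echo "$best_dev"
)

# One router that sees 10.0.77.0/24 on both sides (hypothetical table):
pick_route 10.0.77.11 10.0.77.0/24:eth1 10.0.77.0/24:tun0 0.0.0.0/0:eth1   # prints eth1
```

Note that pick_route takes no source address and no input interface as arguments, so a packet for 10.0.77.11 gets the same answer whichever side it arrived from. That is the ambiguity the paragraph above describes.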

So, you find, this cannot be done. You have two options. Preferably, you would renumber one of the subnets with the network address conflict because you are a good network engineer rather than an incompetent one and you make networks that are not unnecessarily complicated. Renumbering VPN subnets tends to be an uncomplicated task which at worst will require the use of sed, but perhaps you have one of those weird situations where renumbering either subnet is an inhumanly difficult task. OK.

If for whatever reason you can't do that, you will need an additional router to go with your split-horizon DNS. Set up a router for each subnet (in effect this means one router for the VPN clients, and one router for the hosts on the network which unfortunately uses the prefix you'd staked out for the VPN). Pick some other address range (and it would have to be one you could otherwise renumber one of the subnets to...) to be the "foreign" one.

Now, suppose we call the network with the VPN hosts side A, and the network with the non-VPN hosts side B. Suppose on router A you choose the "foreign" prefix to be 192.0.2.0/24, and on router B, it's 198.51.100.0/24. These will be the fake prefixes you use for hosts on that subnet to contact hosts with the conflicted prefix on the other subnet (don't use these in practice; they're reserved for documentation purposes by RFC 5737).

The below rules are complicated because you can't use the input interface as a predicate in the POSTROUTING chain, and the source subnet is non-unique, so we have to prevent spoofing using a filter rule. It's also confusing because NETMAP decides whether to change the source or destination address based on what chain it's in.
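As a rough illustration of that last point, here is a toy model of which address NETMAP touches in each of the two chains used here: the destination in PREROUTING, the source in POSTROUTING. The helper names (remap24, netmap_packet) and the sample addresses are made up, and the string handling only works for /24 prefixes; it is a sketch of the behaviour, not of the real target's code.

```shell
# remap24 ADDR NEWNET: replace the first three octets of ADDR with those of NEWNET.
# This is a /24-only simplification of NETMAP's 1:1 mapping.
remap24() {
    echo "${2%.*}.${1##*.}"
}

# netmap_packet CHAIN SRC DST NEWNET: print "SRC DST" as the packet would look
# after a NETMAP rule in CHAIN that maps to NEWNET.
netmap_packet() {
    case $1 in
        PREROUTING)  echo "$2 $(remap24 "$3" "$4")" ;;   # destination is rewritten
        POSTROUTING) echo "$(remap24 "$2" "$4") $3" ;;   # source is rewritten
        *)           echo "unsupported chain: $1" >&2; return 1 ;;
    esac
}

netmap_packet PREROUTING  192.168.252.5 192.0.2.11 10.0.77.0      # prints 192.168.252.5 10.0.77.11
netmap_packet POSTROUTING 10.0.77.5 192.168.252.5 198.51.100.0    # prints 198.51.100.5 192.168.252.5
```

The same rule text therefore means different things depending on where it is attached, which is why each router below carries one rule in each chain.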

On routers A and B, add rules for traffic on the point-to-point link between the routers, which I shall call ptp0:

router-a # iptables -t nat -A PREROUTING  -i ptp0 -d 198.51.100.0/24 -j NETMAP --to 10.0.77.0/24
router-a # iptables -t nat -A POSTROUTING         -d 10.0.77.0/24    -j NETMAP --to 192.0.2.0/24
router-a # iptables -t filter -A FORWARD ! -i ptp0 -s 10.0.77.0/24 -j DROP
router-b # iptables -t nat -A PREROUTING  -i ptp0 -d 192.0.2.0/24    -j NETMAP --to 10.0.77.0/24
router-b # iptables -t nat -A POSTROUTING         -d 10.0.77.0/24    -j NETMAP --to 198.51.100.0/24
router-b # iptables -t filter -A FORWARD ! -i ptp0 -s 10.0.77.0/24 -j DROP

Now, this part of the problem is solved, because you no longer have a single router which has both prefixes in its routing table. No solution which requires you to have the same prefix referring to two different networks with two different hosts on them in the same routing table can work; it is intrinsically impossible because of how routing works.

Oh, and I almost forgot: I believe your SNAT rule shows no hits because nothing is matching it. It's also important to remember that, if I recall correctly, once a connection is stored in the NAT state table, its later packets are no longer processed by the SNAT rule and no longer counted in its statistics.

If you don't actually require that the subnets be separate, just use layer 2 bridging. It will make your life a lot easier.
