IPTables – IP Forwarding for LVS NAT Mode


I am learning about L3 load balancers and am currently working with keepalived in NAT mode. The packet flow should be Client -> L3 (LVS DNAT) -> real servers, and the return flow should be the exact reverse of the request flow. But LVS DNAT changes only the destination address of the packet, so the real server sees [source=client IP, destination=real server IP]. After the real server processes the request, it sends the reply to its default gateway instead of the L3 server, and the packet gets dropped. I don't want to set the L3 server as the default gateway of the real server. Instead, I need an iptables-based solution that forwards traffic from the real server to the L3 server for ports 8443 and 8080.

iptables provides a way to redirect traffic before the packet is processed, via PREROUTING, but what I want is to forward traffic to the L3 server after the real server has processed the request.
I tried the OUTPUT chain, still with no luck:

sudo iptables -t nat -A OUTPUT -p tcp --match multiport --sports 8443,8080 -j DNAT --to 172.24.248.201

172.24.248.201 is the IP address of the L3 server.

Thanks in Advance

Best Answer

This is a routing problem. iptables doesn't route, but the addresses it rewrites can affect routing decisions, which is why iptables is often mistakenly assumed to do routing. For that reason iptables alone can't solve this problem (though it can sometimes be part of a solution).

Moreover, any attempt to alter a reply packet with Netfilter/iptables NAT is ignored: the nat table is consulted only for the first packet of a flow, not for replies. So this rule will never match:

iptables -t nat -A OUTPUT -p tcp --match multiport --sports 8443,8080 -j DNAT --to 172.24.248.201

Even if it could be made to match, it wouldn't work, because the destination of the reply has to be the source of the initial query as seen by the real server: the actual remote client IP address, not the L3 LB.

Below is a simple way to do this correctly on the real server, using policy routing.


For this answer I'll assume Linux kernel >= 4.17, which "Extends fib rule match support to include sport, dport and ip proto match". This avoids a complex use of iptables to mark packets (which would still require policy routing anyway).

Make sure the real server's service ports can't overlap its dynamic port range (used when the server acts as a client, for example for DNS queries or system upgrades):

# sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768    60999

Ensure that the lower bound of this range is above the highest port the real server uses for a service. Here: 32768 > 8443 is good enough.
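The comparison above can be scripted. This is only a minimal sketch: the function name ports_below_dynamic_range is made up for illustration, and it simply checks that each service port is below the lower bound of the dynamic range.

```shell
# Hypothetical helper (not part of any tool): succeeds when every service
# port lies below the lower bound of the dynamic (ephemeral) port range.
# $1 = lower bound of ip_local_port_range, remaining args = service ports.
ports_below_dynamic_range() {
    low=$1
    shift
    for port in "$@"; do
        if [ "$port" -ge "$low" ]; then
            echo "port $port overlaps the dynamic range starting at $low" >&2
            return 1
        fi
    done
    return 0
}

# Usage on a real server (reads the live values from /proc):
#   read -r low high < /proc/sys/net/ipv4/ip_local_port_range
#   ports_below_dynamic_range "$low" 8080 8443 && echo "no overlap"
```

With the values shown above (lower bound 32768, service ports 8080 and 8443) the check succeeds.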

Add a routing table for specific traffic intended to use the L3 LB as gateway. Table 248201 is an arbitrarily chosen value. I'm assuming the real server's interface is named eth0, please adapt:

ip route add default via 172.24.248.201 dev eth0 table 248201

Add the selectors: one per port range. It could be a single wide range (as long as there's no interaction with dynamic ports, it doesn't matter) or X rules for X distinct ports. For a single range:

ip rule add iif lo ipproto tcp sport 8000-8443 lookup 248201

where iif lo is the special syntax for locally initiated outgoing traffic and is not really about the lo interface.
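For the "X rules for X distinct ports" variant, a small loop can generate the commands. This is a sketch under assumptions: print_port_rules is a hypothetical helper name, and it only prints the commands so they can be reviewed before being run as root.

```shell
# Hypothetical helper: print one "ip rule add" command per service port.
# $1 = routing table number, remaining args = TCP source ports.
print_port_rules() {
    table=$1
    shift
    for port in "$@"; do
        echo "ip rule add iif lo ipproto tcp sport $port lookup $table"
    done
}

# Dry run for the two ports from the question; nothing is applied here.
print_port_rules 248201 8080 8443

# To apply after reviewing the output (requires root):
#   print_port_rules 248201 8080 8443 | sudo sh
```

Using exact ports instead of the 8000-8443 range keeps unrelated services that happen to listen in that range on the usual default gateway.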

Now all intended return traffic will be routed using the L3 LB as gateway. Any other traffic will use the usual default gateway if needed.
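To check the result, the rule, the table and the kernel's routing decision can be inspected. A sketch, assuming a reasonably recent iproute2 whose "ip route get" accepts ipproto and sport; show_return_path is a hypothetical wrapper, and 203.0.113.10 is a documentation address standing in for a client.

```shell
# Hypothetical helper: print the policy-routing state for the return path.
# $1 = routing table, $2 = sample client address, $3 = service source port.
show_return_path() {
    # List all policy rules; the sport selector should appear here.
    ip rule show
    # Show the dedicated table; it should hold the default route via the L3 LB.
    ip route show table "$1"
    # Ask the kernel which route a locally generated TCP reply would take.
    ip route get "$2" ipproto tcp sport "$3"
}

# Usage on the real server:
#   show_return_path 248201 203.0.113.10 8443
```

If the rule matches, the last command should report a route via 172.24.248.201; with a source port outside the selector it should report the usual default gateway instead.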


I don't think SRPF (Strict Reverse Path Forwarding) should be an issue if enabled, unless the server's usual default gateway uses another interface. If it does turn out to be an issue, it can be set to Loose mode instead.

To be done only if it doesn't work without it (and even then, eth0 may be enough instead of all below):

sysctl -w net.ipv4.conf.all.rp_filter=2
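On why eth0 alone may be enough: per the kernel's ip-sysctl documentation, the value used for source validation on an interface is the maximum of conf/all/rp_filter and conf/<interface>/rp_filter. A small sketch of that rule, with effective_rp_filter as a made-up helper name:

```shell
# Hypothetical helper: the kernel applies the larger of the "all" value
# and the per-interface value (0 = off, 1 = strict, 2 = loose).
# $1 = net.ipv4.conf.all.rp_filter, $2 = net.ipv4.conf.<iface>.rp_filter
effective_rp_filter() {
    if [ "$1" -gt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# Usage on the real server (eth0 is the assumed interface name):
#   effective_rp_filter "$(sysctl -n net.ipv4.conf.all.rp_filter)" \
#                       "$(sysctl -n net.ipv4.conf.eth0.rp_filter)"
```

So setting net.ipv4.conf.eth0.rp_filter=2 yields Loose mode on eth0 even when the all value stays at 1.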