VPN Endpoint – How to Pass Web Traffic to VPN Other Than OpenVPN

iproute2 linux nftables openvpn wireguard

I have a Linux server that is an OpenVPN endpoint, but also hosts a webserver.
When my client connects to the server address for the webserver, the packets travel outside the VPN. Rightly so, since the route to the server set by OpenVPN is more specific than the default route to enter the VPN.
However, I see that as a "leak".

Hence I tried to replicate the kind of setup WireGuard uses (WireGuard is great, but I need OpenVPN because the tunnel has to run over TCP).

I based my setup on the WireGuard page, as well as on other questions:
Prevent routing loop with FwMark in Wireguard
(Hats off for the lecture held there!)
Routing fwmark to VPN gateway using nftables mark

Despite this setup, Wireshark shows that the HTTP/HTTPS requests still go through the physical interface and not through the VPN interface tun0.
When I look at the packet marks with nft monitor trace, it seems the meta mark is properly set and only the appropriate packets (to/from port 1194) appear.

So I suspected that either:

  • the policy-based routing (PBR) rule does not work as expected, or
  • the packet marking does not happen early enough.

I tried changing the chain that marks outgoing packets to either of:

  • type route hook output
  • type filter hook output

but with no more luck.

These commands return the following:

- ip rule:
0:  from all lookup local
32764:  from all lookup main suppress_prefixlength 0
32765:  not from all fwmark 0x4 lookup vpn
32766:  from all lookup main
32767:  from all lookup default

- ip route show table vpn:
default dev tun0 scope link

- ip route:
default via 10.8.0.1 dev tun0 proto static metric 50 
default via 192.168.1.1 dev wlp4s0 proto dhcp src 192.168.1.10 metric 600 
10.8.0.0/24 dev tun0 proto kernel scope link src 10.8.0.2 metric 50 
END.POINT.IP.ADDRESS via 192.168.1.1 dev wlp4s0 proto static metric 50 
192.168.1.0/24 dev wlp4s0 proto kernel scope link src 192.168.1.10 metric 600 

- nft list ruleset:
table inet vpn {
    chain premangle {
        type filter hook prerouting priority mangle; policy accept;
        ip saddr END.POINT.IP.ADDRESS tcp sport 1194 meta nftrace set 1
        meta mark set ct mark
    }

    chain postmangle {
        type filter hook postrouting priority mangle; policy accept;
        ip daddr END.POINT.IP.ADDRESS tcp dport 1194 meta nftrace set 1
        ip daddr END.POINT.IP.ADDRESS tcp dport 1194 meta mark set 0x00000004
        meta mark 0x00000004 ct mark set meta mark
    }
}

- traceroute -n --fwmark=0x4 END.POINT.IP.ADDRESS
    shows it goes via the physical interface, outside the VPN (as expected)
    
- traceroute -n END.POINT.IP.ADDRESS
    shows it goes via the physical interface, outside the VPN (UNWANTED)

Thank you so much in advance!

Best Answer

If not using Strict Reverse Path Forwarding ("SRPF"), then no nftables should be used at all.

While routed (forwarded) traffic usually works fine when marks are handled in iptables or nftables, locally initiated traffic that is rerouted because of a mark (in a type route hook output chain) usually runs into problems: the reroute check which happens in the type route hook output chain won't magically change the local source IP address that was already chosen on the client socket, and that address is usually the wrong one. It thus usually requires a NAT band-aid (that would be needed in type nat hook output) and will probably make UDP handling even more difficult than it already is in a multi-homed environment. Using nftables for this should be avoided whenever possible.

Just like WireGuard, OpenVPN can set the firewall mark itself on its outgoing envelope traffic, and this happens before any route lookup is done for locally originated traffic:

--mark value

Mark encrypted packets being sent with value. The mark value can be matched in policy routing and packetfilter rules. This option is only supported in Linux and does nothing on other operating systems.

This works the same way as in WireGuard: the outgoing envelope packets, on the real interface, get the mark, probably by having the client use SO_MARK on its socket before connecting to the server:

SO_MARK (since Linux 2.6.25)

Set the mark for each packet sent through this socket (similar to the netfilter MARK target but socket-based). Changing the mark can be used for mark-based routing without netfilter or for packet filtering.

Of course, if neither rerouting nor direct use of policy routing with direct marking (SO_MARK or an equivalent method) is in place, chances are it won't work at all.


So delete all nftables rules:

nft delete table inet vpn

and instead add in the client configuration:

mark 4

Keep the routing rules and table (they should probably be integrated in VPN hooks):

- ip rule:
0:  from all lookup local
32764:  from all lookup main suppress_prefixlength 0
32765:  not from all fwmark 0x4 lookup vpn
32766:  from all lookup main
32767:  from all lookup default
- ip route show table vpn:
default dev tun0 scope link

Note: the parts at the end of this answer, only for the SRPF case, should be added before adding the routing table entry above to avoid temporary disruption.
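
For reference, rules and a table like the ones listed above could be created with commands along these lines. This is only a sketch under assumptions that are not in the question: it is run as root, tun0 is already up, and the table name vpn is declared in /etc/iproute2/rt_tables (for example with a line "100 vpn", where the number is arbitrary). The run helper and the DRY_RUN variable are hypothetical conveniences for previewing the commands.

```shell
#!/bin/sh
# Sketch only. Assumptions: root privileges, tun0 exists, and the table
# name "vpn" is declared in /etc/iproute2/rt_tables (e.g. "100 vpn").

# Hypothetical helper: with DRY_RUN=1 each command is printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_vpn_policy_routing() {
    # Consult main first, but ignore any default route (prefix length 0) in it.
    run ip rule add priority 32764 table main suppress_prefixlength 0
    # Packets NOT carrying fwmark 4 (everything but OpenVPN's envelope) use table vpn.
    run ip rule add priority 32765 not fwmark 4 table vpn
    # Table vpn holds a single default route into the tunnel.
    run ip route add default dev tun0 table vpn
}
```

Calling setup_vpn_policy_routing with DRY_RUN=1 set only prints the three commands. In the SRPF case the last command should come after the SRPF parts, as the note above says.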

Do not add a default route through the VPN, nor an explicit route to the remote endpoint. Either don't have the server push this configuration, or have the client ignore it with:

pull-filter ignore redirect-gateway

or:

route-nopull

so that these routes don't appear:

default via 10.8.0.1 dev tun0 proto static metric 50 
END.POINT.IP.ADDRESS via 192.168.1.1 dev wlp4s0 proto static metric 50 

but only this one gets added:

10.8.0.0/24 dev tun0 proto kernel scope link src 10.8.0.2 metric 50 

Instead, the policy routing rules will handle the default route by selecting the routing table vpn only when appropriate.


As explained in my answer to the first linked Q/A, most of the nftables ruleset for WireGuard's Table = auto + AllowedIPs = 0.0.0.0/0 is there to handle SRPF for reply traffic. There are a few cases:

  • rp_filter=0 everywhere

    including net.ipv4.conf.default.rp_filter and net.ipv4.conf.all.rp_filter. No RPF check: nothing to do. No nftables needed.

  • rp_filter=1

    Now envelope reply traffic can fail SRPF.

    • Either choose Loose RPF on the main interface:

      sysctl -w net.ipv4.conf.wlp4s0.rp_filter=2
      

      and be done with it. No nftables needed,

    • or implement all the logic to mark return envelope traffic just as is done in WireGuard

      • Have the fwmark also be used in reverse path lookup

        by enabling src_valid_mark on the main interface (it could have been set on all instead), thus allowing SRPF to pass:

        sysctl -w net.ipv4.conf.wlp4s0.src_valid_mark=1
        
      • Transpose (IPv4 only here) WireGuard's setup

        as seen in the linked Q/A, also accounting for the additional corner cases described at its end, so that reply traffic gets the fwmark:

        table ip vpn {
            chain preraw {
                type filter hook prerouting priority raw; policy accept;
                iifname != "tun0" ip daddr 10.8.0.2 fib saddr type != local drop
            }
        
            chain premangle {
                type filter hook prerouting priority mangle; policy accept;
                ct mark 4 meta mark set ct mark
            }
        
            chain postmangle {
                type filter hook postrouting priority mangle; policy accept;
                meta mark 4 ct mark set meta mark
            }
        }
        

        Chain preraw is optional and can be removed if needed. It protects against remote (LAN) attempts to access the internal VPN local address.

        The mark is created by OpenVPN on outgoing envelope packets, copied into the connmark at hook postrouting, and re-injected into reply envelope packets at hook prerouting. No endpoint address or port appears anywhere.

        No rerouting is done (no type route hook output nor type nat hook output present).

      Note: the sysctl command and the nftables ruleset above should both be executed before adding the default route in the routing table vpn. Otherwise connectivity will be lost temporarily, and the VPN TCP socket will recover only once both are in place.
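
The ordering described in the note above can be sketched as a small script. Assumptions not found in the answer: it is run as root, the nftables ruleset shown above has been saved to a file whose path is passed as the first argument (the file name used in the usage line is hypothetical), and the table name vpn is declared in /etc/iproute2/rt_tables. The run helper and DRY_RUN variable are again hypothetical conveniences.

```shell
#!/bin/sh
# Sketch of the ordering only. Assumptions: root privileges, wlp4s0 is the
# physical interface, $1 is a file containing the "table ip vpn" ruleset.

# Hypothetical helper: with DRY_RUN=1 each command is printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

apply_srpf_then_route() {
    ruleset_file=$1
    # 1. Let the reverse path lookup take the fwmark into account.
    run sysctl -w net.ipv4.conf.wlp4s0.src_valid_mark=1
    # 2. Load the ruleset that saves the mark to the connmark and restores it.
    run nft -f "$ruleset_file"
    # 3. Only now add the default route that sends unmarked traffic into the tunnel.
    run ip route add default dev tun0 table vpn
}
```

For example, setting DRY_RUN=1 and calling apply_srpf_then_route ./vpn.nft prints the three commands in the order they would run.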


The client system can now reach the server from within the tunnel.

Connectivity tests can be done like this:

socat -d -d TCP4:END.POINT.IP.ADDRESS:443 -

OP's traceroute should now reach END.POINT.IP.ADDRESS in a single hop: through the VPN.

At least on an amd64 (x86-64) architecture, the VPN can be bypassed (as root) with:

socat -d -d TCP4:END.POINT.IP.ADDRESS:443,setsockopt-listen=1:36:L4 -

where setsockopt-listen means: set the socket option before connecting (rather than before listening, in this case), 1:36 selects level SOL_SOCKET (1) and option SO_MARK (36 on this architecture), and the 4 in L4 is the same mark value as used by OpenVPN.
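
The routing decision can also be checked without sending any traffic, since ip route get accepts a mark argument. The following is a minimal sketch; the function names route_dev and check_routes are made up for illustration, and the mark defaults to the value 4 used above.

```shell
#!/bin/sh
# Sketch only: asks the kernel which route it would pick, with and without
# the fwmark. Function names are hypothetical helpers, not existing tools.

# Print the interface that follows "dev" in `ip route get` output (stdin).
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Show the outgoing interface for unmarked and for marked traffic.
check_routes() {
    endpoint=$1    # server address, i.e. the real END.POINT.IP.ADDRESS
    mark=${2:-4}   # same value as OpenVPN's "mark 4"
    echo "unmarked: $(ip route get "$endpoint" | route_dev)"
    echo "mark $mark: $(ip route get "$endpoint" mark "$mark" | route_dev)"
}
```

If the setup behaves as described in this answer, check_routes called with the server address should report tun0 for unmarked traffic and the physical interface (wlp4s0 here) for traffic carrying mark 4.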


Note: the specific case of the client querying, through the tunnel, a UDP service on the server at one of the server's public IP addresses can hit a common issue that is not really related to the VPN but to using UDP while being multi-homed. This requires the UDP service to be multi-homing aware: usually either by using multiple UDP sockets, bound once for each local address (so usually at least once per interface), or by using a single unbound UDP socket with IP_PKTINFO and additional handling code.