Linux – How to optimize throughput on a Linux NAT/router

linux networking performance-tuning router

I am trying to use an old Fujitsu RX300S2 with a quad-core Intel Xeon CPU @ 2.80 GHz as a Gigabit NAT router. It has a dual gigabit NIC on board, attached over PCI-X.

The router will also forward multicast traffic from the external interface to the internal network. Multicast routing is handled by the upstream Cisco router so the NAT router only has to "leak" multicast traffic between eth1 (upstream) and eth0 (internal).

This has been properly set up using igmpproxy, which essentially makes the L3 router act as an L2 bridge with respect to multicast traffic.

When testing the throughput, I have no problem receiving ~850-900Mbit multicast traffic on 200 groups/streams (approx 80'000 p/s) to a local process in userspace, which also analyses the 200 streams in realtime without packet loss. The local process maxes one core at 100%.

The streams consist of IPTV MPEG transport streams encapsulated in UDP/IP packets: 7 × 188 = 1316 bytes of payload.

But when testing the throughput in forwarding mode, i.e. multicast traffic enters eth1 and is routed at kernel level out eth0 into the local network, the NAT router cannot forward all the traffic it receives.

The external interface eth1 receives all multicast traffic (~900 Mbit), but the outgoing interface only transmits ~600 Mbit, and all streams suffer from packet loss according to the receiving test machine attached to eth0.

When analysing the load, ksoftirqd/3 maxes out at 100% CPU while the other three cores stay below 10%, so it seems that not all four cores participate in the load.

The /proc/interrupts also shows that eth0 and eth1 share irq16:

          CPU0   CPU1   CPU2        CPU3
     16:     0      0  92155  208280892   IO-APIC  16-fasteoi  uhci_hcd:usb2, uhci_hcd:usb5, eth1, eth0

As can be seen, CPU3 handles a disproportionate amount of interrupts.

I have read through various texts regarding CPU affinity and have tried pinning CPU cores to network queues. Unfortunately this Broadcom NIC (tg3 driver) does not support multiple queues, but it should still be possible to share the load between more cores on this quad-core system.
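For a single-queue NIC like this, Receive Packet Steering (RPS) can spread the softirq processing across cores in software. A minimal sketch, assuming a kernel built with RPS support and the interface names from this setup (run as root):

```shell
# Spread receive processing of each single-queue NIC across all 4 CPUs.
# Mask 0xf selects CPUs 0-3.
echo f > /sys/class/net/eth1/queues/rx-0/rps_cpus
echo f > /sys/class/net/eth0/queues/rx-0/rps_cpus

# Optionally pin the shared IRQ to one CPU so the hard-IRQ load is
# predictable; note that eth0 and eth1 share IRQ 16, so they move together.
echo 8 > /proc/irq/16/smp_affinity   # mask 0x8 = CPU3 only
```

With RPS enabled, the hard interrupt still lands on one core, but the per-packet protocol processing is distributed, which is exactly the part that ksoftirqd/3 is saturating here.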

Or is the PCI-X bus the bottleneck? If so, throughput should be reduced on both the incoming eth1 and the outgoing eth0. At first it seemed that packets were being lost somewhere between eth1 and eth0, but that is not the case: when packets are lost in the router, /sys/class/net/eth1/statistics/rx_missed_errors is incremented a lot (about 1000 p/s), so the drops happen on eth1's receive side.
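The per-second drop rate can be watched directly from the sysfs counter. A small diagnostic sketch for a live system (the interface name is this setup's; substitute your ingress interface):

```shell
# Print the per-second increase of rx_missed_errors on the ingress NIC.
IFACE=eth1
prev=$(cat /sys/class/net/$IFACE/statistics/rx_missed_errors)
while sleep 1; do
  cur=$(cat /sys/class/net/$IFACE/statistics/rx_missed_errors)
  echo "$(( cur - prev )) missed packets/s"
  prev=$cur
done
```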

When only 100 channels (approx 500 Mbit) are forwarded, there is no packet loss and ksoftirqd/3 consumes only about 5-6% CPU. But when 600 Mbit is forwarded, ksoftirqd/3 consumes 100%, so it seems some bottleneck is hit.

Is it out of the question that an old server like this can forward 1 Gbit of UDP traffic in one direction between two built-in NICs? Even though the packets are large (1316 bytes of payload), that is only a moderate 80-90 kp/s at 1 Gbit.
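As a sanity check, that packet rate follows from the on-wire frame size. Header sizes below are standard (UDP 8, IPv4 20, Ethernet 14 + 4 FCS); the 20 bytes of preamble and inter-frame gap assume plain Ethernet framing:

```shell
# Estimate the packet rate needed to saturate 1 Gbit/s with 1316-byte payloads.
awk 'BEGIN {
  wire = 1316 + 8 + 20 + 14 + 4 + 20   # bytes per packet on the wire
  pps  = 1e9 / (wire * 8)              # packets per second at 1 Gbit/s
  printf "%.0f packets/s\n", pps
}'
# prints "90449 packets/s"
```

So the ~80'000 p/s observed at ~900 Mbit is consistent with these stream parameters.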

Best Answer

We abandoned the server; by spec, the two on-board network interfaces were not supposed to drive full gigabit traffic. The second interface was intended to be used for management.

A standard desktop Core i5 with PCIe and two Intel i210 gigabit adapters was able to forward 1 Gbit of multicast UDP traffic with no problem.

It did, however, require tweaking the RX and TX ring buffers (ethtool -G) due to burstiness in the traffic. An x2 or x4 PCIe link would probably further reduce the risk of missed packets due to PCIe bus congestion.
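The ring-buffer tweak looked roughly like this (run as root; the 4096-descriptor value is illustrative, use whatever maximum `ethtool -g` reports for your adapter):

```shell
# Inspect current and maximum RX/TX ring sizes, then raise them to absorb bursts.
ethtool -g eth0
ethtool -G eth0 rx 4096 tx 4096
ethtool -g eth1
ethtool -G eth1 rx 4096 tx 4096
```

Larger rings give the driver more headroom during traffic bursts before rx_missed_errors starts climbing, at the cost of slightly higher latency.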
