Amazon EC2 VPC: NAT instance download speed performance drop

amazon-ec2, amazon-vpc, nat

I have a set of servers in an Amazon EC2 VPC with a private subnet and a public subnet. In the public subnet I have set up a NAT machine on a t2.micro instance that runs a NAT script on startup, which injects rules into iptables. Downloading files from the internet from a machine inside the private subnet works fine.
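The script itself is not reproduced here. As an assumption about what such a script typically does (not the actual script mentioned above), a NAT instance usually enables IP forwarding and adds a masquerade rule for traffic coming from the VPC. The sketch below only prints the commands so they can be reviewed before being run as root; the function name `nat_commands` and the CIDR `10.0.0.0/16` are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print the commands a minimal NAT setup would run.
# $1: VPC CIDR whose traffic should be masqueraded (placeholder value below)
# $2: outbound interface (assumed to be eth0 if omitted)
nat_commands() {
    vpc_cidr="$1"
    iface="${2:-eth0}"
    # Let the kernel forward packets between interfaces.
    echo "sysctl -q -w net.ipv4.ip_forward=1"
    # Rewrite the source address of outbound traffic from the VPC.
    echo "iptables -t nat -A POSTROUTING -o $iface -s $vpc_cidr -j MASQUERADE"
}

nat_commands 10.0.0.0/16 eth0
```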

However, when I compared downloading a file from an external high-bandwidth FTP server directly on my NAT machine with downloading the same file from a machine inside my private subnet (via the same NAT machine), there was a significant difference: around 10MB/s on the NAT machine vs. 1MB/s on the machine inside the private subnet.
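The comparison can be reproduced roughly as follows. This is a sketch using curl, whose `%{speed_download}` write-out variable reports the average download speed in bytes per second. The helper names are hypothetical, and the URL is whatever large file you test with.

```shell
#!/bin/sh
# Hypothetical helper: convert bytes per second to MB/s
# (1 MB = 1,000,000 bytes), printed with one decimal place.
to_mb_per_s() {
    awk -v bps="$1" 'BEGIN { printf "%.1f\n", bps / 1000000 }'
}

# Hypothetical helper: download $1, discard the data, and print
# the average speed in MB/s.
measure_download() {
    bps=$(curl -s -o /dev/null -w '%{speed_download}' "$1") || return 1
    to_mb_per_s "$bps"
}
```

Running `measure_download` with the same URL once on the NAT machine and once on a machine in the private subnet gives two directly comparable numbers.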

There is almost no CPU usage on the NAT machine, so the CPU cannot be the bottleneck. When I repeated the test with bigger NAT machines (m3.medium with "moderate network performance" and m3.xlarge with "high network performance"), I still could not get download speeds greater than 2.5MB/s.

Is this a general NAT problem that can (and should) be tuned? Where does the performance drop come from?

Update

After some testing, I could narrow this problem down. When I use Ubuntu 12.04 or Amazon Linux NAT machines from 2013, everything runs smoothly and I get the full download speed, even on the smallest t2.micro instances. It does not matter whether I use PV or HVM machines.
The problem seems to be kernel-related. These old machines have kernel version 3.4.x, whereas the newer Amazon Linux NAT machines and Ubuntu 14.XX have kernel version 3.14.XX. Is there any way to tune the newer machines?

Best Answer

We finally found the solution. You can fix the download speed by running on the NAT machine (as root):

ethtool -K eth0 sg off

This disables scatter-gather mode, which (as far as I understand it) stops offloading some network work to the network card. Disabling this option leads to higher CPU usage on the client, as the CPU now has to do that work itself. However, on a t2.micro machine we only saw around 5% CPU usage when downloading a DVD image.

Note that this setting won't survive a restart, so make sure to set it in rc.local, or at least before setting up NAT.
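To check that the change took effect, `ethtool -k` (lowercase) lists the current offload settings. Below is a minimal sketch, assuming the interface is eth0 as in the command above; the helper names `sg_state` and `disable_sg` are made up for illustration.

```shell
#!/bin/sh
IFACE="${IFACE:-eth0}"   # assumption: same interface as in the answer

# Hypothetical helper: read `ethtool -k` output on stdin and print the
# top-level scatter-gather state ("on" or "off").
sg_state() {
    awk -F': ' '$1 == "scatter-gather" { split($2, f, " "); print f[1] }'
}

# Hypothetical helper: disable scatter-gather, then print the resulting state.
# Must be run as root on the NAT machine; it is not called automatically here.
disable_sg() {
    ethtool -K "$IFACE" sg off &&
    ethtool -k "$IFACE" | sg_state
}
```

Calling `disable_sg` as root should print `off` if the driver accepted the change. For persistence, the plain `ethtool -K eth0 sg off` line is what goes into rc.local, placed before the NAT rules are set up.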