Very poor network performance in Hyper-V connections routed via a vEthernet NAT virtual switch interface

hyper-v · linux-networking · virtual-machines · windows-10

My host (a Win10 laptop) has good Wi-Fi connectivity and achieves MB/s speeds. The VM guest, on the other hand, sometimes stalls in the 5 KB/s range. How can I improve the performance of the guest's internet traffic?

  • Guest: Ubuntu 16.04 LTS VM inside Hyper-V. Its internet gateway is connected to a NAT vSwitch (vNAT) created with PowerShell.
  • Host: Win10 Pro v1803 (build 17134.345). It accesses the internet via a physical Wi-Fi card on a home network.

Network config

The guest does have internet connectivity through a virtual Ethernet NAT adapter on the host, but internet TCP connections are very slow (<5 KB/s). The host itself is on Wi-Fi and commonly achieves MB/s speeds.

The VM is on two different IPv4 networks. I've not configured IPv6, and it is the only VM running.

  1. Guest eth0 is at 192.168.20.200/24. This network carries local traffic between the Windows host and the VM. The maximum bandwidth on that link is high and matches expectations, and latency to the host is <1 ms, so no issues there. In Hyper-V, this interface is connected to a vSwitch shared with the host.

  2. Guest eth1 is at 192.168.30.200/24 for internet access. In Hyper-V, that interface is connected to a NAT vSwitch, which provides internet connectivity to the VM. The bandwidth on that link, from the VM's perspective, is very slow: ~5 KB/s sustained download speeds. Latency, on the other hand, is similar to the host's, in the 8-9 ms range for internet pings.

The NAT on the host was created with PowerShell, following the steps described in this Microsoft article.
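Those steps boil down to roughly the following (reconstructed from memory, so treat it as a sketch; the switch name vNAT, the gateway address 192.168.30.1 and the 192.168.30.0/24 prefix reflect my setup):

    # Create an internal vSwitch; the host gets a matching "vEthernet (vNAT)" adapter
    New-VMSwitch -SwitchName "vNAT" -SwitchType Internal

    # Assign the NAT gateway address to the host-side vEthernet adapter
    New-NetIPAddress -IPAddress 192.168.30.1 -PrefixLength 24 -InterfaceAlias "vEthernet (vNAT)"

    # Create the NAT object covering the guest subnet
    New-NetNat -Name "vNATnetwork" -InternalIPInterfaceAddressPrefix 192.168.30.0/24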

Tried to disable offloading — didn't work

I disabled offloading features on the guest's eth1 with the following commands, to see whether that would boost the maximum bandwidth:

for i in rx tx sg tso ufo gso gro lro rxvlan txvlan rxhash; do
    sudo ethtool --offload eth1 "$i" off
done

But it did not noticeably improve guest internet speeds.

Tried host reboots / re-enable VNAT interface — didn't work

I've tried rebooting the host and the VMs, and disabling and re-enabling the NAT interface, but neither perceptibly improved anything.
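The disable/re-enable step was just something along these lines (assuming the adapter name reported by Get-NetAdapter):

    # Bounce the host-side NAT adapter
    Disable-NetAdapter -Name "vEthernet (vNAT)" -Confirm:$false
    Enable-NetAdapter -Name "vEthernet (vNAT)"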

Other details:

  • In PowerShell, Get-NetAdapter reports the following basic info for the NAT vSwitch adapter (a fuller host-side cross-check is sketched after this list):

      vEthernet (vNAT)          Hyper-V Virtual Ethernet Adapter #6          19 Up           00-15-5D-02-E8-0F        10 Gbps
    
  • TCP connections quickly fall below 5 KB/s. apt-get can sit for hours on small packages:

      0% [3 InRelease 104 kB/109 kB 95%]                               2,889 B/s
      ...
      Fetched 323 kB in 1min 1s (5,280 B/s)
    
  • At home, speedtest-cli on the guest reports 1.91 Mbps down, 13.02 Mbps up. On the host it's 80 Mbps down, 20 Mbps up.

  • At the university, speedtest-cli on the guest reports 5.9 Mbps down, 8.41 Mbps up. On the host, 112.53 Mbps down, 154.13 Mbps up.

  • The guest kernel is Ubuntu 16.04 (xenial), Linux host 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux. This kernel appears to come with built-in Hyper-V drivers, if I am to believe this list.
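For completeness, the host-side NAT state can be cross-checked from PowerShell along these lines (adapter alias and subnet as in my setup):

    # The vEthernet adapter backing the NAT vSwitch
    Get-NetAdapter -Name "vEthernet (vNAT)"

    # The NAT object and its internal prefix (should list 192.168.30.0/24)
    Get-NetNat

    # The gateway address assigned to the host-side adapter
    Get-NetIPAddress -InterfaceAlias "vEthernet (vNAT)" -AddressFamily IPv4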

Best Answer

I've found a plausible explanation for the slowdown: I think it's related to having little virtual memory available, since the host is a RAM-constrained environment (a laptop).

I noticed that when I suffer from these low speeds, Task Manager shows the Vmmem process in torpor, consuming a significant portion of the available virtual memory (~GBs) with relatively high CPU usage. I suspect that network buffers get tangled up in this mess: they get swapped out, or they are simply dropped because they can't be queued anywhere in memory.

I'm not exactly sure of the best way to make Vmmem fix its state once it starts acting up. I tried freeing memory by closing all apps on the host and in the VM, and also tried shutting down all VMs, but it would keep spinning. As I mentioned in the question, I also tried host reboots, but a Win10 host reboot generally brings the VMs back up in their previous state, so presumably the bad state would come back on a fresh host boot too.
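To put rough numbers on the memory pressure, something like the following can be run on the host; this is only a sketch, assuming the VM's memory shows up under the vmmem process name as it does in my Task Manager:

    # Vmmem's resident memory footprint, in GB
    Get-Process -Name vmmem |
        Select-Object Name, Id, @{Name='WorkingSetGB'; Expression={[math]::Round($_.WorkingSet64 / 1GB, 2)}}

    # Free vs. total physical memory on the host (both values are in KB)
    Get-CimInstance Win32_OperatingSystem |
        Select-Object FreePhysicalMemory, TotalVisibleMemorySize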

My workaround

One way to resolve this seems to be to do a soft shutdown of the VM in Hyper-V, reboot the host, and then bring the VM back up. It probably doesn't fix the root cause (not enough swap? not enough memory allocated in Hyper-V?), but at least it restores decent network speeds in the VM.
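In PowerShell terms, the workaround amounts to something like this (run elevated; the VM name ubuntu-vm is a placeholder for whatever the VM is called in Hyper-V Manager):

    # Soft shutdown of the guest via the Hyper-V integration services
    Stop-VM -Name "ubuntu-vm"

    # Reboot the host
    Restart-Computer

    # Once the host is back up, start the VM again
    Start-VM -Name "ubuntu-vm"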

The network speeds I posted in the question are all over the place; I certainly wouldn't expect the download (RX) rate to be lower than the upload (TX) rate on my asymmetric home internet connection.