Debian – Linux optical 10GbE networking, how to diagnose performance problems

bonding, debian, lacp, linux-networking, sfp

I have a small cluster consisting of 3 servers. Each has two 10GbE SFP+ optical network cards. There are two separate 10GbE switches. On all servers, one NIC is connected to switch 1 and the second NIC is connected to switch 2 to provide fault tolerance.

The physical interfaces are bonded at the server level using LACP.
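For context, the bonds are configured roughly like this (a minimal /etc/network/interfaces sketch; the interface names and address are placeholders, not my exact values):

    auto bond0
    iface bond0 inet static
        # placeholder address and slave names
        address 10.0.0.11/24
        bond-slaves enp3s0f0 enp3s0f1
        bond-mode 802.3ad
        bond-miimon 100
        bond-lacp-rate fast
        bond-xmit-hash-policy layer3+4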

All servers can ping each other, but to one of them there is a small (~4%) packet loss (over the bonded interface, which looks suspicious to me).

When I check transfer rates with iperf3 between the two good servers, they show about 9.8 Gbit/s in both directions.

Those two good servers can also download from the problematic one at about 9.8 Gbit/s.

iperf3 shows something strange when run as a client on the problematic server: it starts at a few hundred megabits per second in the first interval, then the speed drops to 0 bit/s (while a concurrent ICMP ping still has a ~96% success rate). This happens only in one direction.
When the other servers download from this one, they get full speed.
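For reference, the tests were of this form (IP addresses are placeholders):

    # on the remote (good) server
    iperf3 -s

    # on the problematic server: client -> server direction
    iperf3 -c 10.0.0.12 -t 30

    # same path, reversed (server -> client), for comparison
    iperf3 -c 10.0.0.12 -t 30 -R

    # several parallel streams, so the LACP hash can spread flows
    iperf3 -c 10.0.0.12 -t 30 -P 4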

It's all running on the same hardware, and even the firmware versions are identical (Dell R620 servers, Mellanox ConnectX-3 EN NICs, Opton SFP+ modules, MikroTik CRS309-1G-8S switches). The OS is also the same: the latest stable Debian with all updates and exactly the same installed packages.

There is no firewall; all iptables rules are cleared on all servers.

On the problematic server I checked the interfaces; both NICs show UP and are running at 10 Gbit/s full duplex.

cat /proc/net/bonding/bond0 also shows both interfaces UP and active, with no physical link errors.
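The checks I ran are along these lines (interface names are placeholders):

    # bond status: per-slave state, aggregator IDs, link failure counters
    cat /proc/net/bonding/bond0

    # per-NIC hardware counters: look for CRC/symbol errors, drops, discards
    ethtool -S enp3s0f0 | grep -Ei 'err|drop|discard'
    ethtool -S enp3s0f1 | grep -Ei 'err|drop|discard'

    # negotiated speed/duplex and software counters on the bond
    ethtool enp3s0f0
    ip -s link show bond0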

I checked/replaced the SFP+ modules, used different fiber patch cords, and tried different switch ports; nothing changes. This one problematic server still gets poor download speed from the others and small packet loss (over the bonded interface!).

I also tried different patch cord combinations (both connected, first connected / second disconnected, first disconnected / second connected). Again, no change.
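An equivalent software-level test is to force traffic over one slave at a time and repeat the ping/iperf3 runs (interface names and addresses are placeholders):

    # test over the first slave only
    ip link set enp3s0f1 down
    ping -c 500 -i 0.05 10.0.0.12
    iperf3 -c 10.0.0.12 -t 30

    # swap: test over the second slave only
    ip link set enp3s0f1 up
    ip link set enp3s0f0 down
    ping -c 500 -i 0.05 10.0.0.12
    iperf3 -c 10.0.0.12 -t 30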

Any ideas how I can diagnose this better?

Best Answer

Unless the switches support stacking and support LACP across chassis, LACP cannot work that way. In fact, static LAG trunking won't work either.

Generally, link aggregation only works against a single switch on the other end (or a stack acting like one).

With simple L2 redundancy, you can only run the NICs as an active/passive pair with failover. Using multiple L3 links with appropriate load balancing, plus IP migration on failover or monitoring by an external load balancer, would also work in your scenario.
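For example, a minimal active/passive bond in /etc/network/interfaces would look roughly like this (interface names and address are placeholders):

    auto bond0
    iface bond0 inet static
        # placeholder address and slave names
        address 10.0.0.11/24
        bond-slaves enp3s0f0 enp3s0f1
        bond-mode active-backup
        bond-miimon 100
        bond-primary enp3s0f0

This keeps one link idle as a hot standby, so it works across two independent switches without any LACP or MLAG support on the switch side.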
