Switch – will heavy network traffic affect other connections on HP ProCurve V1810-48G

hp-procurve networking switch traffic

I have an HP ProCurve V1810-48G switch with a few servers running Citrix XenServer connected to it (everything in one rack). The switch is almost in its default configuration (no VLANs, no port mirroring/monitoring, no routers connected other than the gateway to the internet).

While copying a few hundred GByte of data from server27 to an NFS-mounted directory on server18, I noticed network-related error messages for other servers in the same rack, as if they could no longer send or receive traffic to each other or to their users. For example, external web monitoring services reported that a particular website was no longer reachable.

After cancelling the copy command everything was normal again.

Note that all of the mentioned servers are connected to the same switch and are located in the same IP network. I always thought that a connection between two servers on one switch will not affect any other server connected to the switch.

I then hooked the switch up to a Zabbix monitoring server. Here is the screenshot:
[Image: switch network traffic diagram]
You can see here that the outgoing traffic from server27 (bottom right) to server18 (second row left) seems to affect every single server in the rack.
I have also suspended the copy process once, and you can see the drop in network traffic for everybody else.

You can also see gaps in the diagrams where the Zabbix server (server21) was unable to connect to the switch.

Checking the network traffic on the server side (instead of the switch side) showed that there is only the normal traffic, not the huge volume shown on the diagrams above.
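For anyone wanting to repeat the server-side check, here is a minimal sketch of one way to do it, assuming the hosts are Linux-based and expose interface counters in `/proc/net/dev`. The function names are hypothetical, and this is not the tool that was used for the measurement above. It takes two counter snapshots and converts the difference into MByte/s per interface.

```python
import time


def parse_proc_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev contents."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # Column 0 is received bytes, column 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def rates_mbyte_per_s(before, after, seconds):
    """Return {interface: (rx_rate, tx_rate)} in MByte/s (1 MByte = 10**6 bytes).

    Counter wrap-around is not handled in this sketch.
    """
    rates = {}
    for name, (rx0, tx0) in before.items():
        if name not in after:
            continue  # interface disappeared between the two snapshots
        rx1, tx1 = after[name]
        rates[name] = ((rx1 - rx0) / seconds / 1e6, (tx1 - tx0) / seconds / 1e6)
    return rates


def sample_rates(interval=5.0, path="/proc/net/dev"):
    """Take two snapshots `interval` seconds apart and return the rates."""
    with open(path) as f:
        before = parse_proc_net_dev(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_net_dev(f.read())
    return rates_mbyte_per_s(before, after, interval)
```

Calling `sample_rates()` on each server during the copy and comparing the result with the per-port graphs from the switch would show whether the extra traffic actually reaches the hosts.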

Some commenters have pointed out that traffic between two ports in a switch should not affect any other port. This diagram, however, suggests that there is a subtle problem somewhere. Traffic of just 20 MByte/s affects connectivity to all other systems.
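To put the 20 MByte/s figure in perspective, here is a small sketch of the arithmetic. It assumes the ports negotiated 1 Gbit/s, which is an assumption and was not verified on the switch. The helper name is hypothetical.

```python
def link_utilization(mbyte_per_s, link_mbit_per_s=1000.0):
    """Return the fraction of the link used by a given throughput.

    link_mbit_per_s defaults to 1000 (1 Gbit/s), an assumed port speed.
    """
    mbit_per_s = mbyte_per_s * 8  # 1 byte = 8 bits
    return mbit_per_s / link_mbit_per_s


utilization = link_utilization(20)
print(f"{utilization:.0%} of the link")
```

Under that assumption the copy uses about 160 Mbit/s, roughly a sixth of a single port's capacity, so plain bandwidth saturation looks like an unlikely explanation.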

Best Answer

It's not too far from "have you tried turning it on and off", but have you updated the firmware? If you look at the release notes, there are a few ARP-related fixes.

https://h10145.www1.hp.com/downloads/SoftwareReleases.aspx?ProductNumber=J9660A

As far as gathering more information so people can help troubleshoot, do you have logs from the switch itself when this is happening?

Can you share what configuration changes, aside from management information, have been made from the default state?

Is either of the hosts in question running XenServer? Do you see the problem between any other hosts? Now that you have historical graphing, you should be able to check whether this happens elsewhere.
