How to Debug L1 Auto Negotiation Issues on Cisco Switches

cisco cisco-nexus-5k ethernet layer1 switch

One of our Linux hosts has recently started dropping its link speed. While checking the interface, I could see that it had negotiated down to 10 Mbps. Both the NIC and the switch are capable of 1G, and auto-negotiation is enabled on both sides. I would like to find the root cause of the issue. We are planning to perform the actions below.

  1. Perform the cable test to check/replace the Cat6 cable.
  2. Double check the switch configurations (link capability and L1
     advertisements coming out of the switch).

Given that tcpdump works from L2 and up, there is no point in taking a tcpdump capture. Do you guys have any other pointers? Perhaps syslog/dmesg?

I have already tried to capture the L1 advertisements but the command is not supported. The switch platform we are using is Cisco 5596.

We generally use the command show controllers <interface> | include Autoneg Lnk Ptr abty to decode the fast link pulse messages, but that is not supported on the 5596 switch. We have already confirmed that both the NIC (via ethtool) and the switch support 10M, 100M and 1G.
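On the host side, ethtool also prints what the link partner advertised when the NIC driver exposes it, which can partly stand in for decoding the fast link pulses on the switch. Below is a minimal Python sketch that compares the two lists. The sample text only imitates the layout of ethtool output and its values are made up, the function names are hypothetical, and some drivers do not report the link partner fields at all.

```python
import re

# Illustrative text in the layout printed by `ethtool <iface>`.
# The values are invented for this example, not taken from the host above.
SAMPLE = """\
Settings for eth0:
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Advertised auto-negotiation: Yes
    Link partner advertised link modes:  10baseT/Half 10baseT/Full
    Link partner advertised auto-negotiation: Yes
    Speed: 10Mb/s
    Duplex: Full
    Auto-negotiation: on
    Link detected: yes
"""

# Matches tokens such as 10baseT/Half or 1000baseT/Full.
MODE = re.compile(r"\b\d+base[\w-]+/(?:Half|Full)\b")


def link_modes(ethtool_text):
    """Map every '... link modes' field to the set of modes listed under it."""
    fields, current = {}, None
    for line in ethtool_text.splitlines():
        if ":" in line:
            # A colon starts a new field; lines without one continue the last field.
            key, _, line = line.partition(":")
            current = key.strip()
        if current and current.endswith("link modes"):
            fields.setdefault(current, set()).update(MODE.findall(line))
    return fields


def missing_from_partner(ethtool_text):
    """Modes the local NIC advertises that the link partner did not advertise."""
    modes = link_modes(ethtool_text)
    local = modes.get("Advertised link modes", set())
    partner = modes.get("Link partner advertised link modes", set())
    return sorted(local - partner)


print(missing_from_partner(SAMPLE))
```

If the partner list is missing 1000baseT/Full while the switch port is configured for 1G, that points at what is happening on the wire rather than at the host configuration.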

How do you approach and collect stats on issues at the L1/PHY layer (especially link speed dropping from 1G to 10M)?


Best Answer

Likely it's the cable.

While there were some issues with the Autonegotiation standard very early on, they were resolved decades ago, and equipment made from 2000 onward is not affected.

Generally, leave Autonegotiation enabled at all times. Most often, when you think you need to configure speed and duplex manually, it comes back to bite you some time later.

When checking for cable or port issues it's a good idea to check the logged events and port error counters first. Frequent relinking and sub-speed linking indicate cable damage (can also be caused by poor cable quality or installation, or exceeded reach). FCS errors, runts, giants and such indicate general transmission problems.
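For the logged events on the Linux host, a rough Python sketch like the following can count relinks and sub-speed link-ups in kernel log lines (for example the output of dmesg or journalctl -k). The message wording differs between NIC drivers, so the regular expressions and the sample lines are assumptions that need adjusting to what your driver actually logs, and the function name is hypothetical.

```python
import re

# Assumed message shapes; drivers word these differently.
UP = re.compile(r"link is up\D*?(\d+)\s*([mg])bps", re.IGNORECASE)
DOWN = re.compile(r"link is down", re.IGNORECASE)

# Illustrative lines only, not captured from a real host.
SAMPLE_LOG = [
    "[  100.1] igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX",
    "[ 5021.7] igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Down",
    "[ 5024.9] igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Up 10 Mbps Full Duplex, Flow Control: RX/TX",
    "[ 5100.0] igb 0000:01:00.0 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX",
]


def summarize_link_events(log_lines, iface, expected_mbps=1000):
    """Count link up/down events for iface and collect link-ups below the expected speed."""
    ups, downs, slow = 0, 0, []
    for line in log_lines:
        # Crude substring filter: "eth1" would also match "eth10".
        if iface not in line:
            continue
        match = UP.search(line)
        if match:
            factor = 1000 if match.group(2).lower() == "g" else 1
            mbps = int(match.group(1)) * factor
            ups += 1
            if mbps < expected_mbps:
                slow.append(mbps)
        elif DOWN.search(line):
            downs += 1
    return {"up": ups, "down": downs, "below_expected": slow}


print(summarize_link_events(SAMPLE_LOG, "eth0"))
```

Many down/up pairs in a short time, or any entry in below_expected, match the relinking and sub-speed pattern described above.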

I usually move production to another switch port (configured appropriately!) and other patch cable(s) to solve the issue quickly. Then I check out the original cable and test the port. 99.5% of the time it's the cable.

Generally, it's a good idea to implement some port monitoring so you'd get an early warning in case a port doesn't link as expected or accumulates errors.
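As one possible host-side take on such monitoring, here is a small Python sketch that reads the negotiated speed and a few receive error counters from sysfs and returns warnings. It assumes a Linux host; the counter selection, the function names and the 1000 Mb/s default are choices made for this example, and the Linux counters only roughly correspond to the FCS/runt/giant counters a switch shows. Monitoring on the switch itself (for example via SNMP) would be a separate setup.

```python
from pathlib import Path

# Receive error counters under /sys/class/net/<iface>/statistics/ to watch.
WATCHED = ("rx_crc_errors", "rx_length_errors", "rx_frame_errors")


def read_int(path):
    """Read an integer from a sysfs file; None if unreadable (e.g. link down)."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def check_port(iface, expected_mbps=1000, previous=None, sysfs="/sys/class/net"):
    """Return (warnings, counters); pass counters back in as `previous` on the next run."""
    base = Path(sysfs) / iface
    speed = read_int(base / "speed")  # in Mb/s; -1 or unreadable when unknown
    counters = {name: read_int(base / "statistics" / name) for name in WATCHED}

    warnings = []
    if speed is None or speed < expected_mbps:
        warnings.append(f"{iface}: speed {speed} Mb/s, expected {expected_mbps}")
    for name, value in counters.items():
        before = (previous or {}).get(name)
        if value is not None and before is not None and value > before:
            warnings.append(f"{iface}: {name} rose by {value - before}")
    return warnings, counters
```

Run from cron or a monitoring agent, the function would be called once per interval with the counters from the previous run, and any returned warning forwarded to whatever alerting you already have.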