Electronic – Interesting 100Mbit Ethernet failure

ethernet interface

I have a PCB design that supports both 10Mbit and 100Mbit Ethernet. I have produced 100 boards, 97% of which work perfectly at both speeds. On the remaining 3%, 10Mbit works all the time, but 100Mbit fails on some interfaces. For example, my PC has two Ethernet ports: one on the motherboard and one on a PCI-X Ethernet card. On the motherboard port all devices operate correctly. On the PCI-X port 10Mbit works and 100Mbit doesn't. The PCI-X card clearly works with the other 97% of the boards. If I put a switch between the PCI-X port and my device, 100Mbit works all the time.

I'm concerned that I have a problem with my Ethernet interface causing it to be marginal. Has anybody seen this before or could somebody offer some hints of where to look for the problem?

UPDATED

I'm using the Micrel KSZ8041 Ethernet PHY.

When I find a failing PCB, it also fails to work with other designs of mine that I know are in spec (thanks, this was a good suggestion).

Here are my schematics:

On my failing units I have:

  • Removed the ESD protection.
  • Added 1uF on the transformer center tap.
  • Adjusted all the filtering components.
  • Swapped most components between a working unit and a failing unit except the PHY and DSP.

My connector P300 is not an RJ45; it is a 2mm through-hole header, and a custom wire loom connects it to an RJ45 on a separate PCB. I have ruled out this header and the separate PCB by wiring directly to P300.

Not sure if it makes a difference, but this works with a longer Ethernet cable (5m is okay, 0.5m is not), which points to the matching components or layout. I've reviewed the grounding.

On the inside of the transformer, the tx and rx pairs are balanced and routed on an internal layer between two ground planes. Each pair is tightly coupled (9 mils separation), with good spacing between the pairs and to other tracks (at least 40 mils). On the outside of the transformer, the tx and rx pairs run over a separate chassis gnd plane.

I'm convinced the problem is between the PHY and the outside world (not excluding the PHY). When I probe the rx0 line between the PHY and the DSP, a working unit shows traffic only when I ping, but a failing unit shows a constant stream of data (presumably idle characters incorrectly received).

I'm currently investigating renting some compliance testing equipment.

Best Answer

This is not a complete answer, but it's a good first step.

I recommend checking the PHY's mode switches. It's probably set to auto-negotiate, meaning the PHY will fall back on a slower data rate if the faster one proves marginal. If you can force the faster data rate, it'll be much easier to debug the problem.
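
If the mode is set over MDIO rather than by strapping pins, forcing the speed could look something like the sketch below. It assumes only the standard IEEE 802.3 Clause 22 Basic Control register layout (register 0: bit 13 speed select, bit 12 auto-negotiation enable, bit 8 duplex). The helper name bmcr_force_100_full and the mdio_read/mdio_write calls in the comment are hypothetical placeholders for whatever the DSP's MDIO driver provides; check the KSZ8041 datasheet for its strapping options and any vendor-specific registers.

```c
#include <stdint.h>

/* IEEE 802.3 Clause 22 Basic Control register (register 0) bits. */
#define MII_BMCR            0x00u
#define BMCR_SPEED_100      (1u << 13)  /* 1 = 100 Mbit/s, 0 = 10 Mbit/s */
#define BMCR_AUTONEG_ENABLE (1u << 12)  /* 1 = auto-negotiation enabled  */
#define BMCR_FULL_DUPLEX    (1u << 8)   /* 1 = full duplex, 0 = half     */

/* Return a new Basic Control value that forces 100 Mbit/s full duplex
 * with auto-negotiation disabled, leaving all other bits unchanged. */
uint16_t bmcr_force_100_full(uint16_t bmcr)
{
    bmcr &= (uint16_t)~BMCR_AUTONEG_ENABLE;
    bmcr |= (uint16_t)(BMCR_SPEED_100 | BMCR_FULL_DUPLEX);
    return bmcr;
}

/* Hypothetical usage, assuming the MDIO driver provides
 * mdio_read(phy_addr, reg) and mdio_write(phy_addr, reg, value):
 *
 *   uint16_t v = mdio_read(phy_addr, MII_BMCR);
 *   mdio_write(phy_addr, MII_BMCR, bmcr_force_100_full(v));
 */
```

Note that a link partner left in auto-negotiation will typically fall back to half duplex when it sees a forced-mode peer, so it is worth forcing the PC's NIC to the same speed and duplex while debugging.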