What does "single-bit ECC errors were detected on the RAID controller" mean?

dell-perc dell-poweredge ecc memory raid

I have a Dell T7600 with a PERC H710P RAID controller and four attached 3TB drives. Over the past few months the RAID controller has intermittently reported errors on boot: "no boot device found", "adapter at baseport is not responding", and disks frequently reported as missing or failed.

I have since replaced the RAID controller, the 4 hard drives, and finally the system's motherboard.

After replacing the motherboard and rebooting a few times, I got this error:

Single bit ECC errors were detected on the RAID controller.
Please contact technical support to resolve this issue.

After rebooting about 20 more times, I haven't seen the ECC error again. The system otherwise seems OK, except that the disk fans will sometimes start blowing at full blast while the system is sitting completely idle, and they do not stop until I reboot.

Are the ECC errors in memory on the RAID controller itself? Or does the RAID controller map system memory, so that the ECC errors are really in system memory? Or are the ECC errors in the 1GB cache that resides on the RAID controller?

Best Answer

This error refers to the cache module on the controller, not to system memory. At this point, you probably need to replace either the RAM or the PERC controller itself. This should be standard warranty work.
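
For anyone wondering what a "single-bit" ECC error is in general: the memory stores a few extra check bits alongside the data, and on read those check bits pinpoint one flipped bit so it can be corrected and logged. Below is a minimal sketch of that idea using a textbook Hamming(7,4) code. It is only an illustration; the function names `encode` and `decode` are made up for this example, and the actual ECC scheme used by the PERC cache is not stated anywhere in this thread (ECC memory commonly uses wider codes over 64-bit words).

```python
def encode(data):
    """Encode 4 data bits into a 7-bit Hamming(7,4) codeword.

    Codeword layout by position 1..7: p1 p2 d1 p3 d2 d3 d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(codeword):
    """Return (data_bits, error_position).

    error_position is 0 when no error was seen, otherwise the 1-based
    position of the single bit that was flipped and corrected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    stored = encode([1, 0, 1, 1])
    stored[4] ^= 1  # simulate one bit flipping in memory
    data, position = decode(stored)
    print("recovered data:", data, "corrected bit position:", position)
```

The point is that a single-bit error is recoverable: the data read back is still correct, and the hardware reports the event so that failing memory can be replaced before errors become uncorrectable.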