/var/log/messages showing lots of CE Err=0x2000 even on unused banks (slots)

centos5 linux-kernel memory

I recently had to upgrade an old server (HP ML350G5) with used FBDIMM DDR2 RAM (I could not get new modules where I live). Since the reboot, /var/log/messages has been flooded with CE Err=0x2000 errors, but the system appears to be stable. I'm guessing ECC is taking care of things.

What doesn't make sense is that the log shows the same error on all banks, even though I am only using 2 slots (slots 0 and 3).

The installed RAM is 2x4GB of compatible Kingston modules, running on CentOS 5.5 32-bit. I was waiting for some available downtime to install a PAE kernel to take advantage of the full 8GB, but I was not expecting the errors.

Other posts suggest running memtest, but I wanted to share this and see if others have experienced similar errors pointing to unused RAM slots. Could the errors be related to having more RAM installed than a 32-bit kernel can address (without a 64-bit or PAE kernel running)?

Error log sample follows.

Aug 14 21:00:35 umm kernel: EDAC MC0: CE row 0, channel 0, label "": (Branch=0 DRAM-Bank=4 RDWR=Read RAS=12405 CAS=506, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))

Aug 14 21:00:36 umm kernel: EDAC MC0: CE row 0, channel 0, label "": (Branch=0 DRAM-Bank=2 RDWR=Read RAS=3505 CAS=4, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))

Aug 14 21:00:37 umm kernel: EDAC MC0: CE row 0, channel 0, label "": (Branch=0 DRAM-Bank=6 RDWR=Read RAS=12404 CAS=504, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))


All DRAM-Bank= values appear in the logs (0 through 7).
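
To see how the errors are distributed, the log entries can be tallied by their fields. The following is only a minimal sketch, assuming the exact line format shown in the sample above (other EDAC drivers word their messages differently). The names `CE_LINE`, `tally`, and `SAMPLE` are made up for illustration, and it needs Python 3, which a stock CentOS 5 install does not ship, so the log may have to be copied to another machine.

```python
import re
from collections import Counter

# Matches the EDAC "CE" lines in the format quoted above.
# Assumption: every line of interest follows this exact layout.
CE_LINE = re.compile(
    r"EDAC MC(?P<mc>\d+): CE row (?P<row>\d+), channel (?P<channel>\d+)"
    r".*?Branch=(?P<branch>\d+) DRAM-Bank=(?P<bank>\d+)"
    r".*?CE Err=(?P<err>0x[0-9a-fA-F]+)"
)


def tally(lines):
    """Count correctable errors per (mc, row, channel) and per DRAM-Bank value."""
    by_location = Counter()
    by_bank = Counter()
    for line in lines:
        match = CE_LINE.search(line)
        if match is None:
            continue  # not an EDAC CE line in the expected format
        location = (int(match["mc"]), int(match["row"]), int(match["channel"]))
        by_location[location] += 1
        by_bank[int(match["bank"])] += 1
    return by_location, by_bank


# The three lines from the log sample above.
SAMPLE = [
    'Aug 14 21:00:35 umm kernel: EDAC MC0: CE row 0, channel 0, label "": (Branch=0 DRAM-Bank=4 RDWR=Read RAS=12405 CAS=506, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))',
    'Aug 14 21:00:36 umm kernel: EDAC MC0: CE row 0, channel 0, label "": (Branch=0 DRAM-Bank=2 RDWR=Read RAS=3505 CAS=4, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))',
    'Aug 14 21:00:37 umm kernel: EDAC MC0: CE row 0, channel 0, label "": (Branch=0 DRAM-Bank=6 RDWR=Read RAS=12404 CAS=504, CE Err=0x2000 (Correctable Non-Mirrored Demand Data ECC))',
]

if __name__ == "__main__":
    by_location, by_bank = tally(SAMPLE)
    print("per (mc, row, channel):", dict(by_location))
    print("per DRAM-Bank:", dict(by_bank))
```

To run it against the real log, pass an open file instead of `SAMPLE`, for example `tally(open("/var/log/messages"))`. In the three sample lines the row and channel fields stay at 0 while only DRAM-Bank changes.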

Best Answer

If you have not yet installed the PAE kernel, which kernel are you currently running?

memtest may not identify the errors, because the memory is ECC memory.

Try running edac-util -v. If there are any uncorrectable errors, you will be able to identify the bad memory rows.
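
If edac-util is not installed, the per-row counters can also be read directly from sysfs. This is a rough sketch, not a replacement for edac-util. It assumes the EDAC driver is loaded and exposes the `mc*/csrow*/ce_count` and `ue_count` files under `/sys/devices/system/edac/mc`, and that Python 3 is available on the box. The function name `read_edac_counts` is made up for illustration.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {"mcN/csrowM": {"ce_count": int, "ue_count": int}} from EDAC sysfs.

    Assumption: the kernel exposes the csrow-based layout under `root`.
    Counter files that are missing are simply skipped.
    """
    counts = {}
    for csrow in sorted(Path(root).glob("mc*/csrow*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            counter_file = csrow / name
            if counter_file.is_file():
                entry[name] = int(counter_file.read_text().strip())
        if entry:
            counts[f"{csrow.parent.name}/{csrow.name}"] = entry
    return counts


if __name__ == "__main__":
    for location, entry in read_edac_counts().items():
        print(location, entry)
```

A row whose `ue_count` is non-zero has had uncorrectable errors. A row with only a growing `ce_count` is still being corrected by ECC.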