What happens to a random bit error in the cache on an Intel CPU?

cache corruption ecc intel memory

I have a system with ECC RAM and a Xeon E3 CPU.

My understanding is that ECC circuits on the RAM will detect corruption from random bit errors in the RAM chips.

But what happens to random bit errors in the CPU's own storage, e.g. the cache and/or registers?

Is there not a coverage hole where good data from RAM is cached in the CPU, the cached copy is then corrupted, and the CPU later uses it (without checking it against the ECC RAM)?

I cannot find any information about cache ECC protection on Intel's website, except for the top-of-the-line Xeon E7s.

Does that mean any Intel CPU below the Xeon E7 line is vulnerable to memory corruption whether or not you use ECC RAM?

Best Answer

Everything you wrote is true, but you are not considering the practical reasons for ECC. I recommend reading the article below. In practical applications, systems use memory correction to actually increase performance, because some hardware and software can detect inconsistencies in data and request that the transaction be reprocessed. Furthermore, it is highly unlikely that an ordinary single-bit error would affect your work. In fact, it is more likely that overheating of some chip in your computer would let an electron jump across an insulator (one reason why overclocking causes computers to fail). Memory correction is very important in large-scale computations that do not possess other means of correction, such as weather modeling or scientific computing: anywhere corrupt data would be reused billions of times, or where long floating-point numbers are processed. For that reason, as far as I remember, all AMD Piledriver and Steamroller cores, which can combine individual cores to process 256-bit floating-point numbers, use ECC in the CPU's internal memory.
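To make "correcting a single-bit error" concrete, here is a minimal sketch of a Hamming(7,4) code in Python. It is only an illustration of the principle: ECC DIMMs typically use a wider single-error-correct, double-error-detect code (72 bits stored per 64 data bits) implemented in hardware, and I am not claiming this is what any particular Intel or AMD cache does. The function names are my own, made up for the example.

```python
def hamming74_encode(data):
    """Encode 4 data bits (list of 0/1) into a 7-bit codeword.

    Codeword positions are numbered 1..7; parity bits sit at 1, 2 and 4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); error_position is 0 if none found.

    A single flipped bit is located by the syndrome and flipped back.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1  # correct the error in place
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1  # simulate a random bit flip at position 5
    data, position = hamming74_decode(word)
    print(data, position)
```

Any single flipped bit in the 7-bit word is corrected; two flipped bits would be miscorrected by this small code, which is why real memory ECC adds an extra overall parity bit so that double errors are at least detected.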

Some reading here