Media Error Count – Understand Media Error Count in LSI MegaRaid

lsi megaraid raid windows

So I got this server and it has a visual alert on one of the drives.
After further investigation with storcli, I noticed that it has a few media error counts.

I did some research on this, and apparently these errors indicate uncorrectable damage to a disk sector, so the sector is remapped in order to stop using it.
But apparently almost all drives have these sorts of errors, as they are also caused by manufacturing imperfections.

So I have several questions:

  1. If this is normal, why is a visual alert shown for just a few errors?
  2. How can I view more detail about these errors using storcli?
  3. Can someone explain to me what the other error count and shield count are?

If I'm misunderstanding something please explain it to me.

Thank you

Best Answer

I confirm that a media error count means a physical sector has gone bad; this is generally discovered during an application read or an array scrub.

In this context, "a sector gone bad" means that the physical disk was unable to read the original sector and returned an error to the RAID controller. The fact that the HDD itself can mark the sector as "to-be-remapped" is transparent to the RAID controller, which will simply try to re-write the same sector using data from the other mirror leg or from parity. If this re-write fails (meaning the drive itself has no spare sectors available), the disk is generally marked as failed.
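
To make that sequence concrete, here is a minimal toy model in Python. It is only a sketch of the behaviour described above, not how any real controller firmware is written: `ToyDisk`, `mirrored_read` and the `counters` dict are hypothetical names invented for illustration, and a two-disk mirror is assumed.

```python
class ToyDisk:
    """Hypothetical in-memory disk: sectors plus a small pool of spare sectors."""

    def __init__(self, sectors, spares=1):
        self.sectors = list(sectors)
        self.bad = set()      # sector numbers that currently fail on read
        self.spares = spares  # spare sectors the drive can still remap to
        self.failed = False   # set by the "controller" when it gives up on the disk

    def read(self, lba):
        if lba in self.bad:
            raise IOError(f"unrecoverable read error at sector {lba}")
        return self.sectors[lba]

    def write(self, lba, data):
        if lba in self.bad:
            if self.spares == 0:
                raise IOError(f"no spare sector left to remap sector {lba}")
            # The drive remaps internally; the controller never sees this step.
            self.spares -= 1
            self.bad.discard(lba)
        self.sectors[lba] = data


def mirrored_read(primary, mirror, lba, counters):
    """Read a sector; on a media error, repair it from the mirror leg."""
    try:
        return primary.read(lba)
    except IOError:
        counters["media_errors"] += 1   # this is what gets reported
        data = mirror.read(lba)         # fetch a good copy from the other leg
        try:
            primary.write(lba, data)    # re-write the same sector
        except IOError:
            primary.failed = True       # re-write failed: mark the disk as failed
        return data
```

In this model a single bad sector bumps the media error counter but leaves the disk in service, because the re-write succeeds. The disk is only marked failed once the spare pool is exhausted.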

If sporadic, such read errors are not too alarming, and in fact most RAID controllers mark a disk as bad only after some error threshold is crossed. In other words, 1 media error will simply be reported, while 100+ errors will definitely also mark the disk as bad (or put it in a "predicted to fail soon" state).
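
As for viewing the counters with storcli: the per-drive detail view (for example `storcli /c0/eall/sall show all`) prints lines such as `Media Error Count = N` for each drive. The Python sketch below parses text of that shape and applies a simple threshold. The sample text is an assumption modelled on typical output, and the exact layout can differ between storcli versions. `parse_drive_counters`, `classify` and the `warn_at`/`fail_at` values are hypothetical; the real thresholds are set by the controller firmware.

```python
import re

# Counter lines as they typically appear in the per-drive "State" section.
COUNTER_RE = re.compile(
    r"^(Shield Counter|Media Error Count|Other Error Count|"
    r"Predictive Failure Count)\s*=\s*(\d+)$"
)
# Section header such as "Drive /c0/e252/s0 State :" (enclosure part optional).
DRIVE_RE = re.compile(r"^Drive (/c\d+(?:/e\d+)?/s\d+) State")


def parse_drive_counters(text):
    """Return {drive_path: {counter_name: value}} from storcli-like text."""
    drives = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = DRIVE_RE.match(line)
        if header:
            current = drives.setdefault(header.group(1), {})
            continue
        counter = COUNTER_RE.match(line)
        if counter and current is not None:
            current[counter.group(1)] = int(counter.group(2))
    return drives


def classify(media_errors, warn_at=1, fail_at=100):
    """Illustrative thresholds only; not taken from any firmware."""
    if media_errors >= fail_at:
        return "failing"
    if media_errors >= warn_at:
        return "watch"
    return "ok"


# Assumed sample, shaped like the drive state section of the output.
SAMPLE = """
Drive /c0/e252/s0 State :
Shield Counter = 0
Media Error Count = 3
Other Error Count = 0
Predictive Failure Count = 0

Drive /c0/e252/s1 State :
Shield Counter = 0
Media Error Count = 0
Other Error Count = 0
Predictive Failure Count = 0
"""

if __name__ == "__main__":
    for path, counters in parse_drive_counters(SAMPLE).items():
        print(path, counters, classify(counters.get("Media Error Count", 0)))
```

To run it against a real controller, you would save the command's output to a file and pass the file contents to `parse_drive_counters` in place of `SAMPLE`.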