Which drive in a RAID array has bad sectors?

bad-blocks hardware-raid lsi megaraid raid

I have 4 physical drives in a single virtual drive using an LSI MegaRAID SAS controller. At least one of the drives seems to have bad sectors because:

  • I/O errors occur when attempting to back up some files
  • running badblocks reports some bad sectors

I'm hoping that resolving the issue will be as simple as swapping out the problematic disk(s) and rebuilding the RAID array. I thought the LSI MegaRAID WebBIOS would let me identify the problematic disk(s), but I can't find any option to check for bad sectors.

Below is a screenshot of the WebBIOS:
[image: WebBIOS screenshot]

Could anyone offer any advice as to how the problematic disk(s) can be identified?

Best Answer

Smartmontools has extensions that let it poll a drive for SMART data through an LSI RAID controller (as well as some others). Normally you can't do this, because the RAID abstraction hides the individual drives from the operating system.

Smartmontools might not be installed on your machine. However, it is available in the main repositories of most distributions, and there is even a Windows version at: http://sourceforge.net/projects/smartmontools/files/

It can be used to poll a drive behind an LSI MegaRAID controller like so:

smartctl -a -d megaraid,N /dev/sdX

Here, -a prints all SMART information for the disk and -d sets the device type (megaraid in your case), followed by N, the number of the drive on that controller. To query the drive in slot 0, use 0. To poll all four of your drives, run the command four times with N set to 0 through 3. N normally matches the slot number, but strictly it is the controller's device ID for the disk, so if a number returns nothing, check the IDs the controller itself reports. sdX is the virtual drive that the RAID controller presents to the operating system; yours is probably sda.
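Running the command once per drive can be wrapped in a small loop. This is only a sketch: the function name poll_megaraid_drives is made up for illustration, and the drive numbers 0 to 3 and /dev/sda are the assumptions from the paragraph above, not values read from your controller.

```shell
# Print the full SMART report for each listed drive behind the controller.
# Arguments: the RAID block device, then one or more drive numbers (N).
poll_megaraid_drives() {
    dev="$1"
    shift
    for n in "$@"; do
        echo "=== megaraid,$n on $dev ==="
        smartctl -a -d "megaraid,$n" "$dev"
    done
}
```

Run as root, for example: poll_megaraid_drives /dev/sda 0 1 2 3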

Each drive produces a long report. Look for either a general SMART health failure (which you might not find, since your controller isn't rejecting any drives) or non-zero counts of "offline uncorrectable" or "pending" sectors. Any drive with a value above 0 in either field is bad. Show those fields no mercy: it takes a LOT of failed reads to increment either value by one.
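To avoid reading the whole report by eye, the two counters can be filtered out of it. The sketch below assumes the drives report an ATA-style attribute table, where the fields are named Current_Pending_Sector and Offline_Uncorrectable and the raw count is the last column; SAS drives report errors in a different format and would need a different filter. The function name check_bad_sector_counters is made up for illustration.

```shell
# Read smartctl output on stdin, print the two sector counters as NAME=RAW,
# and return 1 if either raw value is above 0 (0 otherwise).
check_bad_sector_counters() {
    awk '
        BEGIN { bad = 0 }
        /Current_Pending_Sector|Offline_Uncorrectable/ {
            print $2 "=" $NF            # attribute name and raw value
            if ($NF + 0 > 0) bad = 1    # any non-zero count marks the drive
        }
        END { exit bad }
    '
}
```

For example: smartctl -a -d megaraid,0 /dev/sda | check_bad_sector_counters || echo "drive 0 looks bad"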

You can also run a short or long self-test in the same way (N and sdX have the same meaning as above):

smartctl -t [long|short] -d megaraid,N /dev/sdX
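The test runs in the background on the drive, so its result has to be read back afterwards; smartctl's -l selftest option prints the drive's self-test log. A rough sketch of both steps follows, with function names invented for illustration and the drive numbers and device left as arguments:

```shell
# Start a self-test ("short" or "long") on each listed drive.
start_selftests() {
    kind="$1"
    dev="$2"
    shift 2
    for n in "$@"; do
        smartctl -t "$kind" -d "megaraid,$n" "$dev"
    done
}

# Print the self-test log of each listed drive once the tests have finished.
show_selftest_logs() {
    dev="$1"
    shift
    for n in "$@"; do
        echo "=== megaraid,$n on $dev ==="
        smartctl -l selftest -d "megaraid,$n" "$dev"
    done
}
```

For example, start_selftests short /dev/sda 0 1 2 3, wait for the duration smartctl reports, then show_selftest_logs /dev/sda 0 1 2 3.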
