Perc H700 disk rebuild keeps going offline

dell dell-perc dell-poweredge raid raid6

I have a Dell PowerEdge R710 with a PERC H700 RAID controller. It's in a RAID 6 configuration, currently with 2 failed disks. It does boot to CentOS in its current degraded state.

I would like to replace the disks, but whenever I add a new one and start a rebuild, it runs for a few minutes and then the controller puts the new disk offline. This happens in both slots on the controller and with two brand new disks (same model as the existing ones).

I think the battery on the controller needs replacing, but I wouldn't think this would affect the ability to run a rebuild.

Any ideas on how to get a rebuild to complete?

Best Answer

You're most likely encountering read errors on one of the remaining online drives. If you have Dell's OpenManage Server Administrator installed, you can use it to export a controller log, which should contain details about any errors that occurred during the rebuild process.

Exporting a Controller Log in OpenManage Server Administrator:

  1. Expand the Storage tree in the left pane
  2. Select the H700
  3. Select the Information/Configuration sub-tab at the top
  4. Select "Export Log" from the Available Tasks drop-down menu
  5. Click Execute. The log name and location should be shown before executing 
  6. Click Export Log to complete the export process and save the file

If you don't have a GUI but do have OpenManage installed, you can also use this CLI syntax to export a controller log:

omconfig storage controller action=exportlog controller=id

where id is the controller ID number as reported by the omreport storage controller command.
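If you'd rather not look the ID up by hand, the two commands can be chained. This is only a rough sketch: it assumes omreport prints the controller ID on a line of the form "ID : 0", which you should check against your own output, and first_controller_id is a helper name made up for this example.

```shell
#!/bin/sh
# Sketch only. Assumption: `omreport storage controller` prints a line like
# "ID    : 0" for each controller. Verify against your own output first.

# Hypothetical helper: read omreport output on stdin, print the first controller ID.
first_controller_id() {
    awk -F: '/^ID[[:space:]]*:/ { gsub(/[[:space:]]/, "", $2); print $2; exit }'
}

# Only run the export when the OpenManage CLI tools are actually installed.
if command -v omreport >/dev/null 2>&1 && command -v omconfig >/dev/null 2>&1; then
    id=$(omreport storage controller | first_controller_id)
    if [ -n "$id" ]; then
        omconfig storage controller action=exportlog controller="$id"
    else
        echo "No controller ID found in omreport output" >&2
    fi
fi
```

With more than one controller in the box, this picks whichever one is listed first, so pass the ID explicitly in that case.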

The log file is exported to either /var/log or <install-directory>/sm, and is named lsi_<mmdd>.log, where <mmdd> is the month and day of the export.
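Once the export has run, something like the following can pull out the lines worth reading first. Treat it as a starting point: it only looks in /var/log, newest_lsi_log is a helper name invented here, and the search terms are guesses, since the exact wording the controller uses for read errors may differ.

```shell
#!/bin/sh
# Sketch only. Hypothetical helper: print the most recently modified
# lsi_*.log in the directory given as $1, or an empty line if there is none.
newest_lsi_log() {
    newest=
    for f in "$1"/lsi_*.log; do
        [ -e "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    printf '%s\n' "$newest"
}

log=$(newest_lsi_log /var/log)
if [ -n "$log" ]; then
    # Assumption: the search terms below are guesses, not known log strings.
    grep -inE 'error|fail' "$log" || echo "No matching lines in $log"
fi
```

If nothing turns up, read through the entries around the time the rebuild stopped rather than relying on the keyword search.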