Best practice on hot-swapping disk

hotswap, hp-proliant, hp-smart-array, raid

I tried to learn more about the product before posting a question, but I couldn't find proper answers, so I'm posting here.

We had an issue with the RAID due to a disk failure. I replaced the disk, and the Array Configuration Utility (ACU) showed that it was OK. However, it later showed that slot 11 was rebuilding, and so was the overall array. I left the server alone for two days to rebuild, but at the end of the process it simply said "interim recovery failed". Given that, I've ordered another two disks so that I can replace it.

That said, I have a couple of questions:

  • One of the disks I have is a raw Seagate SCSI disk with the same model number as the disks currently in the server. Will it work if I mount it in a disk cage (taken from a previously removed disk) and hot-swap it in?
  • When I check the array through the offline ACU utility, I see an option to erase each drive. Can I try erasing the drive in slot 11, removing it, and replacing it, or is there any other way to rebuild?
  • Meanwhile, I still have the drive that I replaced in slot 1, as mentioned above. Can I at least replace that one? The ADU report is attached for reference.

I've set up iLO management, but I don't see anything related to the RAID controller; it only shows the basics. What can I do about that?

Our ESXi image was not obtained from HP, so I can't see the array from the CLI, and it doesn't show up in vSphere either. What could cause this "interim recovery failed" error? I'm totally lost, and it's pretty difficult to find a relevant article on the HP site. The part number for the disks in the server is 461289-001 (1TB SAS disk).

Please advise on this.


EDIT

HP Model: HP ProLiant DL180 G6 // RAID 6 // P410 Smart Array

Where I see the interim failure: I booted the server into HP's ACU to check the RAID in more detail. It showed that slot 11 was rebuilding, so I left the server alone for two days. Once it reached 100%, it suddenly showed the disk as failed, with an interim recovery failure on that disk's array. Refer to this post for the attached ADU report, as I don't see any option to attach it here.

ESXi Version: 5.0.0 // Build 469512


Best Answer

This is from your Array Diagnostics Utility report...

[Image: excerpt from the ADU report]

You have one disk being rebuilt. Another disk is in prefailure mode. You can attempt to rebuild again by just reinserting disk 11 and letting it try once more. Are you absolutely sure this is RAID 6 (ADG)? You didn't mention "ADG", and I wanted to clarify.
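The reason the RAID level matters here can be sketched roughly as follows. This is a simplified illustration, not anything from HP's tooling: the `FAULT_TOLERANCE` table and the `can_survive` helper are hypothetical names, and the model only counts unavailable member disks (it ignores things like read errors hit during a rebuild).

```python
# Simplified model (an assumption for illustration, not HP tooling):
# the number of member disks each RAID level can lose without losing data.
FAULT_TOLERANCE = {
    "RAID 0": 0,
    "RAID 5": 1,
    "RAID 6 (ADG)": 2,
}


def can_survive(level, unavailable):
    """Return True if the array still holds its data with this many disks unavailable."""
    return unavailable <= FAULT_TOLERANCE[level]


# One disk is rebuilding (not yet usable) and the prefailure disk then drops out:
# that is two unavailable disks at the same time.
print(can_survive("RAID 6 (ADG)", 2))  # True
print(can_survive("RAID 5", 2))        # False
```

If the logical drive is really RAID 5 rather than RAID 6 (ADG), a rebuild in progress plus a second disk in prefailure leaves no margin at all, which is why I'm asking.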