CentOS – How to recover from a RAID 5 failure of 2 disks with tw_cli

I have a hardware RAID 5 array of 12 disks. Two of them died and the data is no longer accessible.
I was told that even though two disks died, some of the data might be recoverable.
My hosting provider replaced the bad disks with new ones (at first they replaced a functioning disk with a new one, but now all disks are in place).

I'm using tw_cli and I guess I now need to "rebuild" the array, but I'm afraid of making mistakes.
I couldn't find any step-by-step guide for this case with tw_cli.

Can you please advise what should be done now, and what the exact tw_cli commands are?

# tw_cli /c0/u0 show

Unit     UnitType  Status         %Cmpl  Port  Stripe  Size(GB)  Blocks
-----------------------------------------------------------------------
u0       RAID-5    INOPERABLE     -      -     256K    20489     42968510464 
u0-0     DISK      DEGRADED       -      -     -       1862.63   3906228224  
u0-1     DISK      OK             -      p1    -       1862.63   3906228224  
u0-2     DISK      OK             -      p2    -       1862.63   3906228224  
u0-3     DISK      OK             -      p3    -       1862.63   3906228224  
u0-4     DISK      OK             -      p4    -       1862.63   3906228224  
u0-5     DISK      OK             -      p5    -       1862.63   3906228224  
u0-6     DISK      OK             -      p6    -       1862.63   3906228224  
u0-7     DISK      OK             -      p7    -       1862.63   3906228224  
u0-8     DISK      OK             -      p8    -       1862.63   3906228224  
u0-9     DISK      OK             -      p9    -       1862.63   3906228224  
u0-10    DISK      OK             -      p10   -       1862.63   3906228224  
u0-11    DISK      DEGRADED       -      -     -       1862.63   3906228224

OS: CentOS

UPDATE:
As @Overmind suggested, I reinserted the disks. At first it said rebuilding; now it says inoperable, but 11 out of 12 disks are OK!

I replaced the bad disk (p0) with a new one and tried to rebuild, but it failed because the device is busy. Any idea what I should do?

tw_cli /c0/u0 start rebuild disk=0
Sending rebuild start request to /c0/u0 on 1 disk(s) [0] ... Failed.

(0x0B:0x0033): Unit busy

I tried to unmount the filesystem on this RAID array, but it didn't help. I read in the manual that I should mark the disk as a spare, so I did, but I'm afraid I got a bad result. I really need your help here.

tw_cli /c0 add type=spare disk=0
Creating new unit on controller /c0 ...  Done. The new unit is /c0/u1.

# tw_cli /c0 show

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-5    INOPERABLE     -      256K    20489     OFF    ON       OFF      
u1    SPARE     OK             -      -       1863.01   -      OFF      -        

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u1     1.82 TB     3907029168    9WM0XF4D      
p1     OK               u0     1.82 TB     3907029168    53SB7TLAS     
p2     OK               u0     1.82 TB     3907029168    53SDBSXAS     
p3     OK               u0     1.82 TB     3907029168    53SB7UJAS     
p4     OK               u0     1.82 TB     3907029168    53SB7SGAS     
p5     OK               u0     1.82 TB     3907029168    53SB8BPAS     
p6     OK               u0     1.82 TB     3907029168    53VDW0PGS     
p7     OK               u0     1.82 TB     3907029168    53SDAHTAS     
p8     OK               u0     1.82 TB     3907029168    53SB7U3AS     
p9     OK               u0     1.82 TB     3907029168    53SB7UBAS     
p10    OK               u0     1.82 TB     3907029168    53VE7D5AS     
p11    OK               u0     1.82 TB     3907029168    43N2SNDGS     

Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
---------------------------------------------------------------------------
bbu   On           Yes       OK        OK       OK       0      xx-xxx-xxxx

Best Answer

3Ware controllers are nice - no doubt about that. But as noted above, RAID 5 with many disks is a real problem. If the disks are completely dead and gone, I would say you have no way of recovering, short of using a data recovery tool like this:

https://www.runtime.org/raid.htm

I have tried recovering data for customers (a long time ago) and it is at best ridiculously time-consuming. Even with the proper tools, with two disks gone, some data is irrecoverably lost. If just one of the two disks can be somewhat recovered, you might be in luck. That would allow reconstruction and, as far as I recall, the 3Ware stuff is reasonably good at it.
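To illustrate why one missing disk can be reconstructed but two cannot, here is a minimal Python sketch of single-parity XOR, the principle RAID 5 is built on. It is a toy model only: the 4-block stripe, the block contents and the helper name xor_blocks are made up for the example, and it says nothing about how the 3Ware firmware lays out or rotates parity on disk.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe of a hypothetical 4-disk RAID 5: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]

# One member lost: XOR of all survivors reproduces the missing block.
lost = 1
survivors = [blk for i, blk in enumerate(stripe) if i != lost]
rebuilt = xor_blocks(survivors)
print("one disk lost, block rebuilt:", rebuilt == stripe[lost])

# Two members lost: the survivors only yield the XOR of the two missing
# blocks, which is not enough to tell the two blocks apart.
lost_a, lost_b = 0, 1
survivors = [blk for i, blk in enumerate(stripe) if i not in (lost_a, lost_b)]
combined = xor_blocks(survivors)
print("two disks lost, only their XOR is known:",
      combined == xor_blocks([stripe[lost_a], stripe[lost_b]]))
```

This is why getting even one of the two failed disks readable again changes the picture: with 11 of 12 members available, every stripe has exactly one unknown block again.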

All things considered, I hate to agree with the previous posters, but with two disks gone (and with that good disk having been replaced too), I would say your chances are pretty slim.

Given the relatively low disk prices these days (not including SSDs), go for at least RAID 6 with a hot spare next time. The best option is RAID 10 with hot spare(s), as it can survive the loss of up to 50% of the disks (as long as no mirror pair loses both of its disks) and gives you great speed on top.
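For a rough feel of the trade-off, the sketch below compares usable capacity and failure tolerance for the 12 x 1862.63 GB disks shown in the question. The function name raid_summary is made up for the example, RAID 10 is assumed to be built from two-way mirror pairs, and hot spares are left out of the count, so treat the numbers as back-of-the-envelope figures.

```python
def raid_summary(level, disks, disk_gb):
    """Return (usable_gb, guaranteed_failures, best_case_failures)."""
    if level == "RAID 5":
        # One disk's worth of parity: any single failure is survivable.
        return (disks - 1) * disk_gb, 1, 1
    if level == "RAID 6":
        # Two disks' worth of parity: any two failures are survivable.
        return (disks - 2) * disk_gb, 2, 2
    if level == "RAID 10":
        # Assumes two-way mirrors: any one failure is always survivable,
        # and up to one failure per mirror pair in the best case.
        return (disks // 2) * disk_gb, 1, disks // 2
    raise ValueError(f"unsupported level: {level}")


DISKS = 12         # number of ports in the question's array
DISK_GB = 1862.63  # per-disk size reported by tw_cli in the question

for level in ("RAID 5", "RAID 6", "RAID 10"):
    usable, guaranteed, best = raid_summary(level, DISKS, DISK_GB)
    print(f"{level:8s} usable={usable:9.2f} GB  "
          f"always survives {guaranteed} failure(s), at best {best}")
```

The RAID 5 figure comes out at about 20489 GB, which matches the size tw_cli reports for u0, so the model at least agrees with the array in question.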
