3ware 9550SX RAID-10, one degraded drive, rebuild and initialization stuck



Is there a way to force this to rebuild? I'm also toying with the idea of turning the system off and attempting the rebuild in the 3ware controller BIOS. If I turn the system off in its present state, will it come back up, or will the arrays be broken and not bootable? Presently the system is up and working.


I came in to find one RAID-1 sub-unit degraded and the other three initializing. I replaced the bad disk and attempted a rebuild using these commands:

./tw_cli /c3/p1 remove
./tw_cli /c3 rescan
./tw_cli maint rebuild c3 u0 p1

The array reports that it is rebuilding, but the progress has not moved since I issued the rebuild command.

~ # ./tw_cli /c3/u0 show

Unit     UnitType  Status         %RCmpl  %V/I/M  Port  Stripe  Size(GB)
u0       RAID-10   REBUILDING     29%     -       -     256K    1862.61   
u0-0     RAID-1    REBUILDING     0%      -       -     -       -         
u0-0-0   DISK      OK             -       -       p0    -       465.651   
u0-0-1   DISK      DEGRADED       -       -       p1    -       465.651   
u0-1     RAID-1    INITIALIZING   62%     -       -     -       -         
u0-1-0   DISK      OK             -       -       p2    -       465.651   
u0-1-1   DISK      OK             -       -       p3    -       465.651   
u0-2     RAID-1    INITIALIZING   40%     -       -     -       -         
u0-2-0   DISK      OK             -       -       p4    -       465.651   
u0-2-1   DISK      OK             -       -       p5    -       465.651   
u0-3     RAID-1    INITIALIZING   16%     -       -     -       -         
u0-3-0   DISK      OK             -       -       p6    -       465.651   
u0-3-1   DISK      OK             -       -       p7    -       465.651   
u0/v0    Volume    -              -       -       -     -       1862.61
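To confirm that the percentages really are frozen rather than just slow, two snapshots of this table can be compared some time apart. The following is a minimal sketch, not a 3ware tool: `unit_progress` is a hypothetical helper name, it assumes the column layout shown above (name, type, status, %RCmpl, %V/I/M), and the ten-minute wait in the usage comment is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: reduce `tw_cli /cX/uY show` output (read on stdin)
# to one "name status percent" line per RAID unit or sub-unit.
# Assumes the column order shown in the output above.
unit_progress() {
    awk '$1 ~ /^u[0-9]+(-[0-9]+)?$/ && $2 ~ /^RAID/ {
        # Use whichever progress column is populated.
        pct = ($4 != "-") ? $4 : $5
        print $1, $3, pct
    }'
}

# Usage (assumes tw_cli sits in the current directory, as in the question):
#   before=$(./tw_cli /c3/u0 show | unit_progress)
#   sleep 600
#   after=$(./tw_cli /c3/u0 show | unit_progress)
#   [ "$before" = "$after" ] && echo "no progress in 10 minutes"
```

If the two snapshots are identical across a long enough interval, the rebuild and the initializations are stalled and not merely throttled by the schedule.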

I've attempted the rebuild with the rebuild schedule both enabled and disabled:

~ # ./tw_cli /c3 show rebuild                    

Rebuild Schedule for Controller /c3
Slot    Day     Hour            Duration        Status
1       Sun     12:00am         24 hr(s)        enabled
2       Mon     12:00am         24 hr(s)        enabled
3       Tue     12:00am         24 hr(s)        enabled
4       Wed     12:00am         24 hr(s)        enabled
5       Thu     12:00am         24 hr(s)        enabled
6       Fri     12:00am         24 hr(s)        enabled
7       Sat     12:00am         24 hr(s)        enabled

I have also attempted it with the verify schedule both enabled and disabled:

~ # ./tw_cli /c3 show verify

Verify Schedule for Controller /c3
Slot    Day     Hour            Duration        Status
1       Sun     12:00am         24 hr(s)        enabled
2       Mon     12:00am         24 hr(s)        enabled
3       Tue     12:00am         24 hr(s)        enabled
4       Wed     12:00am         24 hr(s)        enabled
5       Thu     12:00am         24 hr(s)        enabled
6       Fri     12:00am         24 hr(s)        enabled
7       Sat     12:00am         24 hr(s)        enabled

Also note that attempting to set ignoreECC to on errors out:

~ # ./tw_cli /c3/u0 show ignoreECC
/c3/u0 Ignore ECC policy = off 

~ # ./tw_cli /c3/u0 set ignoreECC=on
Setting Ignore ECC Policy on /c3/u0 to [on] ... Failed.
(0x09:0x0005): (0x09:0x0005): Input/output error

Edit 3/15/18:
I figured I'd write up what happened in case anyone else finds themselves in a similar situation. The stuck initialization is the part that really threw me for a loop. I know some RAID cards resync or verify their arrays once a week (or whenever you schedule them to). I believe the card started a scheduled resync/verify and one or more of the drives failed during it, which caused the 'initializing' to stall.

I emailed support for this RAID card (dcsg.support@broadcom.com). They looked over the logs and diags and didn't find anything out of the ordinary.
Their suggestion ultimately was to: 'Update the firmware. Reboot after the upgrade. It might help getting it out of the paused state.'

I asked them whether it was safe to update the firmware in the 'initializing' state and whether they were sure it would be safe to reboot while it was in this state. They never replied to that email.

Seeing as I trust no one, I backed up all of the data and rebooted the machine. It came back up with two more bad disks, both on the initializing RAID1 arrays. Luckily all the bad disks were on different RAID1 arrays, so I could replace them. Once the rebuilds finished, the arrays initialized, and everything is now working correctly.

So if you ever see this card stuck at 'initializing', I would back up the data, attempt a reboot, and pray that the bad disks are on different mirrors.
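Whether the bad disks share a mirror can be read off the `tw_cli /c3/u0 show` table. The sketch below is only an illustration under stated assumptions: `mirror_risk` is a hypothetical helper name, it assumes the table layout from the question, and it treats any disk whose status is not `OK` as bad. It cannot see disks that still report `OK` but are about to fail, which is what happened here after the reboot.

```shell
#!/bin/sh
# Hypothetical helper: read `tw_cli /cX/uY show` output on stdin and print,
# per RAID-1 sub-unit, how many member disks are not OK.
mirror_risk() {
    awk '
        $2 == "DISK" {
            # Disk u0-0-1 belongs to mirror u0-0: strip the last "-N".
            mirror = $1
            sub(/-[0-9]+$/, "", mirror)
            total[mirror]++
            if ($3 != "OK") bad[mirror]++
        }
        END {
            for (m in total) {
                b = bad[m] + 0
                if (b == 0)            state = "redundant"
                else if (b < total[m]) state = "at-risk"
                else                   state = "failed"
                print m, b "/" total[m], state
            }
        }' | sort
}

# Usage (assumes tw_cli sits in the current directory, as in the question):
#   ./tw_cli /c3/u0 show | mirror_risk
```

A mirror reported as at-risk has lost one member and has no redundancy left; a second bad disk in that same mirror would take the whole RAID-10 unit down, while bad disks spread across different mirrors can each be replaced and rebuilt.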

Good luck to all that may read this in the future!
