I am having trouble replacing a disk in an existing zpool on a system running Solaris 10 on an x86 processor. The zpool was originally created with two mirrored slices. One of the drives failed, so I physically swapped it for a new drive. I ran prtvtoc and fmthard to copy the disk label from the working drive onto the new drive:
prtvtoc /dev/rdsk/c1t0d0s2 >/tmp/c1t0d0s2.out
fmthard -s /tmp/c1t0d0s2.out >/dev/rdsk/c1t1d0s2
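For comparison, the label copy is often written as a single pipeline. fmthard(1M) takes the raw target device as a command-line argument and accepts "-" as the data file to read the table from standard input. The following is only a sketch using the device names from this post; copy_vtoc and DRY_RUN are illustrative names, not part of any Solaris tool, and by default the function only prints the command it would run.

```shell
#!/bin/sh
# Sketch: copy the VTOC (slice table) from a healthy disk to a replacement.
# copy_vtoc and DRY_RUN are hypothetical helper names used for illustration.
# With DRY_RUN=1 (the default here) the command is printed, not executed.

copy_vtoc() {
    src=$1   # healthy disk, whole-disk slice, e.g. /dev/rdsk/c1t0d0s2
    dst=$2   # new disk, whole-disk slice, e.g. /dev/rdsk/c1t1d0s2
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "prtvtoc $src | fmthard -s - $dst"
    else
        # "-s -" makes fmthard read the table from stdin; the raw target
        # device is passed as an argument, not as a shell redirect.
        prtvtoc "$src" | fmthard -s - "$dst"
    fi
}

copy_vtoc /dev/rdsk/c1t0d0s2 /dev/rdsk/c1t1d0s2
```

Setting DRY_RUN=0 would run the pipeline for real, which overwrites the label on the target disk, so the device names are worth double-checking first.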
Then I tried to bring the new drive online and got a warning about the device still being faulted:
$ zpool online pool c1t1d0s6
warning: device 'c1t1d0s6' onlined, but remains in faulted state
The output of zpool status -v is:
NAME          STATE     READ WRITE CKSUM
pool          DEGRADED     0     0     0
  mirror-0    DEGRADED     0     0     0
    c1t0d0s6  ONLINE       0     0     0
    c1t1d0s6  UNAVAIL      0     0     0  corrupted data
(c1t1d0 is the replaced drive.)
Then I brought c1t1d0 offline again and tried running the zpool replace command, but this did not work, either:
$ zpool replace pool c1t1d0s6
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c1t1d0s6 overlaps with /dev/dsk/c1t1d0s2
Does anyone know what's going on? Is it safe to use the '-f' flag?
Edit: After running zpool replace -f, I get:
pool: pool
state: DEGRADED
status: The pool is formatted using an older on-disk format. The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
pool will no longer be accessible on older software versions.
scrub: none requested
config:
NAME                STATE     READ WRITE CKSUM
pool                DEGRADED     0     0     0
  mirror-0          DEGRADED     0     0     0
    c1t0d0s6        ONLINE       0     0     0
    replacing-1     UNAVAIL      0     0     0  insufficient replicas
      c1t1d0s6/old  OFFLINE      0     0     0
      c1t1d0s6      UNAVAIL      0   342     0  experienced I/O failures
I see errors on the new drive in iostat -e output. I guess the new drive might be bad, too?
Edit 2: I don't know what's going on. I tried a different drive with the same procedure. After running the zpool replace -f, the zfs pool ran a scrub, but the status output is:
pool: pool
state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
pool will no longer be accessible on older software versions.
scrub: scrub completed after 12h56m with 0 errors on Wed Aug 29 06:49:16 2012
config:
NAME                STATE     READ WRITE CKSUM
pool                ONLINE       0     0     0
  mirror-0          ONLINE       0     0     0
    c1t0d0s6        ONLINE       0     0     0
    replacing-1     ONLINE   5.54M 19.9M     0
      c1t1d0s6/old  UNAVAIL      0     0     0  corrupted data
      c1t1d0s6      UNAVAIL      0     0     0  corrupted data
After offlining c1t1d0s6, the zpool status output is:
pool: pool
state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
pool will no longer be accessible on older software versions.
scrub: scrub completed after 12h56m with 0 errors on Wed Aug 29 06:49:16 2012
config:
NAME                STATE     READ WRITE CKSUM
pool                ONLINE       0     0     0
  mirror-0          ONLINE       0     0     0
    c1t0d0s6        ONLINE       0     0     0
    replacing-1     ONLINE   5.54M 19.9M     0
      c1t1d0s6/old  UNAVAIL      0     0     0  corrupted data
      c1t1d0s6      UNAVAIL      0     0     0  corrupted data
I don't get it. Shouldn't the system be able to replace c1t1d0s6 using the mirror on c1t0d0s6?
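Because the pool itself reports ONLINE while devices under replacing-1 do not, it can help to filter the config table for anything whose state is not ONLINE. The following is a rough sketch; unhealthy_vdevs is a made-up helper name, and it assumes the column layout shown in the outputs above (NAME, STATE, READ, WRITE, CKSUM).

```shell
#!/bin/sh
# Sketch: read `zpool status` output on stdin and print "name state" for
# every config-table row whose STATE column is not ONLINE.
# unhealthy_vdevs is a hypothetical helper name.

unhealthy_vdevs() {
    awk '
        # The table starts after the "NAME STATE ..." header line.
        $1 == "NAME" && $2 == "STATE"        { in_cfg = 1; next }
        # A trailing "errors:" line, if present, ends the table.
        in_cfg && $1 == "errors:"            { in_cfg = 0 }
        # Rows have at least 5 fields: name, state, and three counters.
        in_cfg && NF >= 5 && $2 != "ONLINE"  { print $1, $2 }
    '
}
```

Intended usage would be something like `zpool status pool | unhealthy_vdevs`, which for the output above would list the two c1t1d0s6 entries as UNAVAIL.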
Best Answer
Did you clear the alerts in fmadm? And the zpool clear ... It's safe to run the zpool replace with the -f switch, but I think your statement is wrong, unless you already removed the bad disk.
http://docs.oracle.com/cd/E19253-01/819-5461/gbcet/index.html
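The sequence suggested above might look roughly like the sketch below. The pool and device names are the ones from the question; run, replace_after_clearing and DRY_RUN are illustrative names, and by default the script only prints the commands rather than executing them. Whether zpool clear succeeds on a device that is still UNAVAIL is not something this sketch guarantees.

```shell
#!/bin/sh
# Sketch: check FMA faults, clear ZFS error state, then force the in-place
# replacement. run, replace_after_clearing and DRY_RUN are hypothetical names.
# With DRY_RUN=1 (the default here) each command is printed, not executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

replace_after_clearing() {
    pool=$1
    dev=$2
    # List outstanding faults. A listed fault can be marked repaired with
    # "fmadm repair <uuid>", using the UUID shown in this output.
    run fmadm faulty
    # Reset the error counters and fault state ZFS holds for the device.
    run zpool clear "$pool" "$dev"
    # Replace the device with the new disk in the same location; -f
    # overrides the "overlaps with s2" check shown in the question.
    run zpool replace -f "$pool" "$dev"
    # Watch the resilver and confirm the final device states.
    run zpool status -v "$pool"
}

replace_after_clearing pool c1t1d0s6
```

Running it with DRY_RUN=0 as root would execute the commands for real.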