I ran into trouble after trying to add more disks to my Ubuntu server. Being a total beginner, I powered the server off, added two more disks, and restarted the system, only to find one of the disks in the existing mirror "FAULTED".
matsojala@amatson:~$ zpool status -v
  pool: tank
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0B in 21h20m with 0 errors on Fri Feb 8 14:15:04 2019
config:

        NAME                      STATE     READ WRITE CKSUM
        tank                      DEGRADED     0     0     0
          mirror-0                DEGRADED     0     0     0
            sdb                   ONLINE       0     0     0
            12086301109920570165  FAULTED      0     0     0  was /dev/sdb1

errors: No known data errors
I tried to export and import based on this answer (ZFS pool degraded on reboot), but exporting fails:
matsojala@amatson:~$ sudo zpool export -f tank
umount: /tank: target is busy.
cannot unmount '/tank': umount failed
I'm not sure how I should try to replace the disk, since the disk is reported as being "part of active pool".
matsojala@amatson:~$ sudo zpool replace -f tank 12086301109920570165 sdc1
invalid vdev specification
the following errors must be manually repaired:
/dev/sdc1 is part of active pool 'tank'
I tried this too:
matsojala@amatson:~$ sudo zpool replace tank sdb
/dev/sdb is in use and contains a unknown filesystem.
Any help? The disk was fully working before I powered off; it now appears in the system as /dev/sdc1, with ID "12086301109920570165". What should I do?
Thanks.
Best Answer
It looks like you've been using names like /dev/sda to reference disks. That's generally not a good idea, because if your disks get assigned different names after a reboot or an unplug-replug cycle, ZFS can get confused. Instead, you should create your pool using the device files in /dev/disk/by-id/, .../by-uuid/, or .../by-label/.
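For example, you can list the stable names and see which sdX device each one currently points at; the name shown below is just a placeholder, since the real ones are built from your drives' model and serial numbers:

ls -l /dev/disk/by-id/
# e.g. ata-SOMEVENDOR_MODEL_SERIAL123 -> ../../sdb   (placeholder output; yours will differ)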
In your case I'm not totally certain, but it looks like /dev/sdb1 got relabeled to /dev/sdc1 after the reboot, which is why /dev/sdc1 looks like it's part of the pool even though it doesn't appear in zpool status. You could try to fix it by unplugging the extra disks you added -- that would probably allow the labels to go back to how they were originally -- and then doing an export followed by zpool import -d /dev/disk/by-id tank, to force ZFS to relabel the pool based on the by-id disk names.
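Roughly this sequence -- a sketch only, to be run after the extra disks are unplugged and nothing is holding files open under /tank:

sudo fuser -vm /tank          # optional: see what is keeping the mount busy
sudo zpool export tank
sudo zpool import -d /dev/disk/by-id tank
zpool status -v               # the mirror members should now show by-id names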
If the export doesn't work because the pool is busy, make sure no process is accessing files on the pool and try again. I am not a Linux user, but it appears there is also a configuration file you can use to help with this during reboot: this post on GitHub suggests setting USE_DISK_BY_ID='yes' in /etc/default/zfs to force it during reboot.
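I can't verify this on your distribution, so treat it as a sketch, but based on that post the file would end up containing something like:

# /etc/default/zfs
USE_DISK_BY_ID='yes'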
Worst case, you can set that and reboot -- the reboot automatically runs an export / import.

That said, if you want to go through with replacing the disk anyway, the Oracle docs explain the "replace one faulted disk of a mirror" use case pretty well. (Just ignore the Solaris-specific instructions about unconfiguring the disk with cfgadm.) I think the main step you missed was running zpool offline tank <faulted disk> before running zpool replace tank <new disk>.
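A minimal sketch of that sequence for your pool, assuming the replacement really is a fresh disk; the by-id path is a placeholder, so substitute the actual name from /dev/disk/by-id/:

sudo zpool offline tank 12086301109920570165
sudo zpool replace tank 12086301109920570165 /dev/disk/by-id/<id-of-new-disk>
zpool status -v               # watch the resilver until the mirror is back ONLINE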