Debian – ZFS pool reports a missing device, but it is not missing

debian, zfs, zfsonlinux

I am running the latest Debian 7.7 x86 and ZFS on Linux.

After moving my computer to a different room, zpool status reports this:

  pool: solaris
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid.  Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://zfsonlinux.org/msg/ZFS-8000-4J
scan: none requested
config:

NAME                                            STATE     READ WRITE CKSUM
solaris                                         DEGRADED     0     0     0
  raidz1-0                                      DEGRADED     0     0     0
    11552884637030026506                        UNAVAIL      0     0     0  was /dev/disk/by-id/ata-Hitachi_HDS723020BLA642_MN1221F308BR3D-part1
    ata-Hitachi_HDS723020BLA642_MN1221F308D55D  ONLINE       0     0     0
    ata-Hitachi_HDS723020BLA642_MN1220F30N4JED  ONLINE       0     0     0
    ata-Hitachi_HDS723020BLA642_MN1220F30N4B2D  ONLINE       0     0     0
    ata-Hitachi_HDS723020BLA642_MN1220F30JBJ8D  ONLINE       0     0     0

The disk it reports as unavailable is /dev/sdb1.
After a bit of investigating, I found that ata-Hitachi_HDS723020BLA642_MN1221F308BR3D-part1 is just a symlink to /dev/sdb1, and it does exist:

lrwxrwxrwx 1 root root 10 Jan  3 14:49 /dev/disk/by-id/ata-Hitachi_HDS723020BLA642_MN1221F308BR3D-part1 -> ../../sdb1

If I check the SMART status:

# smartctl -H /dev/sdb
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

The disk is there. I can run fdisk on it, and everything else.

If I try to detach it:

zpool detach solaris 11552884637030026506
cannot detach 11552884637030026506: only applicable to mirror and replacing vdevs

I also tried with /dev/sdb, /dev/sdb1 and the long by-id name. Same error every time.

I can't replace it either, or do anything else with it, it seems. I have even tried turning the computer off and on again, to no avail.

Unless I actually replace the hard disk itself, I can't see any solution to this problem.

Ideas?

[update] blkid

# blkid 
/dev/mapper/q-swap_1: UUID="9e611158-5cbe-45d7-9abb-11f3ea6c7c15" TYPE="swap" 
/dev/sda5: UUID="OeR8Fg-sj0s-H8Yb-32oy-8nKP-c7Ga-u3lOAf" TYPE="LVM2_member" 
/dev/sdb1: UUID="a515e58f-1e03-46c7-767a-e8328ac945a1" UUID_SUB="7ceeedea-aaee-77f4-d66d-4be020930684" LABEL="q.heima.net:0" TYPE="linux_raid_member" 
/dev/sdf1: LABEL="solaris" UUID="2024677860951158806" UUID_SUB="9314525646988684217" TYPE="zfs_member" 
/dev/sda1: UUID="6dfd5546-00ca-43e1-bdb7-b8deff84c108" TYPE="ext2" 
/dev/sdd1: LABEL="solaris" UUID="2024677860951158806" UUID_SUB="1776290389972032936" TYPE="zfs_member" 
/dev/sdc1: LABEL="solaris" UUID="2024677860951158806" UUID_SUB="2569788348225190974" TYPE="zfs_member" 
/dev/sde1: LABEL="solaris" UUID="2024677860951158806" UUID_SUB="10515322564962014006" TYPE="zfs_member" 
/dev/mapper/q-root: UUID="07ebd258-840d-4bc2-9540-657074874067" TYPE="ext4" 

After disabling mdadm and rebooting, the issue is back.
I'm not sure why sdb1 is marked as linux_raid_member. How do I clear that?

Best Answer

Just run zpool clear solaris, then post the result of zpool status -v.
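
As a rough sketch of how to read that result, the status text can be filtered for vdevs that are still UNAVAIL after the clear. The helper name unavail_vdevs is made up for this example and is not part of ZFS; it only assumes the column layout shown in the question's zpool status output.

```shell
#!/bin/sh
# Hypothetical helper (not a ZFS command): reads `zpool status` text on
# stdin and prints the name of every vdev whose STATE column is UNAVAIL.
unavail_vdevs() {
    awk '$2 == "UNAVAIL" { print $1 }'
}

# Usage on the real pool (needs root and the ZFS tools installed):
#   zpool clear solaris
#   zpool status -v solaris | unavail_vdevs
```

If this still prints the numeric ID from the question, the clear alone did not bring the device back.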

It would be nice to know the hardware involved and what controller you're using.


edit

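Looking at your blkid output, you have remnants of a previous Linux software RAID on /dev/sdb1: it is reported as linux_raid_member while the other four pool members are zfs_member. You'll need to run mdadm --zero-superblock /dev/sdb1 to clear that.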
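
A minimal sketch of that sequence, assuming /dev/sdb1 is still the affected partition as in the blkid output above. The helper raid_members is a made-up name for this example; the mdadm and zpool commands are listed as comments because they need root and the real disks, and the /dev/mdN placeholder must be taken from /proc/mdstat if md has assembled an array.

```shell
#!/bin/sh
# Hypothetical helper: reads `blkid` output on stdin and prints every
# device that still carries a Linux software RAID (md) signature.
raid_members() {
    awk -F': ' '/TYPE="linux_raid_member"/ { print $1 }'
}

# Steps on the real machine, as root:
#   blkid | raid_members               # should list /dev/sdb1
#   cat /proc/mdstat                   # check whether md assembled an array
#   mdadm --stop /dev/mdN              # only if an array is listed there
#   mdadm --examine /dev/sdb1          # inspect the stale md superblock
#   mdadm --zero-superblock /dev/sdb1  # erase it
#   zpool clear solaris
#   zpool status -v solaris
```

Once blkid no longer reports linux_raid_member for the partition, md should stop claiming it at boot. Whether the pool then picks the device up after a clear, or still needs a zpool replace, depends on the state of the ZFS labels on that partition.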
