Zpool Resilvering Loop – How to Resolve

debian, storage, zfs, zfsonlinux, zpool

I have the following zpool:

    NAME                        STATE     READ WRITE CKSUM
    zfspool                     ONLINE       0     0     0
      mirror-0                  ONLINE       0     0     0
        wwn-0x5000cca266f3d8ee  ONLINE       0     0     0
        wwn-0x5000cca266f1ae00  ONLINE       0     0     0

This morning the host experienced an event that I'm still digging into: load was very high and lots of things weren't working, but I could still log in.

On reboot the host hung during boot waiting on services that relied on data on the above pool.

Suspecting an issue with the pool, I removed one of the drives and rebooted again. The host came online this time.

A scrub showed that all the data on the remaining disk was fine. After it completed, I reinserted the drive I had removed. That drive began resilvering, but the resilver gets about 4% of the way through and then restarts.
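
For reference, the scrub step was nothing more exotic than the usual commands (pool name as in the output above):

    # start a scrub of the pool, then poll its progress and result
    zpool scrub zfspool
    zpool status zfspool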

smartctl shows no issues with either drive (No errors logged, WHEN_FAILED empty).
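
The SMART check was roughly the following for each of the two mirror members (the device paths are placeholders here):

    # full SMART report: health, attribute table (WHEN_FAILED column), error log
    smartctl -a /dev/sdX
    smartctl -a /dev/sdY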

However, I can't tell which disk is resilvering, and in fact it looks like the pool is fine and doesn't need resilvering at all:

    root@host1:/var/log# zpool status
      pool: zfspool
     state: ONLINE
    status: One or more devices is currently being resilvered.  The pool will
            continue to function, possibly in a degraded state.
    action: Wait for the resilver to complete.
      scan: resilver in progress since Sun Dec  8 12:20:53 2019
            46.7G scanned at 15.6G/s, 45.8G issued at 15.3G/s, 5.11T total
            0B resilvered, 0.87% done, 0 days 00:05:40 to go
    config:

            NAME                        STATE     READ WRITE CKSUM
            zfspool                     ONLINE       0     0     0
              mirror-0                  ONLINE       0     0     0
                wwn-0x5000cca266f3d8ee  ONLINE       0     0     0
                wwn-0x5000cca266f1ae00  ONLINE       0     0     0

    errors: No known data errors

What is the best course to get out of this resilvering loop? Other answers suggest detaching the drive that is being resilvered, but as I said, it doesn't look like either one is.
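
For context, the detach-and-reattach approach those answers describe would look something like this, where the device names are placeholders for the resilvering disk and the healthy disk, which is exactly what I can't identify here:

    # drop the resilvering disk out of the mirror...
    zpool detach zfspool wwn-0xRESILVERING_DISK
    # ...then attach it again alongside the healthy disk to force a fresh resilver
    zpool attach zfspool wwn-0xHEALTHY_DISK wwn-0xRESILVERING_DISK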

Edit:

zpool events -v shows about 1000 repetitions of the following:

    Dec  8 2019 13:22:12.493980068 sysevent.fs.zfs.resilver_start
            version = 0x0
            class = "sysevent.fs.zfs.resilver_start"
            pool = "zfspool"
            pool_guid = 0x990e3eff72d0c352
            pool_state = 0x0
            pool_context = 0x0
            time = 0x5ded4d64 0x1d7189a4
            eid = 0xf89

    Dec  8 2019 13:22:12.493980068 sysevent.fs.zfs.history_event
            version = 0x0
            class = "sysevent.fs.zfs.history_event"
            pool = "zfspool"
            pool_guid = 0x990e3eff72d0c352
            pool_state = 0x0
            pool_context = 0x0
            history_hostname = "host1"
            history_internal_str = "func=2 mintxg=7381953 maxtxg=9049388"
            history_internal_name = "scan setup"
            history_txg = 0x8a192e
            history_time = 0x5ded4d64
            time = 0x5ded4d64 0x1d7189a4
            eid = 0xf8a

    Dec  8 2019 13:22:17.485979213 sysevent.fs.zfs.history_event
            version = 0x0
            class = "sysevent.fs.zfs.history_event"
            pool = "zfspool"
            pool_guid = 0x990e3eff72d0c352
            pool_state = 0x0
            pool_context = 0x0
            history_hostname = "host1"
            history_internal_str = "errors=0"
            history_internal_name = "scan aborted, restarting"
            history_txg = 0x8a192f
            history_time = 0x5ded4d69
            time = 0x5ded4d69 0x1cf7744d
            eid = 0xf8b

    Dec  8 2019 13:22:17.733979170 sysevent.fs.zfs.history_event
            version = 0x0
            class = "sysevent.fs.zfs.history_event"
            pool = "zfspool"
            pool_guid = 0x990e3eff72d0c352
            pool_state = 0x0
            pool_context = 0x0
            history_hostname = "host1"
            history_internal_str = "errors=0"
            history_internal_name = "starting deferred resilver"
            history_txg = 0x8a192f
            history_time = 0x5ded4d69
            time = 0x5ded4d69 0x2bbfa222
            eid = 0xf8c

    Dec  8 2019 13:22:17.733979170 sysevent.fs.zfs.resilver_start
            version = 0x0
            class = "sysevent.fs.zfs.resilver_start"
            pool = "zfspool"
            pool_guid = 0x990e3eff72d0c352
            pool_state = 0x0
            pool_context = 0x0
            time = 0x5ded4d69 0x2bbfa222
            eid = 0xf8d

    ...

Best Answer

This is now resolved.

The following issue on GitHub provided the answer:

https://github.com/zfsonlinux/zfs/issues/9551

The red flag in this case is probably the rapidly looping "starting deferred resilver" events, as seen in zpool events -v.
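
A quick way to see the loop (pool name as above) is to count those events directly:

    # count the deferred-resilver restarts recorded in the event log
    zpool events -v zfspool | grep -c "starting deferred resilver"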

The first suggestion in the link was to disable the zfs-zed service. In my case, it was not enabled to begin with.
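
For anyone who does have it running, checking and disabling it on a systemd host is roughly:

    # check whether the ZFS event daemon is active, and disable it if needed
    systemctl status zfs-zed
    systemctl disable --now zfs-zed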

The second suggestion was to verify that the zpool had the resilver_defer feature enabled. There seems to be a potential issue when ZFS is upgraded without enabling the new pool features that come with that upgrade. This pool has moved across multiple machines/operating systems over the past two years or so, so it makes sense that it was created under an older version of ZFS and is now running on a newer version on the current host:

    root@host1:/# zpool get all | grep feature
    ...
    zfspool  feature@resilver_defer         disabled                       local
    ...
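
The same check can also be done by querying that one property directly instead of grepping the full list:

    zpool get feature@resilver_defer zfspool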

After seeing this, I enabled the feature. The GitHub issue seemed to suggest this could be risky, so make sure you have backups.

    root@host1:/# zpool set feature@resilver_defer=enabled zfspool

After that, zpool status showed the resilver progressing further than it had before:

    root@host1:/# zpool status
      pool: zfspool
     state: ONLINE
    status: One or more devices is currently being resilvered.  The pool will
            continue to function, possibly in a degraded state.
    action: Wait for the resilver to complete.
      scan: resilver in progress since Sun Dec  8 13:53:43 2019
            847G scanned at 2.03G/s, 396G issued at 969M/s, 5.11T total
            0B resilvered, 7.56% done, 0 days 01:25:14 to go
    config:

            NAME                        STATE     READ WRITE CKSUM
            zfspool                     ONLINE       0     0     0
              mirror-0                  ONLINE       0     0     0
                wwn-0x5000cca266f3d8ee  ONLINE       0     0     0
                wwn-0x5000cca266f1ae00  ONLINE       0     0     0

    errors: No known data errors