ZFS – Recover or repair a corrupted file in a snapshot from backup

Tags: data-recovery, zfs, zfsonlinux

A pool has suffered permanent corruption of file data that is part of a snapshot. If the file data were part of the live filesystem only (and not part of any snapshot), I could simply restore the file from a suitable backup copy. How can I recover or repair a file in a snapshot (and clear the errors ZFS reports for it) from a copy of the snapshot or a (partial [1]) copy of the pool?

[1] Where the partial copy contains at least the affected snapshot and the previous snapshot, both of which also exist on the affected pool.

Example

Here's an easy-to-reproduce, though extremely contrived, example:

From a (bash) shell prompt:

cd
mkdir zfs-test
for i in {1..2}; do dd if=/dev/zero of=zfs-test/tank-file$i bs=1G count=1 &> /dev/null; done

sudo zpool create tank1 ~/zfs-test/tank-file1
sudo zpool create tank2 ~/zfs-test/tank-file2

sudo zfs snapshot tank1@snapshot1
sudo sh -c 'zfs send tank1@snapshot1 | zfs receive -F tank2'

Create a text file /tank1/test-text-file with content that you can easily find in a hex editor. Here's what I used:

Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus. Phasellus viverra nulla ut metus varius laoreet. Quisque rutrum. Aenean imperdiet. Etiam ultricies nisi vel augue. Curabitur ullamcorper ultricies nisi. Nam eget dui.

Again from a shell prompt:

sudo zfs snapshot tank1@snapshot2
sudo sh -c 'zfs send -i tank1@snapshot1 tank1@snapshot2 | zfs receive -F tank2'

Now you need to corrupt the file data. I used ht (a hex editor), searched for "dui", and changed it to "duh".
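If you'd rather script the corruption than use a hex editor, something like the following might work. It is only a sketch: corrupt_bytes is a hypothetical helper name, it assumes GNU grep and dd, and it assumes the text is stored uncompressed in the pool's backing file so that the search string can be found there. It overwrites the first match only, and on a large backing file with few newlines grep may need a lot of memory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: overwrite the first occurrence of a byte string in a
# file, in place, without changing the file's length.
corrupt_bytes() {
    local file=$1 old=$2 new=$3 offset
    # The replacement must be the same size, or the file layout would shift.
    if [ "${#old}" -ne "${#new}" ]; then
        echo "replacement must have the same length as the original" >&2
        return 1
    fi
    # -a: treat binary as text, -b: print byte offset, -o: only the match,
    # -F: fixed string. With -o, the offset is that of the match itself.
    offset=$(LC_ALL=C grep -abo -F -- "$old" "$file" | head -n 1 | cut -d: -f1)
    if [ -z "$offset" ]; then
        echo "pattern not found in $file" >&2
        return 1
    fi
    # conv=notrunc leaves the rest of the file untouched.
    printf '%s' "$new" | dd of="$file" bs=1 seek="$offset" conv=notrunc status=none
}
```

For the example above, the call would presumably be: corrupt_bytes ~/zfs-test/tank-file1 dui duh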

You can confirm that the data is corrupted:

sudo zpool scrub tank1; sudo zpool status -v tank1
  pool: tank1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0h0m with 1 errors on Sun Jan 11 20:16:30 2015
config:

        NAME                               STATE     READ WRITE CKSUM
        tank1                              ONLINE       0     0     1
          /home/kenny/zfs-test/tank-file1  ONLINE       0     0     2

errors: Permanent errors have been detected in the following files:

        tank1@snapshot2:/test-text-file
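
To pick the affected paths out of that output in a script, a small filter along these lines could be used. This is a sketch that assumes the output layout shown above (an "errors: Permanent errors" line followed by indented paths at the end of the output); list_permanent_errors is a hypothetical name, not a ZFS command.

```shell
#!/usr/bin/env bash
# Hypothetical filter: read `zpool status -v` output on stdin and print one
# affected path per line.
list_permanent_errors() {
    awk '
        # Everything after this header line is treated as the error list.
        /^errors: Permanent errors/ { in_list = 1; next }
        in_list && NF {
            sub(/^[ \t]+/, "")   # strip the leading indentation
            print
        }
    '
}
```

Usage would be: sudo zpool status -v tank1 | list_permanent_errors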

Best Answer

Where possible, prefer redundant pools (mirror or raidz) to non-redundant ones. With redundancy, ZFS can normally repair a damaged block from a good copy, so the problem above is unlikely to occur. Also, cloning a snapshot to get a file out of it is faster than recreating the snapshot somewhere else (provided, of course, that the underlying hardware is not at fault).
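
As a rough illustration of the clone approach, using the pool names from the question: the sketch below clones a snapshot on the backup pool, copies one file out, and destroys the clone again. The helper names (run, fetch_from_snapshot) and the clone name recover-clone are hypothetical, and it assumes default mountpoints, so that a clone named tank2/recover-clone appears at /tank2/recover-clone. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: clone a snapshot, copy one file out, drop the clone.
#   $1 snapshot, e.g. tank2@snapshot2
#   $2 path of the file relative to the dataset root
#   $3 destination path for the recovered copy
fetch_from_snapshot() {
    local snapshot=$1 relpath=$2 dest=$3
    # Assumed clone name; the clone lives in the same pool as the snapshot.
    local clone="${snapshot%%@*}/recover-clone"
    run sudo zfs clone "$snapshot" "$clone"
    # Assumes the clone is mounted at its default mountpoint, /$clone.
    run sudo cp -p "/$clone/$relpath" "$dest"
    run sudo zfs destroy "$clone"
}
```

For example: fetch_from_snapshot tank2@snapshot2 test-text-file /tank1/test-text-file. Note that this only retrieves an intact copy of the file. Snapshots are read-only, so copying the file back into the live filesystem does not rewrite the damaged block that tank1@snapshot2 refers to.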