Linux – How to run a command once a ZFS scrub *completes*

Tags: linux, monitoring, zfs

I would like to use cron to schedule periodic scrubs of my ZFS pool, and at some reasonably short time after the scrub finishes, email a status report to myself. The purpose of this is to catch any problems without having to manually look for them (push rather than pull).

The first part is easy: just set up a cron job to run zpool scrub $POOL as root at whatever interval is reasonable in my particular situation.

The second part, I'm not quite so sure how to do. zpool scrub returns immediately and then the scrub is run in the background by the system (which is certainly desirable behavior if the scrub is initiated by an administrator from a terminal). zpool status gives me a status report and exits (with exit code 0 while the scrub is running; it hasn't finished yet so I don't know if the exit status changes once it's done, but I doubt it). The only parameter documented for zpool scrub is -s for "stop scrubbing".

The main problem is detecting the change of status from scrubbing to finished scrubbing. Given that, the rest should fall into place.

Ideally, I'd want to tell zpool scrub to not return until the scrub finishes, but I don't see any way to make it do that. (It would make it almost too easy to simply cron zpool scrub --wait-until-done $POOL; zpool status $POOL.)

Failing that, I'd like to ask the system whether a scrub is currently in progress, preferably in a way that is unlikely to break with an upgrade or configuration change. I could then act when a previously running scrub has finished, by running zpool status once the state goes from scrubbing to not scrubbing.
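A minimal polling sketch of that idea follows. It assumes that zpool status prints a line containing "scrub in progress" while a scrub is running; that wording is not a stable interface and may differ between releases, which is exactly the fragility mentioned above. The function names are made up for illustration.

```shell
#!/bin/sh
# Polling sketch. Assumption: while a scrub runs, `zpool status` prints a
# line containing "scrub in progress"; the wording may vary between releases.

# Succeeds (exit 0) if the given pool currently reports a running scrub.
# If `zpool status` itself fails, grep finds nothing and this reports
# "not scrubbing".
scrub_in_progress() {
    zpool status "$1" | grep -q "scrub in progress"
}

# Blocks until the pool no longer reports a running scrub.
# $1 = pool name, $2 = seconds between checks (default 60).
wait_for_scrub() {
    while scrub_in_progress "$1"; do
        sleep "${2:-60}"
    done
}
```

With these functions defined, a cron job could run something like `zpool scrub tank && wait_for_scrub tank && zpool status tank | mail -s "scrub report" root`, where the pool name tank and a working mail command are assumptions.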

This particular setup is for a workstation system, so while a monitoring tool such as Nagios probably has add-ins that would solve the problem, it feels rather overkill to install such a tool for just this one task. Can someone suggest a lower-tech solution to the problem?

Best Answer

On ZFS on Linux, starting with version 0.6.3, this can be handled quite elegantly with the ZFS Event Daemon (zed). Because it monitors kernel events directly, the daemon can react almost immediately to any event and does not depend on continuously polling and parsing some other command's output.

Create a shell script in /etc/zfs/zed.d whose file name begins with scrub.finish (for example, /etc/zfs/zed.d/scrub.finish-custom.sh). That script can take any appropriate action, such as sending an email, writing a log entry somewhere, or making the system sing and dance (OK, maybe not that). The example scripts that ship with zed make a good starting point.
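As a rough sketch, such a script might look like the following. It assumes that zed passes the pool name in the ZEVENT_POOL environment variable and that a working mail command is installed; REPORT_TO and scrub_report are hypothetical names used only for this illustration.

```shell
#!/bin/sh
# Hypothetical zedlet, e.g. /etc/zfs/zed.d/scrub.finish-custom.sh
# Assumptions: zed exports the pool name as ZEVENT_POOL, and a working
# mail(1) command is installed. REPORT_TO is a made-up variable.

# Prints a one-line summary followed by the full pool status.
scrub_report() {
    echo "Scrub finished on pool '$1' at $(date)"
    echo
    zpool status "$1"
}

# Act only when a pool name is present in the environment.
if [ -n "${ZEVENT_POOL:-}" ]; then
    scrub_report "$ZEVENT_POOL" |
        mail -s "ZFS scrub finished: $ZEVENT_POOL" "${REPORT_TO:-root}"
fi
```

The script needs to be executable, and it does nothing when run by hand without ZEVENT_POOL set, which makes it safe to syntax-check from a terminal.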

If all you want is an email when the scrub completes, the provided scrub.finish-email.sh script does that nicely. Edit /etc/zfs/zed.d/zed.rc to set where the email should be sent and whether one should also be sent when the pool is not experiencing any problems. Then make sure a file in /etc/zfs/zed.d whose name begins with scrub.finish points to that script (a symlink works), and that zed is started on boot.
