ZFS report of all users' usage

zfs

I have a ZFS file system holding over 35 TB of data, written by an unknown number of users. How can I get a report of each user's usage, like the output of zfs get userused@{uid}, without knowing in advance which users have data in the file system?

Best Answer

You can do it with zfs userspace (and zfs groupspace for groups). It lists every user that owns files in a file system (meaning they are the current owner of the file), has a quota set on it (via zfs set userquota), or both. Note that it is not recursive (similar to zfs list without -r), so you might have to call it for each child file system.

Example shell script

#!/bin/ksh

# get all children file systems from given file system argument
filesystems=$(zfs list -Hro name "${1}")

# get all space usage line by line in the format
# <usedspace>\t<userquota>\t<username>\t<filesystem>\n
# (it can later be sorted by the first column)
for filesystem in ${filesystems}; do
    sizes=$(zfs userspace -Hp -o used,quota,name "${filesystem}")
    IFS=$'\n'

    # multi-line output of userspace makes inner loop necessary
    for size in ${sizes}; do
        printf "%s\\t%s\\n" "${size}" "${filesystem}"
    done
    unset IFS
done
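
According to the synopsis, zfs userspace accepts only a file system or snapshot, so volumes returned by zfs list (swap or dump devices, for instance) produce an error for each such dataset. A possible variant, offered as a sketch rather than a tested replacement, asks zfs list for file systems only and reads the output line by line instead of changing IFS. The function name userspace_all is made up for illustration.

```shell
#!/bin/sh
# userspace_all (hypothetical name): print
# <used>\t<quota>\t<name>\t<filesystem> for every file system below $1.
userspace_all() {
    # -t filesystem restricts the listing to file systems (no volumes, no snapshots)
    zfs list -H -r -t filesystem -o name "$1" |
    while IFS= read -r fs; do
        # </dev/null keeps the inner command from consuming the list of names
        zfs userspace -Hp -o used,quota,name "$fs" </dev/null |
        while IFS= read -r line; do
            # append the file system name as a fourth tab-separated column
            printf '%s\t%s\n' "$line" "$fs"
        done
    done
}
```

After defining the function, userspace_all rpool | sort -rn should give the same kind of listing as the script above.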

Example output

root@host:/root# ./test.sh rpool | sort -rn
cannot get used/quota for rpool/dump: dataset is busy
cannot get used/quota for rpool/swap: dataset is busy
4273131520      none    root    rpool/ROOT/omnios-2
4197847040      none    root    rpool/ROOT/omnios-1
2681162240      none    root    rpool/ROOT/omnios
[...]
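
The listing above has one line per user and file system. To get a single total per user across all child file systems, the lines can be summed by user name. A minimal sketch, assuming the tab-separated four-column format produced by the script (used, quota, name, file system); the function name sum_per_user is hypothetical.

```shell
#!/bin/sh
# sum_per_user (hypothetical helper): reads lines of the form
# <used>\t<quota>\t<name>\t<filesystem> on stdin and prints
# <total used>\t<name>, largest total first.
sum_per_user() {
    awk -F '\t' '
        NF >= 4 { total[$3] += $1 }   # add bytes used to the total for this user
        END {
            # %.0f avoids integer overflow in awk implementations with 32-bit %d
            for (u in total) printf "%.0f\t%s\n", total[u], u
        }
    ' | sort -rn
}
```

With the function defined in the current shell, ./test.sh rpool | sum_per_user prints one line per user. Lines with fewer than four fields are ignored.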

Manpage details with available options

 zfs userspace [-Hinp] [-o field[,field]...] [-s field]... [-S field]...
   [-t type[,type]...] filesystem|snapshot
   Displays space consumed by, and quotas on, each user in the specified
   filesystem or snapshot. This corresponds to the userused@user and
   userquota@user properties.

   -H  Do not print headers, use tab-delimited output.

   -S field
       Sort by this field in reverse order. See -s.

   -i  Translate SID to POSIX ID. The POSIX ID may be ephemeral if no
       mapping exists.  Normal POSIX interfaces (for example, stat(2), ls
       -l) perform this translation, so the -i option allows the output
       from zfs userspace to be compared directly with those utilities.
       However, -i may lead to confusion if some files were created by an
       SMB user before a SMB-to-POSIX name mapping was established. In
       such a case, some files will be owned by the SMB entity and some by
       the POSIX entity. However, the -i option will report that the POSIX
       entity has the total usage and quota for both.

   -n  Print numeric ID instead of user/group name.

   -o field[,field]...
       Display only the specified fields from the following set: type,
       name, used, quota.  The default is to display all fields.

   -p  Use exact (parsable) numeric output.

   -s field
       Sort output by this field. The -s and -S flags may be specified
       multiple times to sort first by one field, then by another. The
       default is -s type -s name.

   -t type[,type]...
       Print only the specified types from the following set: all,
       posixuser, smbuser, posixgroup, smbgroup.  The default is -t
       posixuser,smbuser.  The default can be changed to include group
       types.

 zfs groupspace [-Hinp] [-o field[,field]...] [-s field]... [-S field]...
   [-t type[,type]...] filesystem|snapshot
   Displays space consumed by, and quotas on, each group in the specified
   filesystem or snapshot. This subcommand is identical to zfs userspace,
   except that the default types to display are -t posixgroup,smbgroup.
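
For a single file system, the -S option described above can replace the external sort. The following sketch wraps both subcommands; the function names top_users and top_groups are made up, and it is an assumption that your zfs version orders the used column numerically when -p is given.

```shell
#!/bin/sh
# top_users / top_groups (hypothetical names): list consumers of one
# file system, largest usage first, as <used>\t<name>.
top_users() {
    # -H: no header, tab-delimited; -p: exact byte counts; -S used: reverse sort by usage
    zfs userspace -H -p -o used,name -S used "$1"
}

top_groups() {
    # same options, but reports groups instead of users
    zfs groupspace -H -p -o used,name -S used "$1"
}
```

For example, top_users rpool/ROOT/omnios would report only that file system, not its children.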