More accurate way to obtain ZFS filesystem size

compression filesystems zfs

I've been having trouble backing up a ZFS filesystem because my backup devices keep running out of space.

The first time I forgot that compression was enabled. However, for the second attempt, I got the compression ratio of the file system, and also got the apparent size as returned by du.

These agreed with each other on the size:

louis@watson:~$ sudo zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
watson  3.62T  2.74T   904G    75%  1.00x  ONLINE  -

louis@watson:~$ sudo zfs list 
NAME            USED  AVAIL  REFER  MOUNTPOINT
watson         2.74T   846G    30K  none
watson/gelato  2.73T   846G  2.67T  /data/gelato

louis@watson:~$ sudo zfs get compressratio watson/gelato
NAME           PROPERTY       VALUE  SOURCE
watson/gelato  compressratio  1.64x  -

I expected 2.67 TB of compressed data at a compression ratio of 1.64x (roughly 4.4 TB uncompressed) to fit on a 6 TB drive, and du seemed to confirm this:

louis@watson:~$ cd /data/gelato/
louis@watson:/data/gelato$ sudo du -hs --apparent-size
4.6T    .

But I once again ran out of space during the backup, and I'm not sure why. There was probably less than 1 TB left to copy when the 6 TB drive filled up. Even so, that is far more than anything I can think of that would account for the discrepancy.

Snapshots:

louis@watson:~$ sudo zfs list -o space
NAME           AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
watson          846G  2.74T         0     30K              0      2.74T
watson/gelato   846G  2.73T     53.6G   2.67T              0          0
watson/home     846G  15.8G     57.3M   15.8G              0          0

Is there a better way to get the actual size of the data stored on a compressed ZFS filesystem?

I should mention that I use rsync -avh to back up. On the destination drive I don't see the snapshots (…../.zfs/snapshot). Does that mean the snapshots aren't being copied?

Best Answer

Some things to consider:

  • Check the record size (with zfs list -o recsize watson/gelato) of both source and target. If you have lots of small data but a large record size, space is wasted. The other way round, space is also wasted because of headers and metadata, but usually the effect is not as noticeable. If you share the file system via SMB/CIFS, you can look at the difference using the Windows Explorer folder properties window.
  • Check sector alignment (ashift) on both drives and compare it with the drive specifications (found in the technical data sheet for the drive model). Wrong alignment can reduce the usable space of your pool (in one reported example, about 9% was lost).
  • Check if the copies property has ever been set to a value greater than 1, the default (it may have been raised and later reset, in which case extra copies were created for any data written in between).
  • Get more details about how the space is used with the properties usedbychildren, usedbydataset, usedbyrefreservation, and usedbysnapshots. They sum up to used, so they will show nothing new, but they could help to identify old snapshots and the like.
  • To see the amount of space used by data and metadata as if compression were disabled, check the properties logicalused and logicalreferenced.
  • Because of differences in specifying data sizes (base 2 vs. base 10) your 6 TB drive actually has only about 5.457 TiB (9% less than assumed).
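
The checks above can be sketched as a few shell helpers. This is a rough sketch, not a definitive script: the function names tb_to_tib, bytes_to_tib and zfs_space_report are hypothetical, and it assumes a POSIX shell with awk and a zfs command that supports the -H, -p and -o options of zfs get.

```shell
#!/bin/sh
# Sketch only: the helper names below are made up for illustration.

# Convert decimal terabytes (as printed on a drive label) to binary TiB.
tb_to_tib() {
    awk -v tb="$1" 'BEGIN { printf "%.3f\n", tb * 1e12 / 2^40 }'
}

# Convert a raw byte count (as printed by `zfs get -p`) to TiB.
bytes_to_tib() {
    awk -v b="$1" 'BEGIN { printf "%.3f\n", b / 2^40 }'
}

# Print the space-related properties named in the list above for one dataset.
# -H drops the header, -p prints exact values, -o selects the columns.
zfs_space_report() {
    zfs get -H -p -o property,value \
        used,referenced,logicalused,logicalreferenced,usedbydataset,usedbysnapshots,usedbychildren,usedbyrefreservation,compressratio,recordsize,copies \
        "$1"
}
```

For example, tb_to_tib 6 gives the 5.457 TiB figure mentioned above, and zfs_space_report watson/gelato lists the values in bytes; feeding the logicalused value to bytes_to_tib gives an uncompressed size that can be compared directly against the capacity of the target drive.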