How does ZFS Block Level Deduplication fit with Variable Block Size?

block deduplication filesystems zfs

According to the first Google result for "ZFS Deduplication":


What to dedup: Files, blocks, or bytes?

Block-level dedup has somewhat higher overhead than file-level dedup when whole files are duplicated, but unlike file-level dedup, it handles block-level data such as virtual machine images extremely well.

ZFS provides block-level deduplication

According to Wikipedia's ZFS article:

ZFS uses variable-sized blocks of up to 128 kilobytes. The currently available code allows the administrator to tune the maximum block size used as certain workloads do not perform well with large blocks. If data compression (LZJB) is enabled, variable block sizes are used. If a block can be compressed to fit into a smaller block size, the smaller size is used on the disk to use less storage and improve IO throughput (though at the cost of increased CPU use for the compression and decompression operations).

I want to make sure I understand this correctly.

Assuming compression is off:

If I write a randomly filled 1GB file, then write a second file that is identical except that one byte half way through is changed, will that second file be deduplicated (all except for the changed byte's block)?

If I write a single byte file, will it take a whole 128 kilobytes? If not, will the blocks get larger in the event the file gets longer?

If a file takes two 64 kilobyte blocks (would this ever happen?), would an identical file get deduped after taking a single 128 kilobyte block?

If a file is shortened, then part of its block would be ignored, and perhaps the data would not be reset to 0x00 bytes. Would a half-used block get deduped?

Best Answer

ZFS deduplication works on blocks (recordsize); it does not know or care about files. Each block is checksummed using SHA-256 (the default, which can be changed). If the checksum matches that of another block, ZFS just references the existing record and no new data is written. One problem with deduplication in ZFS is that the checksums are kept in memory, so large pools require a lot of memory. You should therefore only enable deduplication when using a large record size.
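To make the memory point concrete, here is a rough back-of-the-envelope sketch. It assumes one table entry per unique block, so halving the record size doubles the number of entries. The helper name `ddt_memory_bytes` is hypothetical, and the default of 320 bytes per entry is an assumed rule-of-thumb figure, not something stated in this answer; the real cost depends on the ZFS version and pool.

```python
def ddt_memory_bytes(unique_data_bytes, record_size, bytes_per_entry=320):
    """Rough dedup-table size, assuming one entry per unique block.

    bytes_per_entry=320 is an assumed rule-of-thumb value, not a
    number taken from ZFS itself.
    """
    # Ceiling division: a partial final record still needs its own entry.
    blocks = -(-unique_data_bytes // record_size)
    return blocks * bytes_per_entry


KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4

for record_size in (8 * KIB, 128 * KIB):
    estimate = ddt_memory_bytes(1 * TIB, record_size)
    print(f"{record_size // KIB:>3} KiB records: "
          f"~{estimate / GIB:.1f} GiB of table per TiB of unique data")
```

Under these assumptions, 128 KiB records need about 2.5 GiB of table per TiB of unique data, while 8 KiB records need about 40 GiB, which is why a large record size keeps deduplication affordable.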

Assuming a recordsize of 128k:

If I write a randomly filled 1GB file, then write a second file that is identical except that one byte half way through is changed, will that second file be deduplicated (all except for the changed byte's block)?

Yes, only one block will not be deduplicated.
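A small simulation can illustrate this. It is only a model of checksum-based block matching, not ZFS code: it splits two buffers into fixed 128k records, hashes each with SHA-256, and compares the results. The helper `block_checksums` is hypothetical, and a 1 MiB buffer stands in for the 1GB file so that it runs quickly.

```python
import hashlib
import os


def block_checksums(data, record_size):
    """Split data into fixed-size records and SHA-256 each one."""
    return [
        hashlib.sha256(data[off:off + record_size]).digest()
        for off in range(0, len(data), record_size)
    ]


RECORD_SIZE = 128 * 1024                  # 128k records, as assumed above
original = os.urandom(8 * RECORD_SIZE)    # 1 MiB stand-in for the 1GB file

# Second file: identical except for one byte changed half way through.
modified = bytearray(original)
middle = len(modified) // 2
modified[middle] ^= 0xFF                  # flipping all bits always changes the byte
modified = bytes(modified)

a = block_checksums(original, RECORD_SIZE)
b = block_checksums(modified, RECORD_SIZE)

# Blocks at the same offset with different checksums cannot be shared.
changed = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
print(f"{len(a) - len(changed)} of {len(a)} blocks shared; changed blocks: {changed}")
# prints: 7 of 8 blocks shared; changed blocks: [4]
```

Only the record containing the changed byte gets a new checksum, so in this model every other record of the second file could reference the first file's data.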

If I write a single byte file, will it take a whole 128 kilobytes? If not, will the blocks get larger in the event the file gets longer?

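No, 128k is only the maximum. A file smaller than the recordsize is stored in a single block sized to fit its data (rounded up to the sector size), so a one-byte file occupies far less than 128k. Once the file grows beyond 128k, it is stored in 128k blocks, and more are allocated as needed.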

If a file takes two 64 kilobyte blocks (would this ever happen?), would an identical file get deduped after taking a single 128 kilobyte block?

The file will take a single 128k block, and an identical file will be deduplicated against it.
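As a hedged illustration of why the block layout matters, the sketch below hashes the same 128k of data once as a single 128k record and once as two 64k records. It is a model of checksum comparison only, not of ZFS internals, and the helper `block_checksum_set` is hypothetical. The two layouts share no checksums, so matching depends on both copies being split into records the same way.

```python
import hashlib
import os


def block_checksum_set(data, record_size):
    """Return the set of SHA-256 checksums of the fixed-size records in data."""
    return {
        hashlib.sha256(data[off:off + record_size]).digest()
        for off in range(0, len(data), record_size)
    }


data = os.urandom(128 * 1024)

as_one_128k = block_checksum_set(data, 128 * 1024)   # one record
as_two_64k = block_checksum_set(data, 64 * 1024)     # two records

# Count how many checksums the two layouts have in common.
shared = as_one_128k & as_two_64k
print(len(as_one_128k), len(as_two_64k), len(shared))
# prints: 1 2 0
```

With the same record size on both writes, as this answer assumes, both copies produce the same single checksum and the second copy is deduplicated.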

If a file is shortened, then part of its block would be ignored, and perhaps the data would not be reset to 0x00 bytes. Would a half-used block get deduped?

If the exact same block is found, yes.