Auto clean unused space on disks in VM

backup disk-image kvm-virtualization

I run a KVM-based virtualization server (namely Proxmox) where some Debian-based machines run in KVM VMs. Proxmox can create backups of VMs, and it also compresses the VM disk images.

As I understand it, backup sizes increase over time, since more data is stored on each VM disk and more 'clean' blocks of the VM disk become 'dirty' (that is, they contain remnants of old files). So even if I delete all files on such a virtual disk with rm -rf, the backup will stay the same size, because deleting files does not clear the blocks of the VM disk.

I can 'clear' the VM disk by doing something like dd if=/dev/zero of=/BIG.txt and then rm -f /BIG.txt. This way I create a big file full of zeros that uses all of the disk space, and after I delete it, its former blocks contain zeros. The downside is that for a moment the disk becomes full, which affects every program that wants to write anything.

But maybe there is some other way to fill unused disk blocks with zeros, so that the backup compresses such a disk at a better rate? Some Windows-based programs offer options to 'clear unused disk space' (e.g. CCleaner), but I need this for Linux.

Best Answer

Recent libvirt/KVM versions support the discard vdisk option (for the SCSI vdisk type only). With this option enabled, you can issue fstrim / on the guest filesystem, and unused blocks will be immediately discarded from the host VM image, compacting/reducing it via hole punching.
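As a rough illustration of the guest-side step, here is a minimal sketch. It assumes the virtual disk is already attached with discard enabled on the host. The VM ID 100 and the volume name in the comment are placeholders, and the run and trim_fs helpers are hypothetical names introduced only for this example.

```shell
#!/bin/sh
# Sketch: trim a mounted filesystem inside the guest (run as root).
# Assumption: discard is enabled for the disk on the host. On Proxmox that
# might look like this (VM ID and volume name are placeholders):
#   qm set 100 --scsi0 local-lvm:vm-100-disk-0,discard=on

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Discard the unused blocks of the given mount point (default: /).
trim_fs() {
    run fstrim -v "${1:-/}"
}
```

Calling trim_fs / as root performs the trim; DRY_RUN=1 trim_fs / only prints the command that would be run.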

See here (driver section, search for 'discard') and here for more information.

If you can't use the trim/discard method, you can continue using your current zeroing method (dd from /dev/zero), with a twist: issue two dd passes, each writing only a little more than half of the free disk space, separated by a sync; rm BIG.txt command. This should be enough to reclaim almost all of the free space without filling the disk all at once.
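The two-pass approach could be sketched roughly as follows. The 55% figure for "a little more than half", the function names, and the use of df to measure free space are assumptions made for this example. Whether the second pass lands on different blocks than the first is up to the filesystem's allocator, so treat the result as best effort.

```shell
#!/bin/sh
# Sketch: zero free space in two passes so the disk is never completely full.

# Size of one pass in MiB: 55% of the free space (the 55% is an assumption).
pass_size_mib() {
    echo $(( $1 * 55 / 100 ))
}

# Write zeros to a temporary file under the given directory (default: /),
# flush them to disk, delete the file, and repeat once.
zero_free_space() {
    dir="${1:-/}"
    file="${dir%/}/BIG.txt"
    for pass in 1 2; do
        # Available space in KiB from POSIX-format df output, converted to MiB.
        free_kib=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
        count=$(pass_size_mib $(( free_kib / 1024 )))
        dd if=/dev/zero of="$file" bs=1M count="$count"
        sync
        rm -f "$file"
    done
}
```

Run zero_free_space / as root inside the guest; at its peak this leaves roughly 45% of the previously free space available to other programs.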