Linux – Backup and Restore Using dd and gzip

backup, clone, dd, gzip, linux

I've seen various posts discussing the use of dd for creating an image of a drive and only storing 'used data'. Before posing the problem/question, let's assume a few things.

Assumptions

  1. The drive to clone/image is /dev/sda
  2. /dev/sda is 10TB in size
  3. Used space on /dev/sda is 1TB
  4. The image is stored on a remote CIFS-mounted location

Question/Problem

Using something like cp with the --sparse=always option in conjunction with dd should produce a sparse file, so that the image occupies only about 1TB:

cp --sparse=always <(dd if=/dev/sda bs=8M) /mnt/remote/location/disk.img

Alternatively, something like the following should compress all the zeroed space:

dd if=/dev/sda bs=8M | gzip -c > /mnt/remote/location/disk.img.gz

So, what is the impact of a sparse image file upon restore? Will the transferred data be 1TB, or the full 10TB including the perceived empty/zeroed space? This is obviously a consideration when assessing potential network load and time-to-restore.

P.S. I understand there are other options such as Clonezilla, and that something like ddrescue would allow resuming, but the question is specifically about using dd in the context above.

Thanks.

Best Answer

Writing to a Windows CIFS share SMB1

The word from Microsoft is: "In Windows NTFS file systems, files are not made sparse by default. The application or user needs to explicitly mark the file sparse via the FSCTL_SET_SPARSE control code." Unfortunately, the Linux client does not mark files as sparse over SMB1. Reportedly, if you first create the file as sparse on the Windows side (for example with Cygwin: dd if=/dev/zero of=BigFile bs=1M count=1 seek=150000), you can then continue writing to it as a sparse file from Linux. I believe reads will still be unoptimized.

Experiments

On RHEL6 with coreutils-8.4, cp --sparse=always local_file /mnt/cifs/file_on_cifs does not write a sparse file. When reading a file from CIFS, cp also reads the zeroed areas (no fiemap optimization). So on RHEL6 both backup and restore transfer the entire file over the network; it is better to gzip it.

The situation is the same with coreutils-8.25 on Ubuntu 14.x.
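
A check along these lines can be repeated on any mount. The sketch below is only a stand-in for the experiment, not the exact commands that were run: DEST_DIR is a hypothetical variable naming the directory to test (for example a CIFS mount point), and it defaults to a temporary local directory so the script can be tried safely. It copies a file full of zeros with cp --sparse=always and compares the apparent size with the allocated size reported by stat. Keep in mind that the block count a CIFS client reports may not reflect what the server really allocated, so checking on the server side is more reliable.

```shell
#!/bin/sh
set -eu

# Scratch area for the source file; removed on exit.
scratch=$(mktemp -d)
# DEST_DIR is a hypothetical knob: point it at the mount to test.
dest_dir=${DEST_DIR:-$scratch}
src="$scratch/local_file"
dst="$dest_dir/sparse_test.img"
trap 'rm -f "$dst"; rm -rf "$scratch"' EXIT

# 16 MiB of real (allocated) zeros as the local source file.
dd if=/dev/zero of="$src" bs=1M count=16 2>/dev/null

# Ask cp to turn runs of zeros into holes at the destination.
cp --sparse=always "$src" "$dst"

# %s = apparent size in bytes, %b = allocated blocks, %B = bytes per block.
apparent=$(stat -c %s "$dst")
blocks=$(stat -c %b "$dst")
block_size=$(stat -c %B "$dst")
allocated=$((blocks * block_size))

echo "apparent size:  $apparent bytes"
echo "allocated size: $allocated bytes"

if [ "$allocated" -lt "$apparent" ]; then
    echo "destination file looks sparse"
else
    echo "destination file is fully allocated"
fi
```

Run it once with no arguments to see the local result, then again as DEST_DIR=/mnt/cifs sh script.sh to compare against the share.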

Writing to a Windows CIFS share SMB2/SMB3

There is a 2014 patch, "Add sparse file support to SMB2/SMB3 mounts", so hopefully sparse files will be supported on mounted shares from Windows 8.1 and other platforms.

Writing to a Linux CIFS share

When you mount a Samba share from a Linux server on a Linux client, you can write sparse files even over SMB1. Reads are still not optimized.
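
Since reads are not optimized in any of these setups, compressing the stream is the more predictable way to keep zeroed space off the network. Here is a minimal sketch of the gzip round trip. It uses a small scratch file as a stand-in for /dev/sda and a temporary directory as a stand-in for the CIFS mount; both are assumptions made so the commands can be tried without touching a real disk.

```shell
#!/bin/sh
set -eu

# Temporary directory standing in for /mnt/remote/location; removed on exit.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

disk="$workdir/disk.raw"          # stand-in for /dev/sda
backup="$workdir/disk.img.gz"     # compressed image
restored="$workdir/restored.raw"  # stand-in for the restore target

# Build a 32 MiB "disk": all zeros, with the first 1 MiB overwritten
# by random data to play the role of used space.
dd if=/dev/zero of="$disk" bs=1M count=32 2>/dev/null
dd if=/dev/urandom of="$disk" bs=1M count=1 conv=notrunc 2>/dev/null

# Backup: stream the device through gzip so zeroed runs shrink.
dd if="$disk" bs=1M 2>/dev/null | gzip -c > "$backup"

# Restore: decompress and write the full image back out.
gunzip -c "$backup" | dd of="$restored" bs=1M 2>/dev/null

raw_size=$(wc -c < "$disk")
gz_size=$(wc -c < "$backup")
echo "raw image:  $raw_size bytes"
echo "compressed: $gz_size bytes"

# Verify the restored image is byte-for-byte identical to the source.
if cmp -s "$disk" "$restored"; then
    restore_status="identical"
else
    restore_status="different"
fi
echo "restored image is $restore_status"
```

Only the compressed file crosses the network in either direction. The restore still writes every block, zeros included, to the target device.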
