I am switching machines and have attached the old hard drive (/dev/sda4) to the new machine.
The old machine had a slightly smaller hard drive (720G), compared to the new one (736G), so I created a slightly larger partition as well.
So I then ran rsync to copy all the data to the new partition, as shown below:
linux-70e2:/ # time rsync -azprvl /mnt/external-disk/foo /media/sda4/
...
sent 169,237,139,987 bytes received 24,529 bytes 24,419,185.41 bytes/sec
total size is 190,542,953,489 speedup is 1.13
real 115m30.297s
user 112m13.068s
sys 3m59.996s
The data gets copied without errors.
However, when I do:
du -h -m -s /mnt/external-disk/foo /media/sda4/foo
I get:
162414 /mnt/external-disk/foo
181721 /media/sda4/foo
Could somebody please explain this massive difference? Why am I not getting the same results? This has been driving me nuts for days now. There are a few other partitions, and I'm getting similar discrepancies there as well.
Both partitions are ext4.
linux-70e2:/ # mount | grep sda4
/dev/nvme0n1p5 on /media/sda4 type ext4 (rw,relatime,data=ordered)
/dev/sda4 on /mnt/external-disk type ext4 (rw,nosuid,nodev,relatime,data=ordered,uhelper=udisks2)
To my knowledge, there is nothing wrong with either drive; both are SSDs, and one of them is brand new. I've run e2fsck on both of them.
In addition, I ran:
find -L /mnt/external-disk/foo -type l
and this doesn't list any symlinks below the source directory.
This is not my first time using rsync for this kind of thing, but I've never had this kind of issue before. Please advise!
Best Answer
The discrepancy is most likely caused by files being stored more sparsely on the old disk.
Anyway, let's first check that the file and inode counts are the same:
find <path> | wc -l on both mountpoints. Is the number of files/directories the same?
df -i. Is the number of inodes the same?
If the answer to both questions is yes, then the difference can be explained by files being more sparse on the old disk. But what are sparse files? In short, sparse files are normal files which take up less disk space than their size suggests. This is possible thanks to a feature of (relatively) modern filesystems which, instead of writing all the zeroes to disk, simply record that "this file (or part of it) is full of zeroes, so there is no need to store them".
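The checks above can be scripted for both trees at once. The following is a minimal sketch, assuming GNU find and du; the function name compare_trees is made up for illustration. It prints the entry count together with the allocated size and the apparent size (du --apparent-size reports file lengths rather than blocks in use), so a gap between the two figures hints at sparse files:

```shell
#!/bin/sh
# Hypothetical helper: for each directory given as an argument, print the
# number of entries, the allocated size and the apparent size (both in MiB).
compare_trees() {
    for dir in "$@"; do
        entries=$(find "$dir" | wc -l | tr -d ' ')
        allocated=$(du -s -m "$dir" | cut -f1)
        apparent=$(du -s -m --apparent-size "$dir" | cut -f1)
        printf '%s: entries=%s allocated=%sM apparent=%sM\n' \
            "$dir" "$entries" "$allocated" "$apparent"
    done
}

# Example call with the paths from the question:
# compare_trees /mnt/external-disk/foo /media/sda4/foo
```

If the entry counts and apparent sizes match while the allocated sizes differ, the data is most likely identical and only the on-disk allocation differs.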
By default, du reports the real space taken by the file, not its apparent size. To show the apparent size, use du --apparent-size (for other options, please see the du manpage).
For a practical example, you can create a sparse file using the command
truncate test.img -s 1G
As reported by ls, the newly created file is 1 GB in size, but if you try du -hs test.img, you'll see a very, very small size (possibly even zero!). How is this possible? As stated above, modern filesystems sometimes "lie" to applications, reporting a file size for which no space has actually been allocated. On the other hand, du -hs --apparent-size test.img will print the same size as ls.
As you start writing into a sparse file, the filesystem will dynamically allocate the required space. For example, issuing
dd if=/etc/services of=test.img conv=notrunc,nocreat
will write some data into the previously all-sparse test.img file. Now, running du -hs test.img will report the ~600 KB allocated for data storage (the exact figure depends on the size of /etc/services on your system).
An obvious, but very important, implication is that sparse file support can only optimize zero-filled files (or zero-filled parts of files). The moment you write to a file, its allocated space begins to grow. This is true even if you write more zeroes to the file, unless the application knows how to handle sparse files (in that case the application skips the zero-filled regions, typically by seeking past them rather than writing them, so the filesystem never allocates space for them).
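To see this in practice, here is a small sketch, assuming GNU coreutils (truncate, dd, and cp with its --sparse option) and a filesystem that supports sparse files. The file names are arbitrary, and everything happens in a scratch directory. It contrasts a hole, explicitly written zeroes, and a sparse-aware copy of those zeroes:

```shell
#!/bin/sh
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 1) A hole: 8 MiB apparent size, with (almost) nothing allocated.
truncate -s 8M hole.img

# 2) Explicit zeroes: dd really writes 8 MiB of zero bytes.
dd if=/dev/zero of=zeros.img bs=1M count=8 2>/dev/null

# 3) Sparse-aware copy: GNU cp detects the zero runs and recreates holes.
cp --sparse=always zeros.img resparsed.img

# Allocated size first, then apparent size, both in KiB.
du -k hole.img zeros.img resparsed.img
du -k --apparent-size hole.img zeros.img resparsed.img
```

All three files have identical contents and the same apparent size; on most filesystems only zeros.img should show the full 8 MiB as allocated.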
What if you want to really preallocate some space? Then you can use
fallocate test.img -l 1G
If you execute ls; du -hs test.img; du -hs --apparent-size test.img, you'll see that all tools report the very same size, because the file was really fully allocated by the fallocate call.
In short, it is possible that, during the copy, some files were recreated in a less sparse manner, with their sparse sections replaced by "real" zeroes. To preserve sparse files with rsync, you have to use the -S option.
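To find out which files are affected, one option is GNU find's %S format directive, which prints a file's sparseness (allocated bytes divided by apparent size). The sketch below assumes GNU find and awk; the function name list_sparse is made up for illustration. A ratio below 1 only suggests a sparse file, since some filesystems also report few or no blocks for very small or compressed files:

```shell
#!/bin/sh
# Hypothetical helper: list non-empty regular files under a directory whose
# allocated size is smaller than their apparent size (likely sparse files).
# Output columns: sparseness ratio, apparent size in bytes, path.
list_sparse() {
    LC_ALL=C find "$1" -type f -size +0 -printf '%S\t%s\t%p\n' |
        LC_ALL=C awk -F'\t' '$1 < 1.0'
}

# Example call with the source path from the question:
# list_sparse /mnt/external-disk/foo
```

A fresh copy that keeps the holes could then be made with something like rsync -avS /mnt/external-disk/foo /media/sda4/. Note that rsync skips files that already look up to date on the destination, so the earlier copy would have to be removed first for -S to have any effect.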