Rsync size is different between source and destination

ext4 rsync xfs

I'm using rsync with the following options:

-r for recursive
-l copy symlinks as symlinks
-t preserve modification time
-D preserve devices and specials
-v verbose
--prune-empty-dirs
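
Put together, the options above would correspond to a command line roughly like the following sketch. The two paths are placeholders, not the real ones, and the script only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical paths; substitute the real source and destination.
# A trailing slash on the source means "copy the contents of this directory".
SRC=/path/to/source/
DEST=/path/to/dest/

# -r recursive, -l symlinks as symlinks, -t modification times,
# -D devices and specials, -v verbose
OPTS="-rltDv --prune-empty-dirs"

# Build and print the command rather than executing it.
CMD="rsync $OPTS $SRC $DEST"
echo "$CMD"
```

Note that this set is close to -a (which expands to -rlptgoD) but leaves out -p, -g and -o, and that neither form includes -H or -S.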

The source filesystem is ext4 and the destination is XFS. I've copied a few hundred folders, ranging from a few hundred GB to a few TB each, and every one of them ended up within 1 GB of its source size. However, this particular folder is 264 GB on the source and 286 GB once I rsync it across. That is a huge difference, and I don't know what is causing it.

If the source ext4 FS has some corruption, is it possible that it isn't reporting the correct disk usage? I'm using 'du -skh'.

I've deleted the whole destination copy and restarted the transfer 3 times, and it yields the same result every time.

Best Answer

The most likely cause is hard links. By default, rsync copies each hard-linked name as a separate file, so two names that share one inode on the source become two independent files on the target and take up twice the disk space. If you want to preserve hard links, add the -H/--hard-links option.
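
One way to check whether hard links are involved is to count the regular files on the source whose link count is greater than one. The following is a minimal sketch that builds a small scratch directory to show the idea; the directory and file names are made up for the demonstration, and on a real system SRC would point at the source folder instead.

```shell
#!/bin/sh
# Scratch directory for the demo (hypothetical); use the real source path in practice.
SRC=$(mktemp -d)

printf 'data\n' > "$SRC/a"
ln "$SRC/a" "$SRC/b"           # b is a second name for the same inode as a
printf 'other\n' > "$SRC/c"    # ordinary file with a link count of 1

# Count regular files that have more than one hard link.
hardlinked=$(find "$SRC" -type f -links +1 | wc -l | tr -d ' ')
echo "files with extra hard links: $hardlinked"

# Remove the scratch directory again.
rm -rf "$SRC"
```

If the count is non-zero on the real source, rerunning the copy with -H added to the existing options should keep those links intact on the destination.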

The next most likely issue is sparse files. By default, rsync does not write any files as sparse files on the destination, even if they are sparse on the source (it can't actually tell). If you have sparse files (most commonly virtual machine images and incomplete p2p downloads), then you will want to use the -S/--sparse option.
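
A sparse file shows up as a gap between its apparent size and the space actually allocated for it. The sketch below illustrates this on a throwaway file. It assumes GNU coreutils (truncate, and du with --apparent-size) and a filesystem that supports holes; the file name is invented for the demonstration.

```shell
#!/bin/sh
# Scratch directory for the demo (hypothetical).
DIR=$(mktemp -d)

# Create a 10 MiB file that is one big hole: no data is written to it.
truncate -s 10M "$DIR/sparse.img"

# Apparent size is the length of the file; plain du reports allocated space.
apparent_kb=$(du -k --apparent-size "$DIR/sparse.img" | cut -f1)
allocated_kb=$(du -k "$DIR/sparse.img" | cut -f1)
echo "apparent: ${apparent_kb} KiB, allocated: ${allocated_kb} KiB"

# Remove the scratch directory again.
rm -rf "$DIR"
```

Running the same two du commands on the real source and destination folders is a quick way to tell whether sparse files account for the difference: if the apparent sizes match but the allocated sizes do not, holes were probably filled in during the copy.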