For large files, compress first then transfer, or rsync -z? Which would be fastest?

backup, compression, rsync

I have a ton of relatively small data files, but together they take up about 50 GB, and I need them transferred to a different machine. I was trying to think of the most efficient way to do this.

The options I considered were: gzip the whole thing, rsync it, and decompress on the other end; rely on rsync -z for compression; or gzip first and also use rsync -z. I am not sure which would be most efficient, since I am not sure exactly how rsync -z is implemented. Any ideas on which option would be the fastest?
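For reference, the three options might look roughly like this on the command line. The paths and the remote host are placeholder assumptions, so the remote steps are shown as comments; only the local tar step actually runs here:

```shell
# Make a small sample directory so the tar step below actually runs.
mkdir -p files
printf 'hello\n' > files/a.txt

# Option 1: compress first, transfer, then decompress on the other side.
tar -czf files.tar.gz files
# scp files.tar.gz user@remote:/dest/   # remote host is a placeholder

# Option 2: let rsync compress in flight (-z) and copy only changed files.
# rsync -avz files/ user@remote:/dest/files/

# Option 3: pre-gzipped data plus rsync -z double-compresses the stream,
# which is usually wasted CPU for little or no size gain.
# rsync -avz files.tar.gz user@remote:/dest/
```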

Best Answer

You can't "gzip the whole thing" directly, as gzip only compresses a single file. You could create a tar archive and gzip that to "gzip the whole thing", but you would lose rsync's ability to copy only modified files.

So the question is: is it better to store the files you need to rsync already gzipped, or to rely on the -z option of rsync?
Presumably you don't want the files stored gzipped on your server? I would guess not, in which case I don't see how you could gzip the files before doing the rsync.

Maybe you don't need rsync's ability to copy only modified files? In that case, why use rsync at all instead of just scp-ing a tar.gz file containing your stuff?
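If the delta-copy feature really isn't needed, another common variant is to stream the tarball over ssh without ever writing an intermediate archive to disk. The remote host below is a placeholder assumption; the same pipeline is demonstrated locally so it can actually run:

```shell
# Prepare sample data (illustrative only).
mkdir -p src out
printf 'payload\n' > src/file.txt

# One-shot remote version (user@remote and /dest are placeholders):
# tar -czf - -C src . | ssh user@remote 'tar -xzf - -C /dest'

# The same pipeline demonstrated locally, without the ssh hop:
tar -czf - -C src . | tar -xzf - -C out

# The extracted copy should be identical to the original.
diff src/file.txt out/file.txt
```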

Anyway, to answer the question: rsync's compression will be a little less efficient than gzipping the files with gzip itself. Why? Because rsync compresses the data chunk by chunk, so each compression table is built from a smaller set of data, while gzip works on the whole file at once and can build a better compression table from the larger input. In most cases the difference will be very small, but in rare cases it can be more significant, for example a very large file with a very long pattern repeating many times but far apart from each other (this is a very simplified example).