Rsync huge dataset of small files 5TB, +M small files

copy rsync transfer

I ran into a situation where an app server misconfiguration led to the creation of around 5TB of data, where each directory contains a huge number of small files.
We are in the process of transferring the files and changing the application, but rsync fails to transfer the data. It fails even locally, between local drives. I managed to copy only 3.5G overnight!
I tried changing the rsync switches, still with no luck.
Here is what is currently running on the server, without any progress indication:
rsync -avhWc --no-compress --progress source destination
Some people suggested gigasync, but its GitHub repository and website are unavailable.
Can anybody suggest a method to transfer the files?
I'd appreciate any help.

Best Answer

Try xargs+rsync:

 find . -type f -print0 | xargs -J % -0 rsync -aP % user@host:some/dir/

You can control how many files are passed as sources to each rsync call with -n. E.g., to copy 200 files per rsync call:

 find . -type f -print0 | xargs -n 200 -J % -0 rsync -aP % user@host:some/dir/
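Before starting a long transfer, it can help to see how the files will be grouped. The sketch below reuses the same find/xargs shape but replaces rsync with an echo, so each batch only reports how many files it would carry. The function name preview_batches and its arguments are made up for illustration; they are not part of rsync or xargs.

```shell
# preview_batches DIR [FILES_PER_CALL]
# Hypothetical helper: prints one line per batch that xargs would hand to rsync.
# The body runs in a subshell ( ... ) so the caller's directory and variables are untouched.
preview_batches() (
    dir=$1
    per_call=${2:-200}
    cd "$dir" || exit 1
    # xargs appends up to $per_call paths after the "_" placeholder ($0),
    # so "$#" inside the inner sh is the number of files in that batch.
    find . -type f -print0 |
        xargs -0 -n "$per_call" sh -c 'echo "$# files"' _
)
```

For example, `preview_batches /data/source 200 | wc -l` would give the number of rsync invocations to expect (the path is a placeholder). Note that xargs may also start a new batch earlier if the command line would otherwise get too long.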

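If it's too slow, you can run multiple copies of rsync in parallel with the xargs -P option (not to be confused with the -P in rsync -aP, which only enables progress and partial-file handling):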

 find . -type f -print0 | xargs -P 8 -n 200 -J % -0 rsync -aP % user@host:some/dir/

This will start 8 copies of rsync in parallel.
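
Two caveats on the commands above. The -J option is specific to BSD xargs (macOS, FreeBSD); GNU xargs on Linux does not have it. Also, when files are passed to rsync as individual sources, they all land directly in the destination directory, so the subdirectory layout is lost unless you add -R (--relative). The following is a rough sketch of a variant that should work with both GNU and BSD xargs and keeps the layout. The function name batched_rsync and its arguments are invented for this example; adjust the rsync flags to taste.

```shell
# batched_rsync SRC_DIR DEST [FILES_PER_CALL] [PARALLEL_JOBS]
# Hypothetical helper. DEST should be a remote spec (user@host:some/dir/) or an
# absolute local path, because the function changes into SRC_DIR before copying.
# The body runs in a subshell ( ... ) so the caller's directory and variables are untouched.
batched_rsync() (
    src=$1
    dest=$2
    per_call=${3:-200}
    jobs=${4:-8}
    cd "$src" || exit 1
    # find emits NUL-terminated relative paths such as ./sub/file.
    # xargs groups them and starts up to $jobs inner shells at a time.
    # In the inner sh, "$0" is the destination and "$@" is the batch of files,
    # which lets the destination come last without needing BSD's -J.
    # -R (--relative) makes rsync recreate ./sub/ under the destination.
    find . -type f -print0 |
        xargs -0 -n "$per_call" -P "$jobs" \
            sh -c 'rsync -aR "$@" "$0"' "$dest"
)
```

Usage would look like `batched_rsync /data/source user@host:some/dir/ 200 8`, where the path and host are placeholders.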