Rsync – Why Is Backup with Rsync Painfully Slow?

Tags: amazon-web-services, backup, io, linux, rsync

I'm using rsync to back up an NFS server hosted in AWS to another EC2 instance in the same AZ. The rsync command that I run on the backup server is the following:

 rsync -avzb --backup-dir=someDirectory 172.19.0.151:/origin/* /opt/destination/

My backup consists of several thousand, or maybe more than a million, small files (PDFs of around 200-500 KB).

The issue is that the incremental file list is sent very quickly (so far so good), but once rsync starts copying files it is painfully slow: it copies about 20 files, stops for 3 or 4 minutes, copies some more files, and so on.

I'm running the rsync process from a crontab every 2 hours, and most of the time I end up with a very long list of unfinished rsync processes, which forces me to reboot the server.

This is my iowait:

iostat 
Linux 4.4.0-1060-aws (prd-turecibo-backup)  06/06/2018  _x86_64_    (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.29    0.00    0.74   84.56    0.00   14.41

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
nvme1n1           0.22         6.00         0.00      20984          0
nvme2n1         240.45       959.12        17.95    3354430      62788
nvme0n1           3.43        65.19         2.03     227995       7112

Usually iowait is at 90%.

I've replaced the EBS volume and the problem persists. I also upgraded the instance, but no luck.

Any ideas?

Best Answer

"I'm running the rsync process in a crontab every 2 hours and most of the time I have a very long list of unfinished rsync processes, which forces me to reboot the server."

Start by making sure that only one instance of the rsync backup runs at a time,
e.g. by using the flock command in the shell script that cron calls.
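
As a rough sketch of the flock approach, the wrapper below runs a command only if nobody else holds the lock, so a cron run that starts while the previous backup is still going simply exits instead of piling up. The function name run_exclusive and the lock file path in the commented usage are assumptions made for illustration; adapt them to your setup.

```shell
#!/bin/sh
# Sketch only: run_exclusive and the lock path used below are hypothetical names.

run_exclusive() {
    lockfile=$1
    shift
    # -n: do not wait for the lock. If another process already holds it,
    # flock fails right away (exit status 1 by default) and the command
    # is not started. Otherwise the exit status is that of the command.
    flock -n "$lockfile" "$@"
}

# Possible use from the script that cron calls (same rsync command as in the question):
# run_exclusive /var/lock/rsync-backup.lock \
#     rsync -avzb --backup-dir=someDirectory 172.19.0.151:/origin/* /opt/destination/
```

The lock is released automatically when the wrapped command exits, even if it crashes, so no stale lock file cleanup should be needed.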

Alternatively, you may use run-one if your distribution provides it. [suggested by Michael - sqlbot]
It seems to be available on Ubuntu.
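
A minimal sketch of the run-one variant follows. run-one takes the command and its arguments and refuses to start a second copy while an identical invocation is still running. The function name backup_once is a hypothetical helper, and the check for whether run-one is installed is an added assumption, not part of the original suggestion.

```shell
#!/bin/sh
# Sketch only: backup_once is a hypothetical helper name.

backup_once() {
    # Bail out with a clear error if run-one is not available on this system.
    if ! command -v run-one >/dev/null 2>&1; then
        echo "run-one is not installed" >&2
        return 127
    fi
    # run-one runs the command unless the same command with the same
    # arguments is already running.
    run-one "$@"
}

# Possible use from the script that cron calls (same rsync command as in the question):
# backup_once rsync -avzb --backup-dir=someDirectory 172.19.0.151:/origin/* /opt/destination/
```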