Linux – rsync takes too long to run

linux, load balancing, rsync

I have a load-balanced setup with two servers that mirror each other. The main use of the balancer is serving static files. Let's call them Server A and Server B.

Server A retrieves files from a remote host on a different network. The remote files are media files for a community website, so rsync needs to run every 30 minutes to keep the files in sync. Otherwise users will see broken images, etc. Server A also serves the files over HTTP, peaking at 400MB/S.

Server B rsyncs the files from Server A. To keep the two servers consistent, this rsync also runs every 30 minutes. Server B also serves the files over HTTP, peaking at 400MB/S.

The load on A and B has been very high: load average 8.00, 8.10, 7.68 and more.

How can I improve my setup to reduce server load and make rsync more efficient?

Thank you.

Best Answer

It depends on what is causing the high processor utilization. If it is caused by rsync generating file checksums, there are some things you can do.
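Before changing any options, it may help to check whether rsync is really what is using the CPU. The following is a rough sketch for a Linux host, not a full diagnosis: it prints the load averages next to the core count, then lists the busiest processes. The `ps` options assume the procps version of `ps`.

```shell
#!/bin/sh
# Read the 1-, 5- and 15-minute load averages (Linux only).
read -r load1 load5 load15 _ < /proc/loadavg
cores=$(nproc)
echo "load averages: $load1 $load5 $load15 on $cores core(s)"

# List the five processes using the most CPU; look for rsync near the top.
# Assumes procps ps; other ps implementations may not accept --sort.
ps -eo pcpu,pid,comm --sort=-pcpu | head -n 6
```

If rsync does not show up near the top while a sync is running, the checksum advice below is unlikely to be the main fix.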

You may not need checksums at all. By default, rsync decides a file is different based on modification time and file size. If you add the "-c" option, it will decide a file is different by comparing checksums. Omit the option if you don't need checksums.
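To see the difference between the two modes, here is a small self-contained sketch that works in a temporary directory (the file name and contents are made up for illustration). It changes a file's content while keeping its size and modification time, so the default quick check skips it and only "-c" notices the change.

```shell
#!/bin/sh
# Demo in a throwaway directory: quick check (size + mtime) versus -c (checksums).
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
printf 'aaaa\n' > "$work/src/f.txt"
rsync -a "$work/src/" "$work/dst/"          # initial copy, preserves mtime

# Change the content but keep the same size and modification time.
printf 'bbbb\n' > "$work/src/f.txt"
touch -r "$work/dst/f.txt" "$work/src/f.txt"

rsync -a "$work/src/" "$work/dst/"          # quick check: sees no difference
after_quick=$(cat "$work/dst/f.txt")

rsync -a -c "$work/src/" "$work/dst/"       # checksums: detects the change
after_checksum=$(cat "$work/dst/f.txt")

echo "default: $after_quick, with -c: $after_checksum"
rm -rf "$work"
```

For media files that are uploaded once and rarely edited in place, this kind of same-size, same-mtime change is unusual, which is why dropping "-c" is often safe and saves reading every file on every run.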

If you do need checksums, there are some circumstances where checksum caching may work. If the files you are syncing do not change often, you can generate the checksums once per day in a cron job, and rsync will use the generated checksums. Rsync will still generate checksums for any new files or for any files that have a different modification time or size from when the checksum was created.

This info is based on rsync 3.0.5 but should work the same in 3.0.6. You'll need to recompile rsync; the checksum caching is a patch. Here's what I used to compile rsync:

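# Build rsync with the checksum-reading patch applied.
rsync_version="3.0.5"
scriptroot="Set this to your working directory."  # placeholder: replace with a real path
mkdir -p "$scriptroot/rsync-source/rsync-working"
cd "$scriptroot/rsync-source/rsync-working" || exit 1
# Both tarballs are expected in $scriptroot/rsync-source.
# They unpack into the same rsync-${rsync_version} directory.
tar xvzf "../rsync-${rsync_version}.tar.gz"
tar xvzf "../rsync-patches-${rsync_version}.tar.gz"
cd "$scriptroot/rsync-source/rsync-working/rsync-${rsync_version}" || exit 1
# Apply the checksum caching patch, then build.
patch -p1 < patches/checksum-reading.diff
./configure
make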

Then use rsyncsums to generate the checksums. When invoking rsync, use the "--sumfiles=lax" option.
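Put together, the daily checksum job and the regular sync might look roughly like the crontab sketch below. Everything specific in it is an assumption for illustration: the path /srv/media, the host name serverB, the schedule, and the "-r" and "-m lax" arguments to rsyncsums, which should be confirmed against "rsyncsums --help" from your patched build. It also assumes the patched rsync is installed wherever checksums are computed.

```
# Hypothetical crontab entries; paths, host and schedule are placeholders.

# Once a day at 03:00: regenerate the cached checksum files.
# The -r and -m flags are assumed; confirm with `rsyncsums --help`.
0 3 * * *     rsyncsums -r -m lax /srv/media

# Every 30 minutes: checksum-based sync that reads the cached sums.
*/30 * * * *  rsync -a -c --sumfiles=lax /srv/media/ serverB:/srv/media/
```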