Maximizing Rsync Performance and Throughput for Directly-Connected Gigabit Servers

linux networking optimization rsync

I have two Dell R515 servers running CentOS 6.5, with one of the Broadcom NICs in each directly attached to the other.
I use the direct link to push backups from the main server in the pair to the secondary every night using rsync over SSH.
Monitoring the traffic, I see throughput of ~2 MB/s, which is far less than I'd expect from a gigabit port.
I've set the MTU to 9000 on both sides, but that didn't seem to change anything.

Is there a recommended set of settings and optimizations that would get me to the maximum available throughput? Since I am using rsync over SSH (or potentially just NFS) to copy millions of small files (~6 TB in total, a huge Zimbra mailstore), the optimizations I am looking for may need to be specific to my use case.

I am using ext4 on both sides, if that matters.

Thanks

EDIT: I've used the following rsync options with pretty much similar results:

rsync -rtvu --delete source_folder/ destination_folder/

rsync -avHK --delete --backup --backup-dir=$BACKUPDIR source_folder/ destination_folder/

I currently see the same poor performance when using cp to an NFS export over the same direct cable link.

EDIT2: After the sync finished, I ran iperf and measured around 990 Mbit/s, so the link itself is fine. The slowness was due to the dataset being transferred.

Best Answer

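The file count and SSH encryption overhead are likely the biggest barriers. With millions of small files, per-file metadata operations and disk seeks dominate rather than raw bandwidth, so you're not going to see wire speed on a transfer like this.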
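
To put the gap in perspective, here is a rough back-of-the-envelope calculation using the figures from the question. It assumes "~2MBps" means megabytes per second and "~6Tb" means decimal terabytes; the variable names are just illustrative.

```shell
# Rough transfer-time estimate: observed rsync rate vs. the rate iperf measured.
dataset_bytes=6000000000000   # ~6 TB (assumed decimal terabytes)
observed_bps=2000000          # ~2 MB/s seen during the rsync run (assumed bytes/s)
link_bits=990000000           # ~990 Mbit/s reported by iperf

link_bps=$((link_bits / 8))   # convert bits/s to bytes/s

# Integer division, so both results are rounded down.
observed_days=$((dataset_bytes / observed_bps / 86400))
link_hours=$((dataset_bytes / link_bps / 3600))

echo "at the observed rate: about $observed_days days"
echo "at the iperf rate:    about $link_hours hours"
```

A full copy at the observed rate would take weeks, while the link could in theory carry it in well under a day, so the bottleneck is clearly not the network itself.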

Options to improve include:

  • Using rsync+SSH with a less costly encryption algorithm (e.g. -e "ssh -c arcfour")
  • Eliminating encryption entirely over the SSH transport with something like HPN-SSH.
  • Block-based transfers: snapshots, dd, ZFS snapshot send/receive, etc.
  • If this is a one-time or infrequent transfer, using tar, netcat (nc), mbuffer or some combination.
  • Checking your CentOS tuned-adm profile.
  • Removing atime updates from your filesystem mounts (noatime), and examining other filesystem mount options.
  • Tuning NIC send/receive buffers.
  • Tuning your rsync command. Would -W, the whole-file option, make sense here? Is compression enabled?
  • Optimizing your storage subsystem for this type of transfer (SSDs, spindle count, RAID controller cache).
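
As a minimal sketch of the tar-based option, the script below packs a tree into a single tar stream and unpacks it elsewhere, rather than handling each file as a separate transfer. It runs over a local pipe on throwaway data so it can be tried safely. The function name stream_tree and the demo file names are made up for illustration, and the paths, host name and port in the comments are hypothetical.

```shell
#!/bin/sh
# Sketch: move many small files as one tar stream instead of one file at a time.

# stream_tree SRC DST: pack SRC into a tar stream and unpack it under DST.
stream_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    # Local pipe for demonstration only. Across the direct link, the two halves
    # would run on different hosts, joined by nc (or ssh), roughly like this
    # (hypothetical paths, host and port; nc listen syntax varies by version):
    #   receiver:  nc -l 9999 | tar -C /backup/store -xpf -
    #   sender:    tar -C /opt/zimbra/store -cf - . | nc backup-host 9999
    tar -C "$src" -cf - . | tar -C "$dst" -xpf -
}

# Demo on throwaway data: 200 small files standing in for a mailstore.
demo_src=$(mktemp -d)
demo_dst=$(mktemp -d)
i=1
while [ "$i" -le 200 ]; do
    printf 'message %s\n' "$i" > "$demo_src/msg_$i.eml"
    i=$((i + 1))
done

stream_tree "$demo_src" "$demo_dst"

# Compare file counts as a basic sanity check.
src_count=$(find "$demo_src" -type f | wc -l | tr -d ' ')
dst_count=$(find "$demo_dst" -type f | wc -l | tr -d ' ')
echo "source files: $src_count, copied files: $dst_count"
```

Unlike rsync, this copies everything every time and sends data unencrypted when used with nc, so it suits an initial seed over a private link more than a nightly incremental run.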