Optimizing file transfers over a site-to-site VPN connection

scp

I've got a site-to-site VPN connection between two data centers (one in San Jose, the other in Toronto).

I need to send a 32 GB file from one DC to the other as fast as possible.

I've found a shell script that splits the 32 GB file into smaller files and then transfers them in parallel with scp.
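For illustration, that approach looks something like this (the 512 MB chunk size, the parallelism of 8, and the host/paths are placeholders, not the actual script):

split -b 512M bigfile.img chunk_                                  # cut 32 GB into 64 pieces
ls chunk_* | xargs -P 8 -I{} scp -q {} user@toronto:/data/chunks/ # 8 scps at a time
ssh user@toronto 'cat /data/chunks/chunk_* > /data/bigfile.img'   # reassemble
# verify with md5sum on both ends before deleting the chunks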

The question is: how do I determine the optimal chunk size for sending these smaller files across the site-to-site VPN connection? I'd like to maximize bandwidth utilization.

Obviously, the more scp processes I run, the more load I put on the server.

Best Answer

Forget about the site-to-site VPN for a minute: as long as it's IPsec and your endpoints are not toasters, it's unlikely to be the bottleneck in itself. Instead, have a quick look at bbcp:

http://www.slac.stanford.edu/~abh/bbcp/

Here is a line from the Perl script we used during our last migration, which had the same requirement as yours, i.e. move the data fast:

# -s 16: sixteen parallel streams; -P 10: progress every 10 seconds;
# -a: write checkpoints so a failed copy can resume. %% escapes % for sprintf.
sprintf('/usr/local/bin/bbcp -a -F -s 16 -P 10 -T "ssh -x -a -oFallBackToRsh=no %%I -l %%U %%H /usr/local/bin/bbcp" -d . -v %s %s:%s',
  join(' ', @files_to_copy), $remote_host, $destination_dir);

Play with the options, especially the number of parallel streams (-s).
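If you want to try it by hand before scripting it, a minimal direct invocation looks something like this; the stream count, window size, and host/paths below are assumptions to tune, not recommendations:

bbcp -s 16 -w 2M -P 10 -v /data/bigfile.img toronto-host:/data/bigfile.img
# -s 16: sixteen parallel TCP streams
# -w 2M: 2 MB window per stream (size it toward your bandwidth-delay product)
# -P 10: progress report every 10 seconds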

Questions you will want answered are:

  • What is the latency of the link?
  • What is the packet jitter likely to be?
  • What is the total maximum bandwidth I can expect? (Together with latency, this gives the bandwidth-delay product; see the worked example below.)
  • Who/what else will I stomp on by hogging the entire link?
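
On the first and third points: latency times bandwidth gives the bandwidth-delay product, the amount of data that must be in flight to fill the pipe, which in turn suggests how many streams (or how big a window) you need. A back-of-the-envelope sketch, assuming a 1 Gbit/s link and a 60 ms round trip (both figures are assumptions; measure your own with ping):

# Bandwidth-delay product: bytes in flight needed to fill the link.
# Assumed figures: 1 Gbit/s bandwidth, 60 ms round-trip time.
echo $(( 1000000000 / 8 * 60 / 1000 ))   # 7500000 bytes, about 7.5 MB
# With ~256 KB of window per stream that's roughly 30 streams,
# or fewer streams with larger -w windows.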

With the right flags, bbcp should be able to max out any link to the point where CPU becomes your bottleneck. Good luck.