Are there good ways to transfer very large files (2TB) over the Atlantic

file-transfer internet networking

My company regularly needs to send about 2TB of data from the US to the UK (the size of the compressed delta is 2TB). Even though each side has good internet connectivity, sending files directly is too slow and unreliable. At 1MB/s, the transfer would take more than 20 days, if it completes without error.
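The 20-day figure can be checked with a quick back-of-the-envelope calculation. The sketch below assumes decimal units (2 TB = 2,000,000 MB) and a perfectly sustained rate; `transfer_days` is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Estimate transfer time in days for a given size and sustained rate.
# Assumption: decimal units, i.e. 1 TB = 1,000,000 MB.
# Usage: transfer_days SIZE_IN_MB RATE_IN_MB_PER_S
transfer_days() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v size="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", size / rate / 86400 }'
}

# 2 TB at 1 MB/s
transfer_days 2000000 1
```

Any stall or restart from scratch only pushes the real figure higher, which is why resumability matters as much as raw speed here.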

As a result, the best solution found so far is to "ship the brick", i.e. sending a hard drive by regular mail.

I was wondering if there exists any sort of service that offers better network connectivity across continents? I considered going through AWS S3, but their outbound transfer prices are quite expensive…

Note: the problem is not the software. We use rsync already. It works well and is robust. The problem is the speed and reliability of over-the-Atlantic internet connections. As an answerer said, a dedicated link is not in our budget. What I'm looking for is a cost-effective solution that would be a little more practical than shipping a disk.

Best Answer

Well, your question lacks some important information:

  • Can the file(s) change between adjacent attempts at transferring them?
  • What platforms do the server and the client run on?

Still, a couple of options:

  • Plain old HTTP supports download resumption — via the Range field in the request header.

    So if you have a server that supports it (virtually any production web server does, including nginx, Apache, lighttpd and many others), receiving the whole file amounts to running something like this on the client:

    while true; do
        wget -nd -c http://server:port/path/to/the/file && break
    done
    
  • Advanced software such as rsync supports resuming file transfers, using techniques that can synchronize two directory hierarchies even when files are updated between adjacent synchronization sessions.

    I'm not sure, but on Windows™, robocopy should be able to serve as a poor man's rsync: it's not that good at handling updates on the source side, but IIRC it's able to resume transfers.

  • There exist other "do-it-no-matter-what" synchronization tools such as Syncthing.

Note that HTTP and robocopy expect regular network connectivity between the server and the client; if it's provided by a VPN, you might need to look at tuning its performance.

rsync is able to use SSH to spawn and talk to the remote rsync instance, and you might need to tweak that SSH call to make it use the fastest available cipher, turn off compression, etc.
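As a rough sketch of what such a tweaked call might look like, the snippet below wraps rsync in a retry loop and passes a custom SSH command via `-e`. The host `user@uk.example.com`, the `/data/delta/` paths, and the helper names `retry` and `sync_delta` are placeholders invented for illustration. The cipher `aes128-gcm@openssh.com` is only a guess at what is fast on hardware with AES acceleration; `ssh -Q cipher` lists what your build actually supports, and it is worth benchmarking a few.

```shell
#!/bin/sh
# Retry a command until it succeeds, sleeping between attempts.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry() {
    max=$1; delay=$2; shift 2
    attempt=1
    while ! "$@"; do
        # Give up once the attempt budget is exhausted.
        if [ "$attempt" -ge "$max" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Hypothetical host and paths; adjust to your setup.
sync_delta() {
    # --partial keeps partially transferred files so a retry can reuse them.
    # --timeout aborts a stalled session so the retry loop can take over.
    # The data is already compressed, so SSH compression is switched off.
    rsync -a --partial --timeout=60 \
        -e "ssh -c aes128-gcm@openssh.com -o Compression=no" \
        /data/delta/ user@uk.example.com:/data/delta/
}

# Example invocation (left commented out): up to 100 attempts, 30 s apart.
# retry 100 30 sync_delta
```

The attempt cap is there so that a permanent failure (wrong credentials, full disk) eventually surfaces instead of looping forever.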