Estimating the time needed to copy a big file

cp scp

I need to copy a fairly large file (6GB) to another server. To keep it consistent, I have to 'lock' the file and not use it during the transfer. I obviously want to minimize this downtime, but first I want to estimate how long the copy will take.

I can see two solutions:

  • Copy the file locally and then transfer the copy over the network (local). I think this gives the smallest downtime, but I may be wrong, since the disk has to read and write the same data at the same time. Even so, I have no idea how to estimate how long the local copy will take. Creating a 6GB test file and copying it might be a meaningful test, but how can I do that?

  • Copy the file over the network directly (scp). This seems slow, as network access is orders of magnitude slower than disk writing, but again, maybe I'm wrong.

Generally, how can I benchmark both approaches and decide which one is best for me?
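
One rough way to get a number for the local option is to time a copy of a throwaway test file and scale the result up to 6GB. The following is only a sketch: SIZE_MB, TARGET_MB and WORKDIR are names made up for this example, it assumes GNU date (for nanosecond timestamps), and it assumes copy time grows linearly with file size.

```shell
#!/bin/sh
# Sketch: time a local copy of a scaled-down test file, then extrapolate.
# SIZE_MB, TARGET_MB and WORKDIR are hypothetical names chosen for this example.
# Assumes GNU date (for %N nanoseconds), as found on most Linux servers.
SIZE_MB="${SIZE_MB:-64}"         # test file size; use 6144 for a full-size run
TARGET_MB="${TARGET_MB:-6144}"   # size of the real file (6GB)
WORKDIR="${WORKDIR:-$(mktemp -d)}"

# Create the test file from random data so that filesystem or transport
# compression does not flatter the result.
dd if=/dev/urandom of="$WORKDIR/test.src" bs=1048576 count="$SIZE_MB" 2>/dev/null

# Time the copy; sync flushes cached writes to disk before the clock stops.
start_ns=$(date +%s%N)
cp "$WORKDIR/test.src" "$WORKDIR/test.dst"
sync
end_ns=$(date +%s%N)

elapsed_ms=$(( (end_ns - start_ns) / 1000000 ))
[ "$elapsed_ms" -gt 0 ] || elapsed_ms=1   # avoid a zero estimate on very fast disks

# Linear extrapolation to the real file size (assumption: time scales with size).
estimate_s=$(( elapsed_ms * TARGET_MB / SIZE_MB / 1000 ))
echo "copied ${SIZE_MB}MB in ${elapsed_ms}ms; estimated ${estimate_s}s for ${TARGET_MB}MB"

# Remove the throwaway files.
rm -f "$WORKDIR/test.src" "$WORKDIR/test.dst"
```

A test file much smaller than RAM is largely served from the page cache, so a full-size run (SIZE_MB=6144) on the same disk as the real file is more representative. The network option can be timed the same way by replacing the cp line with an scp of the test file to the destination server.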

Best Answer

Honestly, there are several ways to attack this. I have a few servers where I need to copy files that are in constant use, so I schedule a "maintenance window" of a few minutes in which I lock the files, create a snapshot (using dm-snapshot), and bring the server back online. I can then do whatever I need with the snapshotted copy of the file. This gives me a stable version of the file and brings the service back online with minimal downtime. Afterwards, I can remove the snapshot without affecting the live server. In your case, you can copy the 6GB file from the snapshot at your own pace, without worrying about the live version of the file changing during the copy.

http://tldp.org/HOWTO/LVM-HOWTO/snapshots_backup.html
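
The snapshot workflow described in this answer could look roughly like the following with LVM, which is what the linked HOWTO covers. This is a sketch under assumptions, not the exact procedure used on those servers: the volume group, volume, mount point, file name and destination host (vg0, data, data_snap, /mnt/snap, bigfile, backup.example.com) are all hypothetical, and the 1G snapshot size is a guess that must be large enough to hold whatever is written to the live volume while the snapshot exists. By default the script only prints the commands; set DRY_RUN=0 and run it as root to execute them.

```shell
#!/bin/sh
# Sketch of the snapshot workflow using LVM. All names below (vg0, data,
# data_snap, /mnt/snap, bigfile, backup.example.com) are hypothetical.
# DRY_RUN=1 (the default here) only prints the commands; they need root to run.
VG="${VG:-vg0}"
LV="${LV:-data}"
SNAP="${SNAP:-data_snap}"
MNT="${MNT:-/mnt/snap}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

snapshot_and_copy() {
    # 1. Inside the maintenance window: stop writers, then take the snapshot.
    run lvcreate --snapshot --size 1G --name "$SNAP" "/dev/$VG/$LV"
    # 2. Writers can resume here; the snapshot keeps a frozen view of the data.
    run mkdir -p "$MNT"
    run mount -o ro "/dev/$VG/$SNAP" "$MNT"
    # 3. Copy from the frozen view at leisure.
    run scp "$MNT/bigfile" "backup.example.com:/srv/incoming/"
    # 4. Clean up so the snapshot stops consuming space.
    run umount "$MNT"
    run lvremove -f "/dev/$VG/$SNAP"
}

snapshot_and_copy
```

With this approach the downtime is limited to the time needed to lock the file and create the snapshot, so the duration of the copy itself matters much less.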