Transfer a large number of small files

file-transfer linux-networking vmware-vsphere

I have to migrate two servers with large SAN attachments to our new VMware environment.

EDIT: I need to add some details, since I have received good answers regarding the VMware solution.

I can't attach the old EMC LUN to the new system because of technology limitations on the server.

I can't use VMware Converter to clone the missing volumes to my new VM, because VMware Converter can't see EMC PowerPath pseudo-devices, and the previous admin built LVM2 and/or ASM volumes on top of those pseudo-devices.

Those two physical servers are attached to an old EMC² CX-340 SAN and hold 5TB of data.

Those 5TB of data consist of small PDF files, and I need to transfer them to the new machine over our 1Gbit/s LAN.

I've tried using rsync, but it's really too slow and has a heavy impact on RAM and CPU.

I've tried using nc with tar, but the transfer rate is still quite slow: I get an average throughput of about 50MB/s on a 1Gbit/s link with almost no other traffic.

Could you give me some advice, or share your experience with this kind of migration and how you managed to finish it correctly within a reasonable amount of time?

Best Answer

If you really need a quick way to transfer files, and both systems are Linux-based, you can try UDR.

This is really a form of rsync-over-UDP (using the open-source UDT framework) and is particularly handy for moving large numbers of files or transferring over high-bandwidth or high-latency links. In addition, encryption is disabled by default, so the RAM/CPU hit is lower than traditional rsync. SSH is not involved either.

I can easily get wire-speed transfers over a 1Gbps link with 10 million small TIFF files in a directory tree.
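To put the numbers from the question in perspective, here is a rough back-of-the-envelope estimate of how long 5TB takes at the 50MB/s seen with nc and tar versus the theoretical ceiling of a 1Gbit/s link. This is only a sketch: `estimate_hours` is a made-up helper name, sizes are decimal (1TB = 1,000,000MB), 125MB/s ignores protocol overhead, and the result is truncated to whole hours.

```shell
#!/bin/sh
# Rough transfer-time estimate (hypothetical helper, not part of UDR).
# $1 = data size in decimal TB, $2 = sustained throughput in MB/s.
# Integer arithmetic only, so the result is truncated to whole hours.
estimate_hours() {
    size_tb=$1
    rate_mbs=$2
    seconds=$(( size_tb * 1000000 / rate_mbs ))
    echo $(( seconds / 3600 ))
}

echo "5TB at 50MB/s (nc + tar):          about $(estimate_hours 5 50) hours"   # 27
echo "5TB at 125MB/s (1Gbit/s ceiling):  about $(estimate_hours 5 125) hours"  # 11
```

So even at full line rate the copy takes the better part of half a day, and closing the gap between 50MB/s and wire speed saves well over half of the total time.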

The syntax differs slightly from plain rsync: all rsync flags must appear before the source/destination specification:

udr rsync -avP --stats --delete /data/ server2:/data/
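Because that command includes --delete, it may be worth previewing it first. The sketch below only prints the two command lines rather than running them. It assumes that udr hands every flag after the word rsync through to rsync unchanged, so rsync's standard --dry-run option applies. `build_udr_cmd`, `SRC` and `DEST` are illustrative names, not part of UDR.

```shell
#!/bin/sh
# Hypothetical helper: prints the udr command line, with or without a dry run.
# Paths are the ones from the example above; adjust to your environment.
SRC=/data/
DEST=server2:/data/

build_udr_cmd() {
    # $1 = "dry" adds rsync's --dry-run, which lists what would be
    # copied or deleted without changing anything on the destination.
    if [ "$1" = "dry" ]; then
        echo "udr rsync -avP --stats --delete --dry-run $SRC $DEST"
    else
        echo "udr rsync -avP --stats --delete $SRC $DEST"
    fi
}

build_udr_cmd dry   # review the output of this one first
build_udr_cmd       # then run the real transfer
```

Note the trailing slash on both paths: with rsync semantics, /data/ means "the contents of /data", which matters when --delete is removing anything on the destination that is not in the source.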

Easy to build... You'll need g++ and openssl-devel:

git clone https://github.com/LabAdvComp/UDR.git
cd UDR/
make
cp src/udr /usr/local/bin/

Do that on both the source and the destination host.


See: Possibility of WAN Optimization for SSH traffic