DFS-R or WAFS for Syncing 1TB of Files Between Remote Sites

dfs, dfs-r, remote, replication, wide-area-network

We have 1TB of files that we wish to keep in sync between three locations.
This data consists of around 1.5 million files and grows or changes by an average of 100MB/day.

The first location is a Queensland office with a dedicated ADSL2+ line syncing at 20000/1024kbps, connected to a co-located server in a data centre on a 10MB link.
The remote site is an office in Argentina which is on a measly SDSL 512/512 link. This could possibly be upgraded to a maximum of 2Mb SHDSL.

The structure we want is:
QLD Office <–> QLD Datacentre (For Offsite Backup Only) <–> Argentina Office

I ran some speed tests using speedtest.net from the QLD office and the Argentina office:

Upload speed from QLD office: 0.95Mb/s

Upload speed from Argentina office: 0.51Mb/s to local server, 0.38Mb/s to QLD Server 🙁
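
As a rough sanity check on what these speeds mean for the data volumes above, here is a minimal Python sketch. It assumes decimal units (1 MB = 10^6 bytes, 1 Mb/s = 10^6 bits/s), a link fully dedicated to replication, and no compression or protocol overhead. None of those assumptions were measured, so treat the results as ballpark figures only.

```python
# Rough transfer-time estimate. Assumptions (not measured values):
# decimal units, the whole link is available to replication, and there is
# no compression or protocol overhead.

MB = 10**6
TB = 10**12


def transfer_seconds(size_bytes: float, link_mbps: float) -> float:
    """Seconds needed to push size_bytes over a link of link_mbps megabits/s."""
    return size_bytes * 8 / (link_mbps * 1_000_000)


DAILY_CHANGE = 100 * MB  # average daily growth/change quoted above
FULL_DATASET = 1 * TB    # total data set quoted above

# Upload speeds from the speed tests above, in Mb/s.
links = {
    "QLD office upload": 0.95,
    "Argentina to QLD server": 0.38,
}

for name, mbps in links.items():
    daily_min = transfer_seconds(DAILY_CHANGE, mbps) / 60
    full_days = transfer_seconds(FULL_DATASET, mbps) / 86_400
    print(f"{name}: daily change ~{daily_min:.0f} min, full copy ~{full_days:.0f} days")
```

Under these assumptions the daily changes fit comfortably on either link (well under an hour), while a full copy of the data set over the Argentina link would take on the order of eight months. That gap is why shipping a drive for the initial population matters so much here.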

We can’t use standard sync software (rsync, ViceVersa, etc.) to keep this data in sync because the time zone differences make it really complex, so we are considering a distributed file system solution.

We are trying to work out what is more viable – using Microsoft DFS-R with Windows 2008 servers or a third party WAFS provider such as GlobalScape WAFS.

I’m trying to find a solution that will do the following:

  1. The ability to initially populate the files by posting an external HDD to the remote sites.
  2. The ability to manually populate after it is set up – i.e. if we need to add 200GB of files we can post the drive again with this data and add it at all sites manually
  3. The ability to account properly for the large time zone difference of the remote sites
  4. Offline availability of files – i.e. if internet connections are down we can still view/edit files
  5. Either delta sync compression or file compression – a lot of the files we work on are large ASCII files which compress at a high ratio, so compression would be nice.
  6. File locking would be nice but is not essential since the files are in different time zones
  7. The ability to do all of the above with the slow internet links as described.

I have looked at GlobalScape WAFS which looks promising because unlike most DFS you can point it at where the files are located instead of copying them into the share, but with a quote of $10kUSD for 3 agents for the software alone I was wondering if there are any better solutions.

Microsoft DFS-R also looks good, but I can’t find much information on whether it would handle this many files over a very slow link.

Any suggestions or pointers in the right direction would be much appreciated.

Best Answer

We currently use Server 2008 DFSR to transfer 900 GB of files, with about 3 GB changing daily. Our topology is a single hub with 3 spokes. Each spoke is on a 4Mb/1Mb ADSL connection, and the spokes are roughly 300-500 km apart. Our hub site has a 10Mb/10Mb connection.

Other than the lack of file locking, after some initial configuration problems DFSR has been running smoothly, and we are very pleased with it. I would highly recommend using Server 2008 or Server 2008 R2 for DFSR, as there are MANY improvements that will help with your slow links.

In answer to your questions:

  1. You can pre-seed the data using an external hard drive at the remote site, to reduce initial replication.
  2. I'm pretty sure you can't add additional information in a pre-seed fashion (your 200 GB example), as once initial replication is done it becomes a multi-master topology.
  3. Time zone shouldn't have any effect, especially if you aren't modifying the default replication schedules.
  4. Offline Access - A local copy is stored in each spoke server, so you will have access to it if the WAN is down. Once the WAN comes back up, replication will continue.
  5. DFSR uses RDC (Remote Differential Compression) and only replicates changes, so you will see a large reduction in transfer sizes. Our current replication reports 57.88% savings, with 74.36 GB received for an actual size of 176.55 GB. This is since the last service restart.
  6. File locking is not supported in DFSR; however, conflicts can be monitored through the event logs.
  7. While not ideal, it should work with your slow links, as we have similar links.
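
To show how the savings figure in point 5 follows from the two sizes reported there, here is a minimal Python sketch. The function name is made up for illustration. The projection onto your 100 MB/day is only a guess that assumes your data would see the same ratio as ours; your highly compressible ASCII files could do better or worse.

```python
def rdc_savings(received_gb: float, actual_gb: float) -> float:
    """Fraction of bandwidth saved: 1 - (data sent on the wire / real size)."""
    return 1 - received_gb / actual_gb


# Figures from point 5 above.
savings = rdc_savings(received_gb=74.36, actual_gb=176.55)
print(f"Savings: {savings:.2%}")

# Hypothetical projection: ASSUMES the same ratio applies to the asker's
# data, which has not been measured.
daily_change_mb = 100
on_wire_mb = daily_change_mb * (1 - savings)
print(f"Estimated on-wire traffic: ~{on_wire_mb:.0f} MB/day")
```

If the ratio did carry over, the daily changes would shrink to roughly 42 MB on the wire, which is a modest load even for a 512/512 link.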

I would not recommend Globalscape WAFS, based on the last comment (mine) in this blog post: http://blogs.technet.com/b/askds/archive/2009/02/20/understanding-the-lack-of-distributed-file-locking-in-dfsr.aspx?CommentPosted=true&PageIndex=2#comments Perhaps the product has changed since then, but it's only been a few months.