SQL Server 2008 Replication & Sync Framework – Practical Limits

replication, sql-server-2008, synchronization

We are looking at hosting our client's data on our own system and using the MS Sync Framework to replicate data with a SQL Server instance on each client's site. Each client might have on average about 500MB of data and there could be up to 1,000 client sites, so we would have about 500GB of data overall. Data changes will need to be replicated in both directions.

Has anyone attempted a distributed system with this number of clients? Is there a practical limit?

Best Answer

I've been involved in a project that used the Sync Framework to synchronize remote sites running SQL Server 2008 Express with a central SQL Server - some sites were on a WAN, some on ADSL, and originally some were even on dial-up (though those were upgraded to 3G wireless modems).

We didn't scale up to 1,000 - we had only about 60-80 sites by the end of my time with that client. We found the Sync Framework quite reliable. The only real problem we had was some bizarre behaviour when we were using SQL Server's Change Tracking feature.

We'd noticed that occasionally we would "miss" data changesets, and we were a fair way into the rollout before we managed to reproduce the problem reliably and contact Microsoft PSS.

In the end, they confirmed that SQL Server's Change Tracking will sometimes report an update as a delete/insert pair, which completely confuses the Sync Framework. Faced with this, we resorted to rolling our own change tracking implementation using triggers. This proved beneficial both because a) it actually worked properly and b) it gave us more control over the logging and auditing of changes.
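For illustration, here's a minimal sketch of the trigger-based approach. The Customers table, ChangeLog table, trigger name, and connection string are all hypothetical (our actual schema was more involved), but the shape is the same: one log table plus an AFTER trigger per tracked table, so an update is always logged as an update.

```csharp
using System;
using System.Data.SqlClient;

// Hypothetical sketch: deploy a change-log table and an AFTER trigger
// for an assumed Customers(CustomerId INT PRIMARY KEY, ...) table.
// All object names and the connection string are illustrative.
class DeployChangeTracking
{
    const string CreateLogTable = @"
CREATE TABLE dbo.ChangeLog (
    ChangeId     BIGINT IDENTITY(1,1) PRIMARY KEY,
    TableName    SYSNAME   NOT NULL,
    RowKey       INT       NOT NULL,
    Operation    CHAR(1)   NOT NULL,              -- 'I', 'U' or 'D'
    ChangedAtUtc DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);";

    // Rows appearing in both 'inserted' and 'deleted' are genuine
    // updates, so an update can never be misreported as delete/insert.
    const string CreateTrigger = @"
CREATE TRIGGER dbo.trg_Customers_Track
ON dbo.Customers AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.ChangeLog (TableName, RowKey, Operation)
    SELECT 'Customers', i.CustomerId,
           CASE WHEN d.CustomerId IS NULL THEN 'I' ELSE 'U' END
    FROM inserted i
    LEFT JOIN deleted d ON d.CustomerId = i.CustomerId
    UNION ALL
    SELECT 'Customers', d.CustomerId, 'D'
    FROM deleted d
    WHERE NOT EXISTS (SELECT 1 FROM inserted i
                      WHERE i.CustomerId = d.CustomerId);
END;";

    static void Main()
    {
        using (var conn = new SqlConnection(
            "Server=.;Database=SyncDemo;Integrated Security=true"))
        {
            conn.Open();
            foreach (var sql in new[] { CreateLogTable, CreateTrigger })
                using (var cmd = new SqlCommand(sql, conn))
                    cmd.ExecuteNonQuery();
        }
    }
}
```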

This application was written using version 1.0 of the Sync Framework, which meant our team had to adapt a custom provider for the client side. v2.0 now comes with its own SqlSyncProvider, so that's one less thing to worry about.
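As a rough illustration of how much v2.0 simplifies the client side, here's a minimal two-way sync pass using SqlSyncProvider. It assumes both databases have already been provisioned for a sync scope; the scope name, connection strings, and sync direction are just examples.

```csharp
using System;
using System.Data.SqlClient;
using Microsoft.Synchronization;
using Microsoft.Synchronization.Data.SqlServer;

// Minimal sketch of a two-way sync with the v2.0 SqlSyncProvider.
// Assumes both databases are already provisioned for the hypothetical
// scope "ClientScope"; server/database names are illustrative.
class SyncRunner
{
    static void Main()
    {
        using (var serverConn = new SqlConnection(
            "Server=central;Database=Main;Integrated Security=true"))
        using (var clientConn = new SqlConnection(
            @"Server=.\SQLEXPRESS;Database=Local;Integrated Security=true"))
        {
            var orchestrator = new SyncOrchestrator
            {
                LocalProvider  = new SqlSyncProvider("ClientScope", clientConn),
                RemoteProvider = new SqlSyncProvider("ClientScope", serverConn),
                // Push local changes first, then pull the server's.
                Direction = SyncDirectionOrder.UploadAndDownload
            };

            SyncOperationStatistics stats = orchestrator.Synchronize();
            Console.WriteLine("Uploaded {0}, downloaded {1} changes.",
                stats.UploadChangesTotal, stats.DownloadChangesTotal);
        }
    }
}
```

UploadAndDownload pushes the client's changes before pulling the server's, which tends to suit a hub-and-spoke layout like the one described in the question.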

Another thing to consider is whether to use direct SQL connections or to utilise the WCF provider support. WCF is probably the way to go if your client sites access the main server over the Internet - you can then add security/encryption as required. You could even optimise the WCF transport layer with some kind of compression (WCF doesn't compress out of the box, so that typically means a custom message encoder).
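For example - and this is generic WCF rather than anything Sync Framework-specific - an Internet-facing endpoint might use transport (HTTPS) security with raised message-size limits, since change batches can get large. The address and size limits below are assumptions, not recommendations.

```csharp
using System;
using System.ServiceModel;

// Illustrative only: an HTTPS-secured binding for a sync endpoint.
class SecureSyncBinding
{
    static void Main()
    {
        var binding = new WSHttpBinding(SecurityMode.Transport)
        {
            // Change batches can run to megabytes, so raise the
            // default 64 KB received-message size limit.
            MaxReceivedMessageSize = 10 * 1024 * 1024
        };
        binding.ReaderQuotas.MaxArrayLength = 10 * 1024 * 1024;

        var endpoint = new EndpointAddress(
            "https://sync.example.com/SyncService");
        Console.WriteLine("Endpoint: {0}", endpoint.Uri);
    }
}
```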

It's probably worth using something like Wireshark to compare network usage if you're heading down that path, too. Better to have an idea up front of what your bandwidth requirements are going to be!

-dave