Sql-server – Best way to scale up an IIS/SQL Server application

iisscalingsql server

We have an application that we developed in ASP and SQL Server. We use Rackspace to host it. Each of our "customers" has their own IIS site and SQL database. Each customer could have a dozen or several hundred users hitting the application.

We now have several hundred customers. What we've done up until now is add another pair of servers (one IIS, one SQL) when we need to for performance reasons. We're up to four pairs of servers now.

We're continuing to add new customers, and we're looking at alternative ways to scale up without simply adding pairs of servers.

One approach we're considering is to have a web farm for the IIS side, and a SQL Server cluster and SAN for the database side.

Is this a good approach? We have maybe 400 customers now, each with their own IIS site and SQL database. What if we grow to 1,000? 5,000? Can one SQL cluster handle many thousands of databases?

Each database has several hundred tables. The main table for some of our bigger customers might have several hundred thousand records.

One attraction towards moving towards a big IIS web farm and a big SQL cluster is that if one server goes down, other servers could carry the load. Right now if one of our SQL Servers goes down, or one of our IIS servers goes down, one fourth of our customers cannot use the system. So we want to increase our reliability as well as increasing our capacity for adding new customers.

Best Answer

A SQL Server cluster is not a scalability solution, is strictly a high-availability solution. In a SQL Server cluster only one node is active and handles load, all the other nodes are just passive stand-by, waiting in case the active node has a hardware failure. As such froma SQL Server cluster you obtain, at best, the performance of a single SQL Server instance. You'll find sometimes references to a so called 'Active-Active' deployment, but that is not a SQL Server cluster. The so called 'Active-Active' is two separate clusters that use each other's node in reverse roles (one cluster in active on node A and passive on B, the other is active B and passive on A).

If your application is already partitioned and each customer has a separate database, then you'd be much better continuing to scale out. The problems of administration and maintenance can be addressed, once you automate the administration of 4 servers (ie. now) you can automate the administration of 1000 servers. So are the problem of versionning, deployment and account provisioning.

While right now you may be attracted by a perceived ease of administration, troubleshoot and tunning from a scale up solution, you are going to have new problems, more difficult to solve, culminating with a limit on how high can you scale up.

Updated After you edited the OP with explanation on concerns related to scale out.

Indeed scale up and clustering offer a high availability solution. There are basically two HA solutions for SQL Server: clustering and database mirroring. Since you plan for thousands of individual databases, this pretty much rules out mirroring. What you have to keep in mind though is that SQL Server clusters are supported only on a restricted list of hardware combinations:

Microsoft server clusters are only supported on cluster solutions that are listed in the Windows Server Catalog under Cluster Solutions. To view the Windows Server Catalog, visit the following Microsoft Web site: http://www.windowsservercatalog.com/

That doesn't mean you cannot run a cluster on any hardware pair/group that has a shared SCSI buss. It means you won't be able to request support from Microsoft CSS if you have an issue on a hardware that is not approved.

SQL Server clustering offers protection against hardware failures, except media failure (since the media is shared between nodes). What it doesn't offer protection from is human administration errors, which unfortunately are the primary reason for down-time. With one thousand customer databases on a single cluster, you'll have a very large omelette cooking in one basket...

As for you question about how many databases can a cluster run, ultimately a cluster has only one active instance at any time, so the limits of a single instance apply to a cluster as well. Having one thousand databases attached is not a big thing, the real question is the load, how many users will be running queries simultaneously. To get a ballpark region of numbers, consider that the TPC-C benchmarks published for SQL Server are for some 1.2 mil transactions per minute on a 64 way Superdome, which means roughly 16k transactions per second. You anticipate about 5k clients connected, that gives you a margin of 3 queries per client per second, and to reach that you need a pretty beeffy hardware and an exceptionally well tunned application.