The short answer is: Nobody can answer this question except you.
The long answer is that benchmarking your specific workload is something you need to undertake yourself, because asking anyone else is a bit like asking "How long is a piece of string?".
A simple one-page static website could be hosted on a Pentium Pro 150 and still serve thousands of impressions every day.
The basic approach you need to take to answer this question is to try it and see what happens. There are plenty of tools that you can use to artificially put your system under pressure to see where it buckles.
A brief overview of this is:
- Put your scenario in place
- Add monitoring
- Add traffic
- Evaluate results
- Remediate based on results
- Rinse, repeat until reasonably happy
Put your scenario in place
Basically, in order to test some load, you need something to test against, so set up an environment. This should match your production hardware as closely as possible; otherwise you will be left extrapolating your data.
Set up your servers, accounts, websites, bandwidth, etc. Even if you do this on VMs that's OK just as long as you're prepared to scale your results.
So, I'm going to set up a mid-powered virtual machine (two cores, 512 MB RAM, 4 GB HDD) and install my favourite load balancer, haproxy, inside Red Hat Linux on the VM.
I'm also going to have two web servers behind the load balancer that I'm going to use to stress test the load balancer. These two web servers are set up identically to my live systems.
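Just for illustration, here's a minimal sketch of what that haproxy configuration might look like; the backend IP addresses, ports and timeouts are placeholder assumptions, not tuned recommendations:

```
# Minimal haproxy.cfg sketch -- IPs, ports and timeouts are placeholders.
global
    maxconn 4096

defaults
    mode    http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend www
    bind *:80
    default_backend webfarm

backend webfarm
    balance roundrobin
    server web1 192.168.0.11:80 check
    server web2 192.168.0.12:80 check
```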
Add monitoring
You'll need some metrics to monitor, so I'm going to measure how many requests get through to my web servers, and how many requests I can squeeze through per second before users start getting a response time of over two seconds.
I'm also going to monitor RAM, CPU and disk usage on the haproxy instance to make sure that the load balancer can handle the connections.
How to do this depends a lot on your platforms and is outside of the scope of this answer. You might need to review web server log files, start performance counters, or rely on the reporting ability of your stress test tool.
A few things you always want to monitor:
- CPU usage
- RAM usage
- Disk usage
- Disk latency
- Network utilisation
You might also choose to look at SQL deadlocks, seek times, etc., depending on what you're specifically testing.
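On Linux, one low-effort way to capture most of the items above during a test run is the sysstat toolset; a rough sketch, where the five-second interval and log file names are arbitrary choices:

```sh
# Sample every 5 seconds for the duration of the test; intervals and
# file names are arbitrary. Kill the jobs once the test run is over.
sar -u 5 > cpu.log &        # CPU usage
sar -r 5 > ram.log &        # RAM usage
sar -n DEV 5 > net.log &    # network utilisation per interface
iostat -dx 5 > disk.log &   # disk utilisation and latency (await)
```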
Add traffic
This is where things get fun. Now you need to simulate a test load. There are plenty of tools that can do this, all with configurable options.
Choose a number, any number. Let's say you're going to see how the system responds with 10,000 hits a minute. It doesn't matter what number you choose because you're going to repeat this step many times, adjusting that number up or down to see how the system responds.
Ideally, you should distribute these 10,000 requests over multiple load testing clients/nodes so that a single client does not itself become the bottleneck. For example, JMeter's Remote Testing feature lets you drive several load-generating clients from a single controlling JMeter machine.
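As a trivial single-client illustration, a run with Apache Bench (ab) might look like this; the URL and numbers are placeholders:

```sh
# ~10,000 requests, 100 concurrent, against the load balancer (placeholder URL)
ab -n 10000 -c 100 http://loadbalancer.example.com/
```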
Press the magic Go button and watch your web servers melt down and crash.
Evaluate results
So, now you need to go back to the metrics you collected in step 2. You see that with 10,000 concurrent connections, your haproxy box is barely breaking a sweat, but the response time with two web servers is a touch over five seconds. That's not cool - remember, you're aiming for a response time of two seconds. So, we need to make some changes.
Remediate
Now you need to make your website more than twice as fast. So you know that you need to either scale up or scale out.
To scale up, get bigger web servers, more RAM, faster disks.
To scale out, get more servers.
Use your metrics from step 2, and testing, to make this decision. For example, if you saw that the disk latency was massive during the testing, you know you need to scale up and get faster hard drives.
If you saw that the processor was sitting at 100% during the test, perhaps you need to scale out to add additional web servers to reduce the pressure on the existing servers.
There's no generic right or wrong answer, there's only what's right for you. Try scaling up, and if that doesn't work, scale out instead. Or not, it's up to you and some thinking outside the box.
Let's say we're going to scale out. So I decide to clone my two web servers (they're VMs) and now I have four web servers.
Rinse, repeat
Start again from step 3. If you find that things aren't going as you expected (for example, we doubled the web servers, but the response times are still more than two seconds), then look into other bottlenecks. For example, you doubled the web servers, but still have a crappy database server. Or, you cloned more VMs, but because they're on the same physical host, you only achieved higher contention for that host's resources.
You can then use this procedure to test other parts of the system. Instead of hitting the load balancer, try hitting the web server directly, or the SQL server using an SQL benchmarking tool.
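For instance, if your database is MySQL, the bundled mysqlslap tool can generate synthetic load directly against it; the host, credentials and numbers below are placeholders:

```sh
# Placeholder host and credentials; concurrency and iterations are arbitrary.
mysqlslap --host=db1 --user=bench -p \
    --concurrency=50 --iterations=10 \
    --auto-generate-sql --auto-generate-sql-load-type=mixed
```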
Disk & RAM Capacity Planning
Planning disk and memory capacity for a database server is a black art. More is better. Faster is better.
As general guidelines I offer the following:
- You want more disk space than you'll EVER need.
Take your best estimate of how much disk space you'll need for the next 3-5 years, then double it.
- You'll want enough RAM to hold your database indexes in memory, handle your biggest query at least two times over, and still have enough room left over for a healthy OS disk cache.
Index size will depend on your database, and everything else depends heavily on your data set and query/database structure. I'll offer up "At least 2x the size of your largest table" as a suggestion, but note that this suggestion breaks down on really large data warehousing operations where the largest table can be tens or hundreds of gigabytes.
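If you happen to be on PostgreSQL, you can measure actual table and index sizes to feed into that estimate; the database and table names below are placeholders:

```sh
# 'mydb' and 'orders' are placeholders; run this against your real schema.
psql -d mydb -c "
  SELECT pg_size_pretty(pg_relation_size('orders'))       AS table_size,
         pg_size_pretty(pg_indexes_size('orders'))        AS index_size,
         pg_size_pretty(pg_total_relation_size('orders')) AS total_size;"
```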
Every database vendor has some instructions on performance tuning your disk/memory/OS kernel -- Spend some time with this documentation prior to deployment. It will help.
Workload Benchmarking and Capacity Planning
Assuming you haven't deployed yet…
Many database systems ship with benchmarking tools -- For example, PostgreSQL ships with pgBench.
These tools should be your first stop in benchmarking database performance. If possible you should run them on all new database servers to get a feel for "how much work" the database server can do.
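To illustrate, a basic pgBench run looks something like this; the scale factor, client count and duration are arbitrary choices:

```sh
# Initialise a test database at scale factor 50 (on the order of 750 MB),
# then run 10 clients on 2 threads for 60 seconds and report TPS.
pgbench -i -s 50 benchdb
pgbench -c 10 -j 2 -T 60 benchdb
```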
Armed now with a raw benchmark that is ABSOLUTELY MEANINGLESS, let's consider a more realistic approach to benchmarking: Load your database schema and write a program which populates it with dummy data, then run your application's queries against that data.
This benchmarks three important things:
1. The database server (hardware)
2. The database server (software)
3. Your database design, and how it interacts with (1) and (2) above.
Note that this requires a lot more effort than simple pre-built benchmarks like pgBench: You need to write some code to do the populating, and you may need to write some code to do the queries & report execution time.
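Here's a minimal sketch of that idea against PostgreSQL; the schema, row count and query are placeholders standing in for your real application's:

```sh
psql -d benchdb <<'SQL'
-- Placeholder schema standing in for your real one.
CREATE TABLE orders (id serial PRIMARY KEY, customer_id int, total numeric);

-- Populate it with a million rows of dummy data.
INSERT INTO orders (customer_id, total)
SELECT (random() * 10000)::int, random() * 500
FROM generate_series(1, 1000000);

-- Time a query representative of your application's workload.
\timing on
SELECT customer_id, sum(total)
FROM orders
GROUP BY customer_id
ORDER BY sum(total) DESC
LIMIT 10;
SQL
```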
This kind of testing is also substantially more accurate: Since you are working with your schema and queries you can see how they will perform, and it offers you the opportunity to profile and improve your database/queries.
The results of these benchmarks are an idealized view of your database. To be safe assume that you will only achieve 50-70% of this performance in your production environment (the rest being a cushion that will allow you to handle unexpected growth, hardware failures, workload changes, etc.).
It's too late! It's in production!
Once your systems are in production it's really too late to "benchmark" -- You can turn on query logging/timing briefly and see how long things take to execute, and you can run some "stress test" queries against large data sets during off hours. You can also look at the system's CPU, RAM and I/O (disk bandwidth) utilization to get an idea of how heavily loaded it is.
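For example, on PostgreSQL you can briefly log every statement slower than some threshold; the two-second cut-off here is an arbitrary example, and ALTER SYSTEM needs superuser rights:

```sh
# Log statements slower than 2000 ms, then reload the server configuration.
psql -d mydb -c "ALTER SYSTEM SET log_min_duration_statement = 2000;"
psql -d mydb -c "SELECT pg_reload_conf();"
```

Remember to reset it afterwards (ALTER SYSTEM RESET log_min_duration_statement;) so you don't leave the logging on forever.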
Unfortunately all these things will do is give you an idea of what the system is doing, and a vague concept of how close to saturation it is.
That brings us to…
Ongoing Monitoring
All the benchmarks in the world won't help you if your system is suddenly seeing new/different usage patterns.
For better or worse database deployments aren't static: Your developers will change things, your data set will grow (they never seem to shrink), and your users will somehow create insane combinations of events you never predicted in testing.
In order to do proper capacity planning for your database you will need to implement some kind of performance monitoring to alert you when database performance is no longer meeting your expectations. At that point you can consider remedial actions (new hardware, DB schema or query changes to optimize resource use, etc.).
Note: This is a very high level and generic guide to sizing your database hardware and figuring out how much abuse it can take. If you are still unsure about how to determine if a specific system meets your needs you should speak to a database expert.
There is also a Stack Exchange site specifically dedicated to database management: dba.stackexchange.com. Search their question archive or browse the tags specific to your database engine for further advice on performance tuning.
Best Answer
You actually need to test two things: where the application breaks under load, and how it scales.

For finding the breaking point you can use Apache Bench (ab) for very raw stress testing, or Selenium for more complex scenarios. Regarding scaling, you have to redo the previous tests with the application running on more powerful hardware (scale up), and then with your app running on multiple web servers (scale out). It's similar for the database backend, but that's a little trickier.

When doing this type of test it helps to have your machines monitored with tools like collectd/munin/Zenoss, and the slow query log enabled in MySQL, so you can easily pinpoint any bottlenecks. For scale-out testing you can use Amazon's EC2 - $240 for 100 servers for one hour, if I remember correctly.

The general idea when trying to profile a (web) application is to end up with a clear scaling path - i.e. where exactly you need more capacity, how you can provision it in the production environment, and when you should do it.
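As a hypothetical example of that monitoring point, enabling MySQL's slow query log at runtime looks something like this; the one-second threshold is an arbitrary choice:

```sh
# Requires the SUPER privilege; the 1-second threshold is an assumption.
mysql -u root -p -e "SET GLOBAL slow_query_log = 'ON'; SET GLOBAL long_query_time = 1;"
```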