My guess is that you have some long-running queries in your application. When they execute, they keep connections checked out of the pool far longer than the usual usage pattern. That exhausts the pool, which grows and keeps growing up to its maximum, at which point any remaining workers block waiting for a connection to be released.
The first thing is to track down when this happens: is it a cyclical event, or random? If it's the former, you're in luck, as you can be ready when it happens. If you can't determine a pattern, then you'll have to be vigilant.
You may be able to figure this out by looking at your website monitoring logs, or at sar output from your database server, to see if there are any correlated spikes.
If you can catch your database while it's under load, execute the following commands on the MySQL server:
SHOW ENGINE INNODB STATUS;
SHOW PROCESSLIST;
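If the truncated output isn't enough, SHOW FULL PROCESSLIST prints the complete statement, and on MySQL 5.1+ the same data is queryable through information_schema, so you can filter and sort it. A sketch:

```sql
-- MySQL 5.1+: same data as SHOW PROCESSLIST, but filterable.
-- List non-idle connections, longest-running first.
SELECT id, user, host, db, time, state, info
FROM information_schema.processlist
WHERE command <> 'Sleep'
ORDER BY time DESC;
```

Anything sitting near the top with a large `time` and a state like "Copying to tmp table" or "Locked" is a prime suspect.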
The former prints diagnostic information about the InnoDB engine (you are using InnoDB, right?); the latter prints the first few hundred characters of each query currently executing. Look for queries that have been running for a long time, queries generating temporary tables on disk, and queries that are blocked waiting on a resource.
After that, the hard work begins. Use EXPLAIN to estimate the cost of each query and the resources it uses. Avoid queries that require sorting on disk via a temporary table. Look for long-running reporting jobs, or other scheduled maintenance tasks that periodically lock or saturate your database. It could be something as simple as the backup task, or a job that rolls up old purchase order data.
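For example (the table and column names here are made up for illustration):

```sql
-- Run EXPLAIN on a suspect query. Look for "Using temporary; Using filesort"
-- in the Extra column, and for large estimates in the rows column --
-- those are the queries that spill to disk under load.
EXPLAIN
SELECT customer_id, SUM(total)
FROM purchase_orders
WHERE created_at < '2010-01-01'
GROUP BY customer_id
ORDER BY SUM(total) DESC;
```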
I recommend having these three settings in your /etc/my.cnf:

log_slow_queries
log-queries-not-using-indexes
long_query_time = 1

(The old set-variable = syntax is obsolete; and from MySQL 5.1 on, the slow query log is enabled with slow_query_log rather than log_slow_queries — check the manual for your version.)
For a web application doing 20-30 requests per second, you can't afford to have anything show up in these logs.
By the way, IMHO it's pointless to increase your connection pool's maximum size: that only delays the onset of pool exhaustion by a few seconds at best, and it puts more pressure on your database right when it doesn't need it.
When your application platform and your database are competing for resources, that's usually the first indication that you're ready for a dedicated database server.
Second, high availability: a dedicated database server is the first step toward a database cluster (and usually, in turn, a load-balanced web/application server cluster).
I would also say security plays a large role in the move to separate servers, since you can apply a different network-access policy to each one (for example, a DMZ'ed web server with the database server on the LAN).
Access to the database server is then over the network: where you previously specified "localhost" as your database host, you'd specify the hostname/IP address of the database server. Note: you usually need to modify the database server's configuration to permit connections on an interface other than the loopback interface.
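For MySQL, that change is a sketch like the following (the IP address is illustrative; use your database server's LAN address):

```ini
# /etc/my.cnf on the database server -- illustrative values
[mysqld]
# Listen on the LAN interface instead of only 127.0.0.1.
# Also make sure skip-networking is NOT set, or TCP connections
# are disabled entirely.
bind-address = 192.168.1.10
```

You'll also need to grant your application's database user access from the web server's host rather than from localhost, and open the MySQL port (3306 by default) in any firewall between the two machines.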
Yes. MySQL has magical logs: the error log, the general query log, and the slow query log, for example.
You'll need to check your my.cnf to see which logs are enabled and where they are located. You'll also need to check your version of MySQL to find out what configuration options exist for each log, and how to enable them if they aren't already. Once you do, you're just a less, a grep, and maybe a sed or two away from finding out what took your server down.
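To make that concrete, here's a sketch of the kind of grep you'd run against a slow query log. The log below is a hand-written sample in the shape of MySQL's slow-log format, and the paths are assumptions; point the grep at whatever location your my.cnf actually names.

```shell
#!/bin/sh
# Write a small sample in the shape of a MySQL slow query log.
# (In real life you'd skip this step and read the log my.cnf points at.)
cat > /tmp/mysql-slow.sample <<'EOF'
# Time: 101123 14:02:11
# User@Host: app[app] @ web1 [10.0.0.5]
# Query_time: 12.4  Lock_time: 0.0  Rows_sent: 1  Rows_examined: 1048576
SELECT COUNT(*) FROM orders WHERE status = 'open';
# Time: 101123 14:05:40
# User@Host: app[app] @ web1 [10.0.0.5]
# Query_time: 0.9  Lock_time: 0.0  Rows_sent: 20  Rows_examined: 400
SELECT * FROM customers LIMIT 20;
EOF

# Pull out only the entries that took 10 seconds or more,
# with the two header lines before and the query line after.
grep -B 2 -A 1 'Query_time: 1[0-9]' /tmp/mysql-slow.sample
```

Here only the 12.4-second query survives the filter; the 0.9-second one is noise and drops out, which is exactly the triage you want when the log is thousands of lines long.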