You are running a log-heavy database on just about the slowest disks out there. The reason you're seeing jbd2 at such high iowait is that it's having a hard time keeping up with writing the ext4 journal to disk.
Either change disks to something that can keep up (SSDs come to mind, although regular SAS disks behind a battery-backed cache would also work) and/or start tuning. There is plenty of documentation out there on filesystem tuning in SQL environments.
A good starting point is here: http://dev.mysql.com/doc/refman/5.0/en/disk-issues.html
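As one illustrative example of the kind of tuning involved (the device, mount point, and values below are assumptions, not something to copy blindly):

# /etc/fstab -- hypothetical entry for a dedicated database volume.
# noatime avoids a journaled metadata write on every file read;
# commit=15 flushes the journal every 15s instead of the default 5s,
# trading a larger window of unflushed data for fewer journal writes.
/dev/sdb1  /var/lib/mysql  ext4  defaults,noatime,commit=15  0  2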
I think you have a query that is locking a table (or a set of rows) that other MySQL connections are trying to update, and holding that lock longer than it should. When that happens, all the incoming requests stack up behind it until you hit MySQL's max_connections limit.
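A quick way to confirm this while the site is stalled is to look at what MySQL is actually doing; one long-running statement with a pile of threads stuck in a Locked/Waiting state behind it is the smoking gun:

mysql -e 'SHOW FULL PROCESSLIST\G'
# For InnoDB row locks, the TRANSACTIONS section shows which
# transaction holds the lock and which ones are waiting on it:
mysql -e 'SHOW ENGINE INNODB STATUS\G'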
The same thing is happening on the Apache side: requests come in but never receive a response, because the queries behind them are blocked on the database. PHP has an open connection to the database; it has made a query and has yet to receive a response. Apache 'hanging' at that point is exactly what you would expect, since it's waiting for an answer.
Apache appears to be hung from the outside (your browser/mobile app/etc) because all of the children available across all your servers are stuck waiting on a database reply. There are literally no more connections available. (It could also be a connection limit set on your load balancer.) If you aren't already, start logging state changes on your load balancer. You will likely see each of your web servers going down and coming back up repeatedly while the 'thundering herd' problem (explained later) is occurring.
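If the balancer is HAProxy (an assumption on my part; most balancers log something equivalent, and the log path varies by distro), those state changes look like this in its log:

grep -E 'Server .* is (UP|DOWN)' /var/log/haproxy.log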
I believe your connections in CLOSE_WAIT are a symptom, not the problem. I wouldn't spend any time troubleshooting that angle until you've taken care of the more obvious candidate (the database). Odds are that once you fix it, the huge number of CLOSE_WAITs will go away.
To start troubleshooting on the database side, enable the slow query log if you haven't already done so. Have it log queries that take longer than a second or so, and see what shows up when the problem occurs.
Note: the slow query log does not record a query until it has completed, so do not assume the first query that shows up when the problem starts is the problem query. It may or may not be.
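A minimal sketch of enabling it in my.cnf (option names are for MySQL 5.1 and later; older versions use log-slow-queries instead, and the log path here is an assumption):

[mysqld]
# Log any statement that takes longer than 1 second.
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time     = 1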
Now, you might expect the website to return to normal once the problematic query blocking other queries has finished...
Not so. Say 500 requests/sec come in regularly, you can handle 1,000 requests/sec total, and a query locks up the database for 10 seconds. There are now 5,000 requests WAITING to be handled, in addition to the 500/sec still coming in, and with only 500/sec of spare capacity it takes another 10 seconds just to drain that backlog. This is known as the thundering herd problem.
Your problem could be something else entirely, but these are exactly the symptoms of a problem I have dealt with many times, and in most of those cases the cause was a database query blocking other queries. The only time I've run into these symptoms that was not down to the database was on CentOS 6 (RHEL 6 has the problem too). Unfortunately, Red Hat's knowledge base article discussing it is available to subscribers only, but there are other references around if you search for them. If you think this might be the case, it's mind-numbingly easy to test: you simply need to add a single line to your resolv.conf.
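Assuming the problem being alluded to is glibc's parallel A/AAAA DNS lookups stalling the resolver (an assumption on my part, since the article is paywalled; verify against your own references), the test line looks like:

# /etc/resolv.conf -- assumption: the glibc parallel-lookup issue.
# single-request serializes the A and AAAA queries so a dropped
# response no longer stalls every name lookup.
options single-request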
If the problem tends to show up at (or close to) the same time of day, check your cron jobs (or anything else running on a set schedule) to see if the problem query is coming from one of them.
Finally, if you do determine that you are being bitten by the thundering herd problem, I'd suggest setting limits on your load balancer: benchmark a server to find roughly the maximum number of requests it can handle simultaneously, and stop the load balancer from exceeding that number of connections to each back-end web server.
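As a sketch of what that looks like in HAProxy (the backend name, addresses, and limit are made up; derive the real maxconn value from your benchmark):

backend web
    # Requests beyond maxconn queue in the balancer instead of
    # piling onto an already-struggling web server.
    server web1 10.0.0.11:80 check maxconn 200
    server web2 10.0.0.12:80 check maxconn 200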
Good luck.
Check under /var/lib/mysql for .err files.
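For example (assuming the default data directory):

ls -l /var/lib/mysql/*.err
# MyISAM corruption shows up as messages like "is marked as crashed":
grep -i crashed /var/lib/mysql/*.err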
Turn off MySQL and run the repair commands:

Identify all corrupted tables using myisamchk:

myisamchk /var/lib/mysql/bugs/*.MYI >> /tmp/myisamchk_log.txt

Repair the corrupted table using myisamchk (run from the database directory so the relative path resolves):

myisamchk -r profiles.MYI

Perform a check and repair together for the entire database:

myisamchk --silent --force --fast --update-state /var/lib/mysql/bugs/*.MYI
We need more information to help you properly. Ideally, you would want to hire a DB admin to avoid data loss or further corruption.
You might have a bad drive, in which case you would also need to run fsck. You might have bad RAM.
Honestly, there are so many things that could be causing this. I would hire an admin.