Mysql – How to find out what is causing a slow down of the application on this server

apache-2.2MySQLperformance

This is not the typical serverfault question, but I'm out of ideas and don't know where else to go. If there are better places to ask this, just point me there in the comments. Thanks.


Situation

We have this web application that uses Zend Framework, so runs in PHP on an Apache web server. We use MySQL for data storage and memcached for object caching.

The application has a very unique usage and load pattern. It is a mobile web application where every full hour a cronjob looks through the database for users that have some information waiting or action to do and sends this information to a (external) notification server, that pushes these notifications to them. After the users get these notifications, the go to the app and use it, mostly for a very short time. An hour later, same thing happens.

Problem

In the last few weeks usage of the application really started to grow. In the last few days we encountered very high load and doubling of application response times during and after the sending of these notifications (so basically every hour). The server doesn't crash or stop responding to requests, it just gets slower and slower and often takes 20 minutes to recover – until the same thing starts again at the full hour.

We have extensive monitoring in place (New Relic, collectd) but I can't figure out what's wrong; I can't find the bottlekneck. That's where you come in:

Can you help me figure out what's wrong and maybe how to fix it?


Additional information

The server is a 16 core Intel Xeon (8 cores with hyperthreading, I think) and 12GB RAM running Ubuntu 10.04 (Linux 3.2.4-20120307 x86_64). Apache is 2.2.x and PHP is Version 5.3.2-1ubuntu4.11.

If any configuration information would help analyze the problem, just comment and I will add it.

Graphs

info

collectd

New Relic

(Sorry the graphs are gifs and not the same time period, but I think the most important info is in there)

Best Answer

Hello Unfortunately It is difficult to find an immediate solution to your problem especially without a sys admin. This said I think that you can at least solve (on the long run) the memory hungry apache -as it seems from your relic report - with a nginx + apache + php to scale up the speed of your app considerably see http://www.richweb.com/nginx or simply search on google /ask your sys admin. As you use Zend framework you might also consider Zend Server http://www.zend.com/en/products/server/

Sorry if these ideas seems a bit generic and do not solve your immediate problem but, over the long term, this could provide you with a good solution.

Another suggestion is that, if you application tends to scale in the future, you might consider testing the same setup you have on amazon EC2 http://aws.amazon.com/ec2/ This would give you several benefits like:

  1. Scalability: you can run one main instance and clone it to a second instance to run anytime there is a need (e.g. something is slow and you want to see if it is the same on the second instance) or better with two instances and a load balancing.
  2. Cloning: with a traditional server it is fairly complicate to clone everything and be up and running in minutes. With EC2 you have this advantage;
  3. Simplicity: with the new EC2 web based interface you will not need your sys admin to launch a new instance and test it with a new IP (that can be assigned on the fly).

This might sound as a random suggestion not addressing your problem but, from personal experience, sometimes having the possibility to scale becomes essential as your business grow.

For immediate assistance in your case only a sysadmin can help you. If you want to hire one you might consider https://www.odesk.com but again you need to select a reliable one with very good feedback. If you want just a consultancy without having the admin to enter your server I am sure that you can find several ones available at very reasonable rates (20/30$ an hour) to give your some feedback.

Related Topic