I own a website that has been running on a VPS since last week. From Monday to Saturday, everything runs smoothly: the website has around 4,500 unique visitors a day, and the load average and response time are fine.
On Sundays, the website has around 11,000 unique visitors, because we offer unique and exclusive content on that day. The content is stored in a MySQL database, which runs on a separate VPS and uses the InnoDB engine. This is where things go wrong: as the number of visitors increases, the load average climbs to extreme levels, to the point where the website becomes unreachable.
Here is the load alert I received, followed by the top output:
This is an automated message notifying you that the 5 minute load average on your system is 238.37.
This has exceeded the 10 threshold.
One Minute - 237.31
Five Minutes - 238.37
Fifteen Minutes - 231.1
top - 16:41:12 up 5 days, 18:51, 1 user, load average: 238.68, 238.62, 231.25
Tasks: 517 total, 246 running, 271 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.8%us, 0.3%sy, 0.0%ni, 97.6%id, 0.0%wa, 0.0%hi, 0.1%si, 0.2%st
Mem: 3922920k total, 3542968k used, 379952k free, 2736k buffers
Swap: 1048564k total, 105316k used, 943248k free, 142772k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14395 apache 20 0 313m 13m 4044 R 2.8 0.4 0:09.81 /usr/sbin/httpd -k start -DSSL
13405 apache 20 0 314m 15m 4432 R 2.3 0.4 0:17.87 /usr/sbin/httpd -k start -DSSL
15865 apache 20 0 312m 13m 4176 R 2.3 0.4 0:01.28 /usr/sbin/httpd -k start -DSSL
15930 apache 20 0 310m 11m 4060 R 2.3 0.3 0:00.88 /usr/sbin/httpd -k start -DSSL
15978 apache 20 0 310m 11m 4048 R 2.3 0.3 0:01.08 /usr/sbin/httpd -k start -DSSL
16041 apache 20 0 309m 10m 4052 R 2.1 0.3 0:00.58 /usr/sbin/httpd -k start -DSSL
16082 apache 20 0 211m 4192 2276 R 1.9 0.1 0:00.09 /usr/sbin/httpd -k start -DSSL
14298 apache 20 0 310m 11m 4044 R 0.6 0.3 0:09.56 /usr/sbin/httpd -k start -DSSL
14457 apache 20 0 311m 11m 4068 R 0.6 0.3 0:10.18 /usr/sbin/httpd -k start -DSSL
14486 apache 20 0 310m 11m 4464 R 0.6 0.3 0:06.13 /usr/sbin/httpd -k start -DSSL
15287 apache 20 0 313m 14m 4048 R 0.6 0.4 0:05.21 /usr/sbin/httpd -k start -DSSL
15363 apache 20 0 310m 11m 4064 R 0.6 0.3 0:04.13 /usr/sbin/httpd -k start -DSSL
15400 apache 20 0 313m 13m 4048 R 0.6 0.4 0:04.09 /usr/sbin/httpd -k start -DSSL
15404 apache 20 0 310m 11m 4056 R 0.6 0.3 0:04.22 /usr/sbin/httpd -k start -DSSL
15649 apache 20 0 313m 14m 4432 R 0.6 0.4 0:02.88 /usr/sbin/httpd -k start -DSSL
15675 apache 20 0 310m 10m 4044 S 0.6 0.3 0:02.22 /usr/sbin/httpd -k start -DSSL
15692 apache 20 0 310m 11m 4084 R 0.6 0.3 0:01.46 /usr/sbin/httpd -k start -DSSL
15702 apache 20 0 311m 12m 4044 R 0.6 0.3 0:01.85 /usr/sbin/httpd -k start -DSSL
15719 apache 20 0 310m 10m 4048 R 0.6 0.3 0:02.32 /usr/sbin/httpd -k start -DSSL
15781 apache 20 0 318m 18m 4044 R 0.6 0.5 0:01.91 /usr/sbin/httpd -k start -DSSL
15788 apache 20 0 312m 13m 4048 R 0.6 0.4 0:02.13 /usr/sbin/httpd -k start -DSSL
15823 apache 20 0 310m 11m 4060 R 0.6 0.3 0:02.04 /usr/sbin/httpd -k start -DSSL
15837 apache 20 0 311m 12m 4052 R 0.6 0.3 0:01.64 /usr/sbin/httpd -k start -DSSL
On Sundays, the website has to perform a pretty large query, with a couple of left joins across different tables.
The website runs on a VPS with 2 x 2.4 GHz processors and 4 GB of RAM. The database runs on an SSD VPS with 2 x 2.4 GHz processors and 2 GB of RAM.
On that particular Sunday, I also found this message in the server's error log:
[Sun Nov 24 15:03:34 2013] [error] server reached MaxClients setting, consider raising the MaxClients setting
The website is built with the PHP CodeIgniter framework, and it worked fine for the first 8 weeks on shared hosting (with the same code). After those weeks the problem started, which is why I decided to move to a VPS. But the problem persists.
I have absolutely no clue where things are going wrong, so any help would be highly appreciated.
Best Answer
The answer to your question is to leverage memory caching as much as possible (e.g. memcache, Varnish, etc.), and then use nginx, which you can scale horizontally, with a php-fpm pool behind it that is sized appropriately for your load and fully meshed with the upstream nginx boxes.
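To make "appropriately sized" a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is not an official sizing formula: the function names are made up for illustration, the 1000 MB reserved for the OS is an assumption, and the 15 MB per worker is only loosely read off the RES column in the top output above (RES includes shared pages, so the real per-process cost may differ). The request rate and service time in the second function are hypothetical inputs you would have to measure yourself.

```python
import math


def max_workers(total_ram_mb, reserved_mb, per_worker_mb):
    """Upper bound on worker processes that fit in RAM without swapping."""
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    usable = total_ram_mb - reserved_mb
    # Never report a negative count if the reservation exceeds total RAM.
    return max(usable // per_worker_mb, 0)


def workers_needed(requests_per_second, seconds_per_request):
    """Little's law: average concurrency = arrival rate * time per request."""
    return math.ceil(requests_per_second * seconds_per_request)


# Roughly 3800 MB total (from the top output), 1000 MB reserved (assumption),
# 15 MB per worker (assumption based on the RES column).
ceiling = max_workers(3800, 1000, 15)

# Hypothetical load: 20 requests/s, each taking 0.5 s while the query runs.
demand = workers_needed(20, 0.5)

print(f"memory ceiling: {ceiling} workers, estimated demand: {demand} workers")
```

If the estimated demand is above the memory ceiling, raising the worker limit will only push the box into swap; shortening the time per request (for example by caching the query result) is the lever that actually helps.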
Once you get to a certain level of traffic, it's not so much about throwing hardware at the problem as about leveraging caching and having separate tiers that can be upgraded and updated individually.
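As an illustration of what caching buys you here, below is a minimal cache-aside sketch in Python. It uses an in-process dictionary with a time-to-live as a stand-in for memcache, so it only shows the pattern, not a real memcache client. `TTLCache`, `run_sunday_query` and the key `"sunday_content"` are hypothetical names, and the query function just returns placeholder rows instead of hitting MySQL.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for memcache: entries expire after a TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key; call compute() only on a miss."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive work
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value


calls = 0


def run_sunday_query():
    """Hypothetical placeholder for the expensive multi-join query."""
    global calls
    calls += 1
    return ["row-1", "row-2"]


cache = TTLCache(ttl_seconds=60)
for _ in range(1000):  # 1000 page views arriving within the TTL window
    rows = cache.get_or_compute("sunday_content", run_sunday_query)

print(f"page views: 1000, database queries: {calls}")
```

With a 60-second TTL, the 1000 simulated page views trigger the expensive query once. In a real setup the cache would live in a shared service such as memcache so that every web worker benefits from the same entry.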
You can't have a super high availability site on a single VPS unless it's static HTML, and even then Varnish is ideal.
Get a pair of HAProxy frontend load balancers, distributing to Varnish, pulling from nginx / PHP / memcache / Redis / MySQL (Postgres). That's it in a nutshell :P