I have an ubuntu apache/php server running php doing appx 100 hits/sec and a PHP cron running in the background.
I get occasionally high CPU load on one of the Apache processes which stays high regardless of traffic or cron activity. It seems to me that its stuck in some kind of loop or something.
Below you will find the top and strace info.
How can I find where the bad code is and what causes this?
top - 14:45:24 up 3 days, 3:38, 1 user, load average: 5.10, 5.88, 5.85
Tasks: 163 total, 5 running, 158 sleeping, 0 stopped, 0 zombie
Cpu(s): 47.8%us, 18.5%sy, 0.0%ni, 10.2%id, 0.0%wa, 0.0%hi, 1.8%si, 21.6%st
Mem: 7885012k total, 3858484k used, 4026528k free, 177444k buffers
Swap: 0k total, 0k used, 0k free, 1037868k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10736 www-data 20 0 769m 559m 478m R 69 7.3 29:08.30 apache2
10844 www-data 20 0 824m 601m 492m S 17 7.8 4:37.90 apache2
1016 root 20 0 242m 25m 4628 S 6 0.3 162:07.93 scalarizr
9030 www-data 20 0 879m 619m 492m S 4 8.0 5:06.82 apache2
20216 www-data 20 0 747m 228m 170m S 4 3.0 0:01.94 apache2
10807 www-data 20 0 814m 584m 492m S 3 7.6 4:54.10 apache2
10455 www-data 20 0 831m 574m 492m S 3 7.5 4:32.65 apache2
10495 www-data 20 0 849m 592m 492m S 3 7.7 4:41.10 apache2
10884 www-data 20 0 840m 581m 492m S 3 7.6 4:25.06 apache2
^CProcess 10736 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
74.55 0.148052 1 109755 gettimeofday
25.36 0.050370 0 164634 clock_gettime
0.09 0.000178 0 54878 poll
------ ----------- ----------- --------- --------- ----------------
100.00 0.198600 329267 total
root@ec2-67-202-54-36:~# ^C
Best Answer
I would recommend enabling Apache mod_status and turn ExtendedStatus On. Slicehost has an excellent article on how to accomplish this (I would use the "elinks" package vs. "lynx" but that is a personal preference). When you view the Apache server-status URL, there will be a PID, VHost and Request columns - they should go a long way towards pinpointing the URI that is being called which you can use to trace back to the specific code that is being run.
Here is a customized version of the Slicehost article to enable mod_status:
Then to view the server-status: