Environment: Ubuntu 10.04 LTS, Passenger, Nginx 1.0.6, MySQL, Ruby 1.9.2, Rails 3.1
After some time, the server accumulates a gradually increasing number of processes stuck at 100% CPU, for example:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2393 avitus 20 0 496m 381m 1392 R 100 9.4 25:10.74 Rack: /home/web ...
Running strace on any of the stuck PIDs gives the following:
Process 2393 attached with 3 threads - interrupt to quit
[pid 2396] futex(0x8ca80e4, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2394] restart_syscall(<... resuming interrupted call ...>) = -1 ETIMEDOUT (Connection timed out)
[pid 2394] gettimeofday({1322590778, 346573}, NULL) = 0
[pid 2394] futex(0x821db60, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2394] clock_gettime(CLOCK_REALTIME, {1322590778, 346885177}) = 0
[pid 2394] futex(0x821db84, FUTEX_WAIT_PRIVATE, 33872659, {0, 9687823}) = -1 ETIMEDOUT (Connection timed out)
[pid 2394] gettimeofday({1322590778, 356921}, NULL) = 0
[pid 2394] futex(0x821db60, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2394] clock_gettime(CLOCK_REALTIME, {1322590778, 357196244}) = 0
[pid 2394] futex(0x821db84, FUTEX_WAIT_PRIVATE, 33872661, {0, 9724756}) = -1 ETIMEDOUT (Connection timed out)
[pid 2394] gettimeofday({1322590778, 367240}, NULL) = 0
[pid 2394] futex(0x821db60, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2394] clock_gettime(CLOCK_REALTIME, {1322590778, 367459723}) = 0
[pid 2394] futex(0x821db84, FUTEX_WAIT_PRIVATE, 33872663, {0, 9780277}) = -1 ETIMEDOUT (Connection timed out)
[pid 2394] gettimeofday({1322590778, 377586}, NULL) = 0
[pid 2394] futex(0x821db60, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2394] clock_gettime(CLOCK_REALTIME, {1322590778, 377807840}) = 0
[pid 2394] futex(0x821db84, FUTEX_WAIT_PRIVATE, 33872665, {0, 9778160}) = -1 ETIMEDOUT (Connection timed out)
[pid 2394] gettimeofday({1322590778, 387932}, NULL) = 0
[pid 2394] futex(0x821db60, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2394] clock_gettime(CLOCK_REALTIME, {1322590778, 388162450}) = 0
[pid 2394] futex(0x821db84, FUTEX_WAIT_PRIVATE, 33872667, {0, 9769550}) = -1 ETIMEDOUT (Connection timed out)
Adding the -c flag to strace gives:
Process 2393 attached with 3 threads - interrupt to quit
Process 2393 detached
Process 2394 detached
Process 2396 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 94.97    0.003172           2      1489       744 futex
  3.74    0.000125           0       745           clock_gettime
  1.29    0.000043           0       745           gettimeofday
  0.00    0.000000           0         1         1 restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.003340                  2980       745 total
I can kill -9 the stuck processes, and the application and server appear to carry on happily. I've run out of ideas for debugging this, so any advice on the cause or on other avenues of investigation would be great to hear.
Best Answer
Try setting passenger_spawn_method to conservative in Passenger. I'm having this issue with Mongo and came across:
http://code.google.com/p/phusion-passenger/issues/detail?id=684
and:
https://github.com/rails/rails/issues/1339
I don't know the underlying cause, but hopefully that will get you going if you haven't already found a solution.
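For reference, here is a minimal sketch of where that setting might go, assuming Passenger is loaded as an Nginx module. The server name and application path are hypothetical placeholders, and your existing passenger_root and passenger_ruby lines are left out:

```nginx
http {
    # passenger_root / passenger_ruby are assumed to be configured already
    # by the Passenger installer and are omitted from this sketch.

    # Start each application process from scratch instead of forking it
    # from a preloaded copy of the app (the "smart" spawn methods).
    passenger_spawn_method conservative;

    server {
        listen 80;
        server_name example.com;      # hypothetical
        root /path/to/app/public;     # hypothetical
        passenger_enabled on;
    }
}
```

Restart Nginx after changing it. The trade-off is that conservative spawning generally uses more memory and starts processes more slowly, since nothing is shared between them.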