This is my nginx.conf (I've updated the config to make sure that there is no PHP involved or any other bottlenecks):
user nginx;
worker_processes 4;
worker_rlimit_nofile 10240;
pid /var/run/nginx.pid;
events
{
worker_connections 1024;
}
http
{
include /etc/nginx/mime.types;
error_log /var/www/log/nginx_errors.log warn;
port_in_redirect off;
server_tokens off;
sendfile on;
gzip on;
client_max_body_size 200M;
map $scheme $php_https { default off; https on; }
index index.php;
client_body_timeout 60;
client_header_timeout 60;
keepalive_timeout 60 60;
send_timeout 60;
server
{
server_name dev.anuary.com;
root "/var/www/virtualhosts/dev.anuary.com";
}
}
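As a quick sanity check on what this config can hold (my arithmetic, not from the question): with worker_processes 4 and worker_connections 1024, nginx caps out at roughly 4 × 1024 simultaneous connections, which is well short of a 10,000-connection test plan. A minimal sketch of the arithmetic:

```shell
# Upper bound on simultaneous connections implied by the config above.
# Keep-alive and proxied connections each consume a slot, so the real
# ceiling is lower than this.
awk 'BEGIN {
  workers = 4      # worker_processes
  conns   = 1024   # worker_connections
  printf "max concurrent connections ~= %d\n", workers * conns
}'
```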
I am using http://blitz.io/play to test my server (I bought the 10,000 concurrent connections plan). In a 30-second run, I get 964 hits and 5,587 timeouts. The first timeout happened 40.77 seconds into the test, when the number of concurrent users was at 200.
During the test, the server load was (top output):
  PID USER   PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
20225 nginx  20   0 48140 6248 1672 S 16.0  0.0 0:21.68 nginx
    1 root   20   0 19112 1444 1180 S  0.0  0.0 0:02.37 init
    2 root   20   0     0    0    0 S  0.0  0.0 0:00.00 kthreadd
    3 root   RT   0     0    0    0 S  0.0  0.0 0:00.03 migration/0
Therefore it is not a server resource issue. What is it, then?
UPDATE 2011-12-09 GMT 17:36.
So far I have made the following changes to make sure that the bottleneck is not TCP/IP. Added to /etc/sysctl.conf:
# These ensure that TIME_WAIT ports either get reused or closed fast.
net.ipv4.tcp_fin_timeout = 1
net.ipv4.tcp_tw_recycle = 1
# TCP memory
net.core.rmem_max = 16777216
net.core.rmem_default = 16777216
net.core.netdev_max_backlog = 262144
net.core.somaxconn = 4096
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
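Edits to /etc/sysctl.conf only take effect once loaded with sysctl -p; a quick way to confirm the kernel is actually running the new values (a sketch, reading /proc/sys directly so it works even without the sysctl binary):

```shell
# Confirm the kernel picked up the new values after `sysctl -p`.
# /proc/sys/<key with dots as slashes> holds the live value.
for key in net/ipv4/tcp_fin_timeout net/core/somaxconn; do
  printf '%s = %s\n' "$key" "$(cat /proc/sys/$key)"
done
```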
Some more debug info:
[root@server node]# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 126767
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
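Note that the ulimit -a output above reflects the root shell, not the nginx workers: worker_rlimit_nofile raises the limit per worker process regardless of the shell's 1024. One way to check what a running process actually got (the pgrep pattern is an assumption about your worker's process title):

```shell
# What limit does a live nginx worker really have? On the server:
#   grep 'Max open files' /proc/$(pgrep -f 'nginx: worker' | head -n1)/limits
# The same check against the current shell, for illustration:
grep 'Max open files' /proc/self/limits
```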
NB: worker_rlimit_nofile is set to 10240 in the nginx config.
UPDATE 2011-12-09 GMT 19:02.
It looks like the more changes I make, the worse it gets, but here is the new config file.
user nginx;
worker_processes 4;
worker_rlimit_nofile 10240;
pid /var/run/nginx.pid;
events
{
worker_connections 2048;
#1,353 hits, 2,751 timeouts, 72 errors - Bummer. Try again?
#1,408 hits, 2,727 timeouts - Maybe you should increase the timeout?
}
http
{
include /etc/nginx/mime.types;
error_log /var/www/log/nginx_errors.log warn;
# http://blog.martinfjordvald.com/2011/04/optimizing-nginx-for-high-traffic-loads/
access_log off;
open_file_cache max=1000;
open_file_cache_valid 30s;
client_body_buffer_size 10M;
client_max_body_size 200M;
proxy_buffers 256 4k;
fastcgi_buffers 256 4k;
keepalive_timeout 15 15;
client_body_timeout 60;
client_header_timeout 60;
send_timeout 60;
port_in_redirect off;
server_tokens off;
sendfile on;
gzip on;
gzip_buffers 256 4k;
gzip_comp_level 5;
gzip_disable "msie6";
map $scheme $php_https { default off; https on; }
index index.php;
server
{
server_name ~^www\.(?P<domain>.+);
rewrite ^ $scheme://$domain$request_uri? permanent;
}
include /etc/nginx/conf.d/virtual.conf;
}
UPDATE 2011-12-11 GMT 20:11.
This is the output of netstat -ntla during the test.
https://gist.github.com/d74750cceba4d08668ea
UPDATE 2011-12-12 GMT 10:54.
Just to clarify, iptables (the firewall) is off while testing.
UPDATE 2011-12-12 GMT 22:47.
This is the sysctl -p | grep mem dump.
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 30
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 8388608 8388608 8388608
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 65536 8388608
net.ipv4.route.flush = 1
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_max = 8388608
net.core.wmem_default = 65536
net.core.netdev_max_backlog = 262144
net.core.somaxconn = 4096
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
UPDATE 2011-12-12 GMT 22:49.
I am using blitz.io to run all the tests. The URL I am testing is http://dev.anuary.com/test.txt, using the following command: --region ireland --pattern 200-250:30 -T 1000 http://dev.anuary.com/test.txt
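A rough back-of-envelope on that pattern (my estimate, not from blitz.io's documentation): ramping 200→250 users over 30 seconds with a 1-second timeout gives roughly average-users × duration ÷ per-request time budget request slots. The earlier result of 964 hits + 5,587 timeouts ≈ 6,551 is close to that ceiling, meaning nearly every slot ended in a timeout:

```shell
# Rough request-slot capacity of: --pattern 200-250:30 -T 1000
awk 'BEGIN {
  users = (200 + 250) / 2   # average concurrency over the ramp
  secs  = 30                # test duration in seconds
  tmo   = 1.0               # -T 1000 ms timeout per request
  printf "max request slots ~= %d\n", users * secs / tmo
}'
```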
UPDATE 2011-12-13 GMT 13:33.
nginx user limits (set in /etc/security/limits.conf):
nginx hard nofile 40000
nginx soft nofile 40000
Best Answer
You will need to dump your network connections during the test. While the server may have near-zero load, your TCP/IP stack could be filling up. Look for TIME_WAIT connections in the netstat output.
If this is the case, then you will want to look into tuning TCP/IP kernel parameters relating to TCP wait states, TCP recycling, and similar settings.
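One way to spot a TIME_WAIT pile-up is to count connections per state. The live command is in the comment; below, the same pipeline runs against a small fabricated sample (addresses made up) so its behavior is clear:

```shell
# On the server during the test:
#   netstat -ant | awk 'NR > 2 {print $6}' | sort | uniq -c | sort -rn
# Demonstrated on a fabricated sample -- column 6 is the TCP state:
printf '%s\n' \
  'tcp 0 0 0.0.0.0:80   0.0.0.0:*       LISTEN' \
  'tcp 0 0 10.0.0.1:80  10.0.0.2:51234  TIME_WAIT' \
  'tcp 0 0 10.0.0.1:80  10.0.0.3:51235  TIME_WAIT' \
  'tcp 0 0 10.0.0.1:80  10.0.0.4:51236  ESTABLISHED' |
awk '{print $6}' | sort | uniq -c | sort -rn
```

A flood of TIME_WAIT lines here, with hardly any ESTABLISHED, points at port exhaustion rather than nginx itself.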
Also, you have not described what is being tested. I always test a mix of content types.
This may not apply in your case, but it is something I do when performance testing: testing different types of files can help you pinpoint the bottleneck.
Even with static content, testing different file sizes is important as well, to get timeouts and other metrics dialed in.
We have some static-content Nginx boxes handling 3,000+ active connections, so Nginx can certainly do it.
Update: Your netstat output shows a lot of open connections. You may want to try tuning your TCP/IP stack. Also, what file are you requesting? Nginx should close the port quickly.
Here is a suggestion for sysctl.conf:
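The exact values from the original answer were not preserved in this copy; purely as an illustration of the "very low" direction described below, a fragment like this is commonly seen (assumed values, not the answerer's exact numbers):

```
# Illustrative only -- these are assumed, commonly seen low settings,
# not the answerer's exact numbers.
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65535
```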
These values are very low, but I have had success with them on high-concurrency Nginx boxes.