Nginx – Why is ab erroring with: apr_socket_recv: Connection reset by peer (54) on OSX

abapache-2.2haproxynginx

I'm trying to use ab to benchmark a cluster of 4 ubuntu boxes running nginx that are load balanced by another ubuntu box running haproxy. For those interested I'm following along with the sysadmincasts tutorial series. I'm not able to successfully benchmark the cluster from my laptop running osx:

➜  ~ uname -a
Darwin mbp 15.6.0 Darwin Kernel Version 15.6.0: Mon Aug 29 20:21:34 PDT 2016; root:xnu-3248.60.11~1/RELEASE_X86_64 x86_64
➜  ~ system_profiler SPSoftwareDataType
Software:

    System Software Overview:

      System Version: OS X 10.11.6 (15G1004)
      Kernel Version: Darwin 15.6.0
      Boot Volume: Untitled
      Boot Mode: Normal
      Computer Name: Max’s MacBook Pro
      User Name: Max Bigras (max)
      Secure Virtual Memory: Enabled
      System Integrity Protection: Disabled
      Time since boot: 4 days 6:27
➜  ~ ab -n 10000 -c 25 http://127.0.0.1:8080/
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
apr_socket_recv: Connection reset by peer (54)

I found another post that mentioned the possibility of problem with socket limits in osx. However when I tried the solution mentioned it still didn't work:

➜  ~ sysctl kern.maxfiles
kern.maxfiles: 1048600
➜  ~ ulimit -S -n
1048576
➜  ~ ab -n 10000 -c 25 http://127.0.0.1:8080/
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
apr_socket_recv: Connection reset by peer (54)
Total of 25 requests completed

I know I'm hitting my servers at least a little bit because I checked the log files for one of the web nodes running nginx

sshing into web1 (an ubuntu box running nginx):

vagrant@web1:~$ echo $USER
vagrant
vagrant@web1:~$ sudo tail -f /var/log/nginx/access.log
10.0.15.11 - - [02/Oct/2016:06:53:25 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:53:25 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:53:25 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:53:25 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:53:25 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:53:25 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:53:25 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:53:25 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:53:25 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:53:25 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:54:59 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:54:59 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:54:59 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:54:59 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:54:59 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:54:59 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:54:59 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:54:59 +0000] "GET / HTTP/1.0" 200 632 "-" "ApacheBench/2.3"
10.0.15.11 - - [02/Oct/2016:06:55:09 +0000] "GET / HTTP/1.1" 200 632 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36"

trying ab again on osx:

➜  ~ echo $USER
max
➜  ~ date
Sat Oct  1 23:54:54 PDT 2016
➜  ~ ab -n 10000 -c 25 http://127.0.0.1:8080/
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
apr_socket_recv: Connection reset by peer (54)
Total of 41 requests completed
➜  ~ date
Sat Oct  1 23:55:02 PDT 2016
➜  ~

Above you can see that in addition to trying to benchmark the cluster I also hit it with my browser, as can be seen at the bottom on /var/log/nginx/access.log. It definitely seems like the problem has to do there be a limit on the number of requests I can make. If I lower the number of requests and the concurrency value then it works:

➜  ~ ab -n 100 -c 1 http://127.0.0.1:8080/
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient).....done


Server Software:        nginx
Server Hostname:        127.0.0.1
Server Port:            8080

Document Path:          /
Document Length:        632 bytes

Concurrency Level:      1
Time taken for tests:   0.218 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      83800 bytes
HTML transferred:       63200 bytes
Requests per second:    458.17 [#/sec] (mean)
Time per request:       2.183 [ms] (mean)
Time per request:       2.183 [ms] (mean, across all concurrent requests)
Transfer rate:          374.95 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.3      0       2
Processing:     1    2   0.8      2       5
Waiting:        1    2   0.7      1       5
Total:          1    2   0.9      2       6
WARNING: The median and mean for the waiting time are not within a normal deviation
        These results are probably not that reliable.

Percentage of the requests served within a certain time (ms)
  50%      2
  66%      2
  75%      2
  80%      2
  90%      3
  95%      4
  98%      6
  99%      6
 100%      6 (longest request)

So how do I configure osx so I can test for 10000 requests with a concurrency of 25 using ab?

Edit: Add nginx.conf

vagrant@web1:~$ cat /etc/nginx/nginx.conf
# Ansible managed: /home/vagrant/templates/nginx.conf.j2 modified on 2016-09-11 14:17:19 by vagrant on mgmt
user              www-data;

worker_processes  1;
pid        /var/run/nginx.pid;
worker_rlimit_nofile 1024;

events {
    worker_connections  512;
}


http {

        include /etc/nginx/mime.types;
        default_type application/octet-stream;
        tcp_nopush "on";
        tcp_nodelay "on";
        #keepalive_timeout "65";
        access_log "/var/log/nginx/access.log";
        error_log "/var/log/nginx/error.log";
        server_tokens off;
        types_hash_max_size 2048;

        # https://philio.me/backend-server-host-name-as-a-custom-header-with-nginx/
        add_header X-Backend-Server $hostname;

        # disable cache used for testing
        add_header Cache-Control private;
        add_header Last-Modified "";
        sendfile off;
        expires off;
        etag off;

        include /etc/nginx/conf.d/*.conf;
        include /etc/nginx/sites-enabled/*;
}

Best Answer

The Connection reset by peer error means that your web server closed the connection before responding to your request. So, the load is too big for your server and it starts to drop connections.

You need to study your server setup to see why this happens. Study your log files.

This error doesn't happen because of your client side.