Nginx – php-fpm can’t handle concurrent connections

centos7 mariadb nginx php-fpm

I have a server with 94 GB of RAM and 24 CPU cores that can't handle 40 concurrent connections. Using the ab tool, I send 40 requests, all at the same time:

ab -n 40 -c 40 http://ip-address/

The first run completes quickly:

Concurrency Level:      40
Time taken for tests:   1.951 seconds
Complete requests:      40
Failed requests:        0
Non-2xx responses:      40
Total transferred:      20280 bytes
HTML transferred:       0 bytes
Requests per second:    20.50 [#/sec] (mean)
Time per request:       1951.311 [ms] (mean)
Time per request:       48.783 [ms] (mean, across all concurrent requests)
Transfer rate:          10.15 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       27   73  28.7     75     120
Processing:   540 1008 129.6    998    1256
Waiting:      536 1007 129.9    998    1255
Total:        575 1081 153.1   1071    1376

Percentage of the requests served within a certain time (ms)
  50%   1071
  66%   1177
  75%   1208
  80%   1237
  90%   1261
  95%   1322
  98%   1376
  99%   1376
 100%   1376 (longest request)

But running the same command a second time results in a connection timeout, and I can't SSH into the server for the next couple of minutes.

The page I'm requesting executes 760 queries, 748 of which are SELECT queries. MariaDB max_connections is set to 500, and this is the PHP-FPM pool configuration:

listen = 127.0.0.1:9000
listen.allowed_clients = 127.0.0.1
listen.owner = nginx
listen.group = nginx
listen.mode = 0660
user = xxxx ; modified
group = xxxx ; modified
pm = ondemand
pm.max_children = 1000
pm.start_servers = 200
pm.min_spare_servers = 50
pm.max_spare_servers = 100
pm.max_requests = 500
request_terminate_timeout = 100s
pm.process_idle_timeout = 3s
php_admin_value[error_log] = /var/log/php-fpm/pool-error.log
php_admin_flag[log_errors] = on
php_value[session.save_handler] = files
php_value[session.save_path] = /var/lib/php/session

I set pm to ondemand so I think max_children wouldn't apply anymore.

I also had a tough time working with SELinux. Could it be the culprit?

Update

Output of mysqltuner.pl:

Currently running supported MySQL version 10.4.6-MariaDB
[OK] Operating on 64-bit architecture

-------- Log file Recommendations ------------------------------------------------------------------
[--] Log file: /var/lib/mysql/localhost.localdomain.err(0B)
[!!] Log file /var/lib/mysql/localhost.localdomain.err doesn't exist
[!!] Log file /var/lib/mysql/localhost.localdomain.err isn't readable.

-------- Storage Engine Statistics -----------------------------------------------------------------
[--] Status: +Aria +CSV +InnoDB +MEMORY +MRG_MyISAM +MyISAM +PERFORMANCE_SCHEMA +SEQUENCE 
[--] Data in MyISAM tables: 72.0K (Tables: 12)
[--] Data in InnoDB tables: 1.9G (Tables: 449)
[--] Data in MEMORY tables: 0B (Tables: 17)
[OK] Total fragmented tables: 0

-------- Analysis Performance Metrics --------------------------------------------------------------
[--] innodb_stats_on_metadata: OFF
[OK] No stat updates during querying INFORMATION_SCHEMA.

-------- Security Recommendations ------------------------------------------------------------------
[OK] There are no anonymous accounts for any database users
[OK] All database users have passwords assigned
[!!] There is no basic password file list!

-------- CVE Security Recommendations --------------------------------------------------------------
[--] Skipped due to --cvefile option undefined

-------- Performance Metrics -----------------------------------------------------------------------
[--] Up for: 3m 0s (15K q [85.489 qps], 401 conn, TX: 44M, RX: 4M)
[--] Reads / Writes: 99% / 1%
[--] Binary logging is disabled
[--] Physical Memory     : 94.2G
[--] Max MySQL memory    : 154.1G
[--] Other process memory: 0B
[--] Total buffers: 50.4G global + 52.7M per thread (2000 max threads)
[--] P_S Max memory usage: 867M
[--] Galera GCache Max memory usage: 0B
[OK] Maximum reached memory usage: 69.0G (73.22% of installed RAM)
[!!] Maximum possible memory usage: 154.1G (163.47% of installed RAM)
[!!] Overall possible memory usage with other process exceeded memory
[OK] Slow queries: 0% (0/15K)
[OK] Highest usage of available connections: 17% (346/2000)
[OK] Aborted connections: 0.50%  (2/401)
[OK] Query cache is disabled by default due to mutex contention on multiprocessor machines.
[OK] Sorts requiring temporary tables: 0% (0 temp sorts / 1K sorts)
[!!] Joins performed without indexes: 47
[!!] Temporary tables created on disk: 55% (519 on disk / 927 total)
[!!] Thread cache hit rate: 13% (346 created / 401 connections)
[OK] Table cache hit rate: 98% (589 open / 595 opened)
[OK] Open file limit used: 1% (81/8K)
[OK] Table locks acquired immediately: 100% (552 immediate / 552 locks)

-------- Performance schema ------------------------------------------------------------------------
[--] Memory used by P_S: 867.5M
[--] Sys schema isn't installed.

-------- ThreadPool Metrics ------------------------------------------------------------------------
[--] ThreadPool stat is enabled.
[--] Thread Pool Size: 24 thread(s).
[--] Using default value is good enough for your version (10.4.6-MariaDB)

-------- MyISAM Metrics ----------------------------------------------------------------------------
[!!] Key buffer used: 18.2% (24M used / 134M cache)
[OK] Key buffer size / total MyISAM indexes: 128.0M/23.0K

-------- InnoDB Metrics ----------------------------------------------------------------------------
[--] InnoDB is enabled.
[--] InnoDB Thread Concurrency: 0
[OK] InnoDB File per table is activated
[OK] InnoDB buffer pool / data size: 50.0G/1.9G
[OK] Ratio InnoDB log file size / InnoDB Buffer pool size: 6.0G * 2/50.0G should be equal to 25%
[OK] InnoDB buffer pool instances: 50
[--] Number of InnoDB Buffer Pool Chunk : 400 for 50 Buffer Pool Instance(s)
[OK] Innodb_buffer_pool_size aligned with Innodb_buffer_pool_chunk_size & Innodb_buffer_pool_instances
[OK] InnoDB Read buffer efficiency: 99.92% (31587444 hits/ 31612740 total)
[!!] InnoDB Write Log efficiency: 168.97% (49 hits/ 29 total)
[OK] InnoDB log waits: 0.00% (0 waits / 78 writes)

-------- AriaDB Metrics ----------------------------------------------------------------------------
[--] AriaDB is enabled.
[OK] Aria pagecache size / total Aria indexes: 128.0M/2.5M
[OK] Aria pagecache hit rate: 99.1% (59K cached / 542 reads)

-------- TokuDB Metrics ----------------------------------------------------------------------------
[--] TokuDB is disabled.

-------- XtraDB Metrics ----------------------------------------------------------------------------
[--] XtraDB is disabled.

-------- Galera Metrics ----------------------------------------------------------------------------
[--] Galera is disabled.

-------- Replication Metrics -----------------------------------------------------------------------
[--] Galera Synchronous replication: NO
[--] No replication slave(s) for this server.
[--] Binlog format: MIXED
[--] XA support enabled: ON
[--] Semi synchronous replication Master: OFF
[--] Semi synchronous replication Slave: OFF
[--] This is a standalone server

-------- Recommendations ---------------------------------------------------------------------------
General recommendations:
    MySQL was started within the last 24 hours - recommendations may be inaccurate
    Reduce your overall MySQL memory footprint for system stability
    Dedicate this server to your database for highest performance.
    Adjust your join queries to always utilize indexes
    When making adjustments, make tmp_table_size/max_heap_table_size equal
    Reduce your SELECT DISTINCT queries which have no LIMIT clause
    Consider installing Sys schema from https://github.com/mysql/mysql-sys for MySQL
    Consider installing Sys schema from https://github.com/good-dba/mariadb-sys for MariaDB
Variables to adjust:
  *** MySQL's maximum memory usage is dangerously high ***
  *** Add RAM before increasing MySQL buffer variables ***
    join_buffer_size (> 50.0M, or always use indexes with JOINs)
    tmp_table_size (> 100M)
    max_heap_table_size (> 100M)
    thread_cache_size (> 2000)

Best Answer

Benchmarking is a tricky business. Some benchmarks are good at comparing two different hardware or software configurations, but few are good at returning realistic estimates of a system's absolute capability.

This example seems to have slammed the server with 40 identical, CPU-bound requests within a very few milliseconds. Since there are only 24 cores, the processes quickly started stumbling over each other. Note the Total timings in ms (min, mean, max): 575, 1081, 1376; this confirms the stumbling.

If the first request is allowed to run for 575 ms before the 25th is started (and so on), then the processing may proceed nicely 'forever'. That's about 40 per second, not 40 "all at once".
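The arithmetic behind that estimate can be checked with a short sketch. It assumes each request keeps one core busy for the full 575 ms (the minimum Total time in the ab output) and that nothing else, such as the database or disk, limits throughput. The function name is made up for illustration.

```python
def sustainable_rate(cores: int, service_ms: float):
    """Return (requests per second, ms between request starts) at which
    each request can have a core to itself, assuming one core per request."""
    per_second = cores * 1000.0 / service_ms
    spacing_ms = service_ms / cores
    return per_second, spacing_ms


rate, spacing = sustainable_rate(cores=24, service_ms=575)
print(f"{rate:.1f} requests/s, one new request every {spacing:.1f} ms")
# prints: 41.7 requests/s, one new request every 24.0 ms
```

Rounded, that is the "about 40 per second" figure, or roughly one new request every 25 ms.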

So, if ab can spread out the requests (new request every 25ms), you will probably see that there is "no problem".
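If ab cannot pace requests that way, a small script can. The following is only a sketch using the Python standard library: `fetch` and `run_paced` are illustrative names, the URL is the placeholder from the question, and the 30-second timeout is an arbitrary choice.

```python
import threading
import time
import urllib.error
import urllib.request


def fetch(url):
    """Fetch url once and return the elapsed time in ms (NaN on failure)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            resp.read()
    except urllib.error.HTTPError:
        pass  # a non-2xx status is still a completed response
    except OSError:
        return float("nan")  # connection refused, timeout, DNS failure, ...
    return (time.monotonic() - start) * 1000.0


def run_paced(n, interval_s, request=fetch, url="http://ip-address/"):
    """Start n requests, one every interval_s seconds, each in its own thread."""
    timings = [None] * n

    def worker(i):
        timings[i] = request(url)

    threads = []
    begin = time.monotonic()
    for i in range(n):
        # Wait until the scheduled start time of request i.
        delay = begin + i * interval_s - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        t = threading.Thread(target=worker, args=(i,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return timings
```

Calling `run_paced(40, 0.025)` would issue the same 40 requests as the ab run, but spread over about one second instead of all at once.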

Meanwhile, for web pages, 575 ms is a terribly long response time. Let's see the query (or queries) involved, with an eye to speeding things up.

In real life, HTTP requests arrive at the web server somewhere between "simultaneously" and "evenly spaced". Your test has provided only an upper bound.
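The two extremes can be compared with a toy model. This is a simplification, not a description of how the real server schedules work: it assumes every request needs exactly 575 ms of one core, that requests queue first-come-first-served when all 24 cores are busy, and it ignores time-slicing and any contention inside MariaDB.

```python
import heapq


def latencies(arrivals_ms, cores=24, service_ms=575.0):
    """Response time of each request on a server with `cores` cores, where
    every request needs service_ms of one core and waits if all are busy."""
    free_at = [0.0] * cores  # time at which each core next becomes idle
    heapq.heapify(free_at)
    result = []
    for arrival in sorted(arrivals_ms):
        # Take the core that becomes idle first; wait for it if necessary.
        start = max(arrival, heapq.heappop(free_at))
        finish = start + service_ms
        heapq.heappush(free_at, finish)
        result.append(finish - arrival)
    return result


burst = latencies([0.0] * 40)                      # 40 requests at once
spaced = latencies([i * 25.0 for i in range(40)])  # one every 25 ms
print("all at once: max", max(burst), "ms")
print("every 25 ms: max", max(spaced), "ms")
# prints: all at once: max 1150.0 ms
# prints: every 25 ms: max 575.0 ms
```

In this model the burst forces 16 of the 40 requests to wait for a free core, while the evenly spaced stream never queues at all. Real traffic falls somewhere between the two.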
