NEW (pcie) server:
Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz, 1TB NVMe disks, 128 GB RAM, installed Debian 4.9.65-3+deb9u1, Ver 15.1 Distrib 10.1.26-MariaDB
moved binary db files from
OLD server:
Intel(R) Xeon(R) CPU E5-1630 v3 @ 3.70GHz, SSD disk, 64 GB RAM, FreeBSD 11.0-STABLE, 10.1.21-MariaDB
On servers is running just mysql, I copy my.ini file, config files are same.
Run mysqlslap benchmark (always restarted server before doing each test):
root@db1:/tmp # mysqlslap --user=root --query=/tmp/slap2.sql --create-schema=mydatabase --concurrency=1 --iterations=1
Benchmark
Average number of seconds to run all queries: 59.573 seconds
Minimum number of seconds to run all queries: 59.573 seconds
Maximum number of seconds to run all queries: 59.573 seconds
Number of clients running queries: 1
Average number of queries per client: 100000
root@pcie:~# mysqlslap --user=root --query=/tmp/slap2.sql --create-schema=mydatabase --concurrency=1 --iterations=1
Benchmark
Average number of seconds to run all queries: 31.151 seconds
Minimum number of seconds to run all queries: 31.151 seconds
Maximum number of seconds to run all queries: 31.151 seconds
Number of clients running queries: 1
Average number of queries per client: 100000
====================================================================================================================================
root@db1:/tmp # mysqlslap --user=root --query=/tmp/slap2.sql --create-schema=mydatabase --concurrency=100 --iterations=1
Benchmark
Average number of seconds to run all queries: 568.082 seconds
Minimum number of seconds to run all queries: 568.082 seconds
Maximum number of seconds to run all queries: 568.082 seconds
Number of clients running queries: 100
Average number of queries per client: 100000
root@pcie:/etc/security/limits.d# mysqlslap --user=root --query=/tmp/slap2.sql --create-schema=mydatabase --concurrency=100 --iterations=1
Benchmark
Average number of seconds to run all queries: 2059.712 seconds
Minimum number of seconds to run all queries: 2059.712 seconds
Maximum number of seconds to run all queries: 2059.712 seconds
Number of clients running queries: 100
Average number of queries per client: 100000
====================================================================================================================================
root@db1:/tmp # mysqlslap --user=root --query=/tmp/slap2.sql --create-schema=mydatabase --concurrency=8 --iterations=1
Benchmark
Average number of seconds to run all queries: 134.003 seconds
Minimum number of seconds to run all queries: 134.003 seconds
Maximum number of seconds to run all queries: 134.003 seconds
Number of clients running queries: 8
Average number of queries per client: 100000
root@pcie:/etc/security/limits.d# mysqlslap --user=root --query=/tmp/slap2.sql --create-schema=mydatabase --concurrency=8 --iterations=1
Benchmark
Average number of seconds to run all queries: 133.410 seconds
Minimum number of seconds to run all queries: 133.410 seconds
Maximum number of seconds to run all queries: 133.410 seconds
Number of clients running queries: 8
Average number of queries per client: 100000
As you can see, NEW (pcie) server is performing very good when running concurrency=1, performance is same when concurrency=8, and performance is very bad when concurrency=100.
Here are interesting results using internal benchmark:
root@pcie:~/slap/employees_db# mysqlslap --auto-generate-sql --concurrency=8 --iterations=500 --verbose
Average number of seconds to run all queries: 0.002 seconds
DB1: Average number of seconds to run all queries: 0.002 seconds
root@pcie:~/slap/employees_db# mysqlslap --auto-generate-sql --concurrency=16 --iterations=500
Average number of seconds to run all queries: 0.007 seconds
DB1: Average number of seconds to run all queries: 0.005 seconds
root@pcie:~/slap/employees_db# mysqlslap --auto-generate-sql --concurrency=32 --iterations=500
Average number of seconds to run all queries: 0.015 seconds
DB1: Average number of seconds to run all queries: 0.011 seconds
root@pcie:~/slap/employees_db# mysqlslap --auto-generate-sql --concurrency=64 --iterations=500
Average number of seconds to run all queries: 0.033 seconds
DB1: Average number of seconds to run all queries: 0.029 seconds
root@pcie:~/slap/employees_db# mysqlslap --auto-generate-sql --concurrency=128 --iterations=500
Average number of seconds to run all queries: 0.074 seconds
DB1: Average number of seconds to run all queries: 0.097 seconds
root@pcie:~/slap/employees_db# mysqlslap --auto-generate-sql --concurrency=256 --iterations=500
Average number of seconds to run all queries: 0.197 seconds
DB1: Average number of seconds to run all queries: 0.293 seconds
root@pcie:~/slap/employees_db# mysqlslap --auto-generate-sql --concurrency=512 --iterations=500
Average number of seconds to run all queries: 0.587 seconds
DB1: Average number of seconds to run all queries: 1.009 seconds
Internal mysqlsap benchmark is too synthetic, so I load employees db:
https://launchpad.net/test-db/employees-db-1/1.0.6/+download/employees_db-full-1.0.6.tar.bz2
SQL:
#less /root/slap/select_query.sql
SELECT emp_no, first_name, last_name, gender FROM employees LIMIT 10;
SELECT emp_no, first_name, last_name, gender FROM employees ORDER BY last_name ASC LIMIT 10;
SELECT COUNT(emp_no) FROM employees WHERE last_name = 'Aamodt';
SELECT last_name, COUNT(emp_no) AS num_emp FROM employees GROUP BY last_name ORDER BY num_emp DESC LIMIT 10;
SELECT employees.* FROM employees LEFT JOIN dept_emp ON ( dept_emp.emp_no = employees.emp_no ) LEFT JOIN salaries ON ( salaries.emp_no = salaries.emp_no ) WHERE employees.first_name LIKE '%Jo%' AND salaries.from_date > '1993-01-21' AND salaries.to_date < '1998-01-01' LIMIT 0, 100;
Results:
root@pcie:~/slap# mysqlslap --pre-query="RESET QUERY CACHE;" --create-schema=employees --query="/root/slap/select_query.sql" --iterations=10 --concurrency=1
Average number of seconds to run all queries: 0.459 seconds
DB1: Average number of seconds to run all queries: 0.627 seconds
root@pcie:~/slap# mysqlslap --pre-query="RESET QUERY CACHE;" --create-schema=employees --query="/root/slap/select_query.sql" --iterations=10 --concurrency=2
Benchmark
Average number of seconds to run all queries: 0.473 seconds
DB1: Average number of seconds to run all queries: 0.626 seconds
root@pcie:~/slap# mysqlslap --pre-query="RESET QUERY CACHE;" --create-schema=employees --query="/root/slap/select_query.sql" --iterations=10 --concurrency=4
Average number of seconds to run all queries: 0.486 seconds
DB1: Average number of seconds to run all queries: 0.656 seconds
root@pcie:~/slap# mysqlslap --pre-query="RESET QUERY CACHE;" --create-schema=employees --query="/root/slap/select_query.sql" --iterations=10 --concurrency=8
Average number of seconds to run all queries: 0.569 seconds
DB1: Average number of seconds to run all queries: 1.136 seconds
root@pcie:~/slap# mysqlslap --pre-query="RESET QUERY CACHE;" --create-schema=employees --query="/root/slap/select_query.sql" --iterations=10 --concurrency=16
Benchmark
Average number of seconds to run all queries: 0.948 seconds
DB1: Average number of seconds to run all queries: 1.750 seconds
root@pcie:~/slap# mysqlslap --pre-query="RESET QUERY CACHE;" --create-schema=employees --query="/root/slap/select_query.sql" --iterations=10 --concurrency=32
Average number of seconds to run all queries: 1.650 seconds
DB1: Average number of seconds to run all queries: 2.455 seconds
root@pcie:~/slap# mysqlslap --pre-query="RESET QUERY CACHE;" --create-schema=employees --query="/root/slap/select_query.sql" --iterations=10 --concurrency=64
Average number of seconds to run all queries: 3.306 seconds
DB1: Average number of seconds to run all queries: 3.176 seconds
root@pcie:~/slap# mysqlslap --pre-query="RESET QUERY CACHE;" --create-schema=employees --query="/root/slap/select_query.sql" --iterations=10 --concurrency=128
Average number of seconds to run all queries: 6.744 seconds
DB1: Average number of seconds to run all queries: 5.737 seconds
root@pcie:~/slap# mysqlslap --pre-query="RESET QUERY CACHE;" --create-schema=employees --query="/root/slap/select_query.sql" --iterations=10 --concurrency=256
Average number of seconds to run all queries: 13.474 seconds (verified 2nd run: 12.883 seconds)
DB1: Average number of seconds to run all queries: 3.451 seconds (verified 2nd run: 4.935 seconds)
root@pcie:~/slap# mysqlslap --pre-query="RESET QUERY CACHE;" --create-schema=employees --query="/root/slap/select_query.sql" --iterations=10 --concurrency=512
Average number of seconds to run all queries: 26.085 seconds (verified 2nd run: 26.307 seconds)
DB1: Average number of seconds to run all queries: 15.862 seconds (verified 2nd run: 11.280 seconds)
with 512 concurrency, QUERY CACHE disabled:
OLD db1 server: Average number of seconds to run all queries: 72.710s
NEW PCIE server: Average number of seconds to run all queries: 29.774s
Anybody have idea what to check, how to optimize the setup? I am using MyISAM tables only in my DB, mariadb config is same on both servers…
Update with more info: Initially I installed on NEW DB server FREEBSD, MariaDB performance was bad, I thought it is OS related problem, but same symptoms are on Linux. During benchmark there is basically no IO after filling the cache, so this is not IO related problem.
Thanks for any ideas.
Best Answer
I am going to answer my own question. I spent couple of days solving this problem. Problem was NUMA. But let's peek into this more.
For understanding the problem we need CPU information:
Important is NUMA node(s): 2. The server got TWO CPUs, on TWO nodes. Read more about NUMA, read more about NUMA and mysql based databases.
To speed up the queries we need to run MariaDB using numactl. I made some benchmarks, and the best setting was
Overall the Query Cache is not scaling well on multiple threads, so some DBA suggest to turn it off.
Benchmark results (QC is Query Cache), MyISAM tables, concurrency 64 on my database using mysqlslap, simplified output, lower is better:
According to these results the best is specify "numactl --cpunodebind=1 --membind=1" and still have Query Cache turned on. I tried kernel with noibrs noibpb nopti flags, speed gain is quite low, only 2%. What surprised me is bad results with numactl --interleave all
So, if you experience strange slow results on new more powerful server, be sure you understand NUMA, it can save a lot of time for debugging.
Script to run benchmark: