MySQL high CPU/query time usage during concurrent inserts

mariadb, mysql, ubuntu-14.04

I just set up a new server for a large MySQL database of about 70GB.
This database is written to regularly by automated processes that must write their data as quickly as they can.
Previously we had a server with a 120GB SSD, but we switched to HDDs because the amount of data keeps growing.

The problem is that when the processes are running, CPU usage spikes to more than 150% and write operations become extremely slow…

The server has a 4-core, 8-thread CPU, 32GB of RAM and 2x2TB HDDs behind an LSI 2108 RAID controller (RAID 1).
MariaDB 10.0 is the only service running on this machine.
The OS is a freshly installed Ubuntu 14.04.
It has a slightly less powerful slave server, which is why I enabled binlogs.

I tuned the InnoDB setup like this:

query_cache_type        = OFF
tmp_table_size          = 1G
max_heap_table_size     = 1G
transaction-isolation   = READ-COMMITTED
binlog_format           = row
innodb_log_file_size    = 6G
innodb_buffer_pool_size = 24G
innodb_log_buffer_size  = 8M
innodb_file_per_table   = 1
innodb_open_files       = 400
innodb_io_capacity      = 300
innodb_io_capacity_max  = 400
innodb_flush_method     = O_DIRECT
innodb_flush_log_at_trx_commit = 0
innodb_lock_wait_timeout= 240
innodb_use_fallocate = 1
innodb_random_read_ahead = 1
innodb_flush_neighbors = 0
innodb_checksum_algorithm = crc32
innodb_fast_shutdown    = 0
skip-innodb_doublewrite

While the processes are running, the slow log is full of entries like these (the query shown is just one example; it could be an insert, update or delete on any table):

# User@Host: user_prod[user_prod] @ xxxxx [xxx.xxx.xxx.xxx]
# Thread_id: 177018  Schema: user_prod  QC_hit: No
# Query_time: 18.318539  Lock_time: 0.000026  Rows_sent: 0  Rows_examined: 1
SET timestamp=1413450644;
update `pages_objects` set `status_comments` = 'idle', `updated_at` = '2014-10-16 09:10:57' where `id` = '331667763652878';

I'm stuck and can't find any help on Google…
Do you have any idea where the problem could come from?
Thanks 🙂

Edit: A sample of the processlist while my CPU is spiking to 250% (hell yeah!):

+--------+--------------+---------------------------------+--------------+---------+------+----------------+------------------------------------------------------------------------------------------------------+----------+
| Id     | User         | Host                            | db           | Command | Time | State          | Info                                                                                                 | Progress |
+--------+--------------+---------------------------------+--------------+---------+------+----------------+------------------------------------------------------------------------------------------------------+----------+
|    378 | user_prod | server.ip:46542 | user_prod | Sleep   |    0 |                | NULL                                                                                                 |    0.000 |
|   2985 | user_prod | server.ip:60257 | user_prod | Sleep   |    0 |                | NULL                                                                                                 |    0.000 |
|   4001 | user_prod | server.ip:38046 | user_prod | Execute |    0 | preparing      | select * from `pages_users` where `user_id` = '1247143319' and `page_id` = '169449309753828' limit 1 |    0.000 |
|   6533 | user_prod | server.ip:54548 | user_prod | Sleep   |    0 |                | NULL                                                                                                 |    0.000 |
|   7582 | user_prod | server.ip:59995 | user_prod | Sleep   |    0 |                | NULL                                                                                                 |    0.000 |
|  13179 | user_prod | server.ip:33221 | user_prod | Sleep   |    0 |                | NULL                                                                                                 |    0.000 |
|  14624 | user_prod | server.ip:41004 | user_prod | Execute |    0 | Writing to net | select * from `pages_users` where `user_id` = '100000010909375' and `page_id` = '476930419093906' li |    0.000 |
|  54642 | user_prod | server.ip:45540 | user_prod | Execute |    0 | update         | insert into `pages_users` (`user_id`, `page_id`, `updated_at`, `created_at`) values ('1318873669', ' |    0.000 |
|  55244 | user_prod | server.ip:47407 | user_prod | Sleep   |    0 |                | NULL                                                                                                 |    0.000 |
|  55426 | user_prod | server.ip:47983 | user_prod | Sleep   |    0 |                | NULL                                                                                                 |    0.000 |
| 107408 | user_prod | server.ip:57303 | user_prod | Sleep   |    0 |                | NULL                                                                                                 |    0.000 |
| 204661 | user_prod | server.ip:45568 | user_prod | Sleep   |    0 |                | NULL                                                                                                 |    0.000 |
| 204717 | user_prod | server.ip:51573 | user_prod | Sleep   |    0 |                | NULL                                                                                                 |    0.000 |
| 204795 | user_prod | server.ip:52682 | user_prod | Sleep   |    0 |                | NULL                                                                                                 |    0.000 |
| 204844 | user_prod | server.ip:53290 | user_prod | Sleep   |    0 |                | NULL                                                                                                 |    0.000 |
| 204972 | user_prod | server.ip:54717 | user_prod | Sleep   |   20 |                | NULL                                                                                                 |    0.000 |
| 204999 | user_prod | server.ip:55069 | user_prod | Sleep   |   13 |                | NULL                                                                                                 |    0.000 |
| 205006 | user_prod | server.ip:55159 | user_prod | Sleep   |   11 |                | NULL                                                                                                 |    0.000 |
| 205020 | user_prod | server.ip:55377 | user_prod | Sleep   |    7 |                | NULL                                                                                                 |    0.000 |
| 205026 | user_prod | server.ip:55443 | user_prod | Sleep   |    5 |                | NULL                                                                                                 |    0.000 |
| 205028 | user_prod | server.ip:55524 | user_prod | Sleep   |    3 |                | NULL                                                                                                 |    0.000 |
| 205031 | user_prod | server.ip:55569 | user_prod | Sleep   |    2 |                | NULL                                                                                                 |    0.000 |
| 205032 | user_prod | server.ip:55573 | user_prod | Sleep   |    2 |                | NULL                                                                                                 |    0.000 |
| 205034 | user_prod | localhost                       | NULL         | Query   |    0 | init           | show processlist                                                                                     |    0.000 |
+--------+--------------+---------------------------------+--------------+---------+------+----------------+------------------------------------------------------------------------------------------------------+----------+
24 rows in set (0.00 sec)

Best Answer

I would investigate whether the table_open_cache limit is the bottleneck. Perhaps the threads are simply blocking on it, which would cause long execution times and high CPU usage.
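One rough way to check this is to compare the configured limit with the open-table counters. This is only a sketch: the value 2000 in the last statement is a hypothetical example, not a recommendation for this server.

```sql
-- Configured limit on the number of cached open table handles.
SHOW GLOBAL VARIABLES LIKE 'table_open_cache';

-- Open_tables:   tables currently held open in the cache.
-- Opened_tables: total number of table opens since server start.
SHOW GLOBAL STATUS LIKE 'Open%tables';

-- table_open_cache is dynamic, so it can be raised without a restart.
-- 2000 is an illustrative value only.
SET GLOBAL table_open_cache = 2000;
```

If Open_tables sits at the limit and Opened_tables keeps climbing quickly between two runs taken under load, the cache is probably too small. If Opened_tables barely moves, the bottleneck is likely somewhere else.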

Enabling the query cache may also help: set query_cache_type to ON and query_cache_size to 1G.
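In the same style as the settings in the question, that would look roughly like the fragment below. I am assuming the options go in the [mysqld] section of my.cnf and that the server is restarted afterwards.

```ini
[mysqld]
# Values taken from the suggestion above; treat them as a starting point to test.
query_cache_type = ON
query_cache_size = 1G
```

Keep in mind that every write to a table invalidates the cached results for that table, so measure under the real write load before keeping this change.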

If that helps, you could also try playing with its size, because different values can give different results; the "more is better" rule doesn't always apply.
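To compare sizes, the Qcache status counters give a rough picture of how the cache behaves. This is a minimal sketch; the 256M value in the last statement is only an example of trying a smaller size.

```sql
-- Qcache_hits:           queries answered from the cache.
-- Qcache_inserts:        results added to the cache.
-- Qcache_lowmem_prunes:  entries evicted because the cache ran out of memory.
-- Qcache_free_memory:    unused bytes in the cache.
SHOW GLOBAL STATUS LIKE 'Qcache%';

-- query_cache_size is dynamic; 268435456 bytes = 256M (illustrative value).
SET GLOBAL query_cache_size = 268435456;
```

A hit count that stays low relative to inserts suggests the cache is not paying for itself. Many low-memory prunes together with little free memory suggest the size is too small for the working set.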