My MySQL slave is spending a lot of time in Slave_SQL_Running_State: System lock
. I can see that the system is currently I/O write bound, and that it is processing the log, although slowly. Show processlist
doesn't show anything other than "Waiting for master to send event" and "System lock" when it is in this state.
All my tables (other than the system tables) are InnoDB, and external locking is disabled. What is the slave doing in this state?
Here's some info that has been requested:
First, this is MySQL 5.6 community on an Amazon EC2 instance, with all storage on EBS.
mysql> show processlist;
+----+-------------+-----------+---------------+---------+--------+----------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-------------+-----------+---------------+---------+--------+----------------------------------+------------------+
| 1 | system user | | NULL | Connect | 26115 | Waiting for master to send event | NULL |
| 2 | system user | | NULL | Connect | 402264 | System lock | NULL |
| 14 | readonly | localhost | theshadestore | Query | 0 | init | show processlist |
+----+-------------+-----------+---------------+---------+--------+----------------------------------+------------------+
3 rows in set (0.00 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 184.106.16.14
Master_User: replicant
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: bin-log.000764
Read_Master_Log_Pos: 505452667
Relay_Log_File: relay-log.000197
Relay_Log_Pos: 345413863
Relay_Master_Log_File: bin-log.000746
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 345413702
Relay_Log_Space: 19834085375
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 402263
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 307009
Master_UUID: b1bf9a19-dac0-11e2-8ffa-b8ca3a5bce90
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: System lock
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)
Best Answer
Databases running on distributed storage facepalm. I would benchmark the filesystem running on top of the EC2 EBS storage system. Probably the simplest method is to use something like
s=$(date +%s); dd if=/dev/zero of=<database-dir> bs=1M count=512; e=$(date +%s); echo "scale=4; 512 / ( $e - $s )" | bc
. That assumes you have 512 MB to spare. Now, the problem with this benchmarking is that (1) it doesn't take into account caching effects, and (2) the resolution is not very fine. But if this test is slow, then the problem is definitely with EC2 EBS. If the test is fast or nominal, we have to dig further and use more sophisticated techniques.The bonnie++ program is somewhat adequate, but it doesn't (AFAIK) flush the OS buffers between the write and read. Still, you should get an idea with something like
bonnie++ -u mysql -r 8 -s 16:512 -n 1 -b -d <mysql-data-directory>
. When I do this on a VM running on local storage, I get:Here is what I get when running on a VM over NFS: