We have a MySQL 5.0.77
Master-Slave replication. The replication was not properly running for the past few weeks and it was giving Duplicate entry error 1062
. The Set Global Skip-counter
option didn't help, so I had to skip the error no.1062
by adding it to the /etc/my.cnf
file and then it reported table doesn't exist error in one particular database.
I have then taken a mysqldump
of that database and restored in Slave last weekend. Then the Slave IO_Thread
and Slave_SQL
both started running fine, and it looked like the replication was back on track. The Seconds_behind_master
value was very high and then it started reducing for the past 4 days.
When I checked the slave replication status today, I found that the seconds_behind_master
value is keep on increasing since morning. I stopped the slave IO_Thread
once and then the seconds_behind_master became Null. Then after I started the IO_thread
the value became the same and kept on increasing.
I see one process is running from morning
system user gss-app Connect 9535736 copy to tmp table ALTER TABLE
queue_clicksADD INDEX(
puid)
Please help me to fix this issue. Thanks.
#mysql> show slave status\G;
`*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 203.x.x.x
Master_User: replication
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000990
Read_Master_Log_Pos: 185674180
Relay_Log_File: mysqld-relay-bin.000224
Relay_Log_Pos: 9286354
Relay_Master_Log_File: mysql-bin.000774
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 472142385
Relay_Log_Space: 112995681998
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 9533355
1 row in set (0.00 sec)`
Best Answer
I won't worry about it if the IO and SQL are running, as well as the
Relay_Master_Log_File
is catching up with theMaster_Log_File
. I believe the delay is in the fact that your total relay log file is huge, approx. 105G -Relay_Log_Space: 112995681998
and considering that the Slave is at 000774 position and master is at 000990 position, there are a total of 214 binary logs of approximately 468M each (105G/214) waiting to be replayed on the slave.My advice is to keep an eye on the
Relay_Master_Log_File
and make sure that is is going up and catching up with theMaster_Log_File
. I also see that the master host is in a public IP address, is this replication taking place over a public network or slow WAN? That could be introducing a delay, that faster the link the better.