MySql server master-master seconds_behind_master jumps

MySQLmysql-replicationreplication

I'm running a 4 servers master-master cluster of MySql. (2 servers version 5.1, and 2 version 5.5)

While checking the slave status, i see the seconds_behind_master at 0, and half a second after i see it jumps to 2000, and so fourth.

What could it possibly be? How can i debug it?

Replication topology:
1 -> 2 -> 3 -> 4 -> 1

UPDATE

It seems that server 3 has its SBM at 0, while the other servers are jumping up and down.
Does that help?

UPDATE 2
It seems that the issue is with server 1.
When creating a test table in server 4, checking the relay log in server 1 shows the create statement was copied to the relay log in server 1 instantly, but the table is not created.
It looks like the server is busy doing something, and there is a huge delay between when the server gets the statement, and when it executes it.

UPDATE 3
The same thing happens on server 4.

UPDATE 4
Ok i found the issue.
Servers 1 2 & 4 were having "invalidating query cache entries (table)" stuck in their replication thread.
After disabling cache, server 4 is ok but 1&2 are still having this issue.

It looks like a common bug:
http://bugs.mysql.com/bug.php?id=60696

If anyone knows how to fix it, i would be glad to hear

Best Answer

There is one flaw with mysql's seconds_behind_master value: it only takes into account the position relative to one upstream hop away. Easiest demonstrated with a slightly simpler replication topology:

server1 -> server2 -> server3

If server2 falls behind, and is processing some long-running queries, the following will happen, assuming 00:00 as start point:

00:00: Everyone ok
00:01: server1 writes two 10-minute queries to the binlog, no replication delay anywhere
00:02: server2 starts processing query one. Replication delay for server2 starts growing, replication delay for server3 stays zero
10:02: server2 is done with query one, starts processing query two. server2 replication delay is growing still. server3 replication delay suddenly jumps to 10 minutes.
20:02: server2 is done with query 2, replication delay zero again. Server3 will be done with query 3, replication delay jumps back to zero, and then back up to 10 as it processes the next query.

So, the jumpy behaviour is caused by not using a global timestamp for replication delay, but simply the delay behind the last "hop" in the replication chain. We found this severely annoying and now use MySQL's event scheduler to update a timer table on each master every second, so we can actually see actual delay from the global master (in a non-ring topology) or delay from any peer in a ring.