Monitoring MySQL Replication – Understanding mk-heartbeat

Tags: maatkit, monitoring, mysql-replication

Seconds_Behind_Master from SHOW SLAVE STATUS is considered an unreliable measure of Slave lag. mk-heartbeat is often offered as a reliable alternative.

Now mk-heartbeat does not even need the Slave to be running.

http://www.maatkit.org/doc/mk-heartbeat.html

Excerpt:

mk-heartbeat is a two-part MySQL and PostgreSQL replication delay monitoring system that doesn't require the slave to be working (in other words, it doesn't rely on SHOW SLAVE STATUS on MySQL).

So my understanding is that you create a DB/table on the Master and run mk-heartbeat with --update like so:

./mk-heartbeat -D heart --table beat -u heartbeat -p XXXXXXXXX --update -h 192.168.2.80

And then on the Slave you point mk-heartbeat at the DB/table on the Master (i.e. you run a GRANT statement on the Master to give the Slave privileges) and run it with --monitor like so:

./mk-heartbeat -D heart --table beat -u heartbeat_slave -p XXXXXXXXX --monitor -h 192.168.2.80

I have done just this. Even when I repeatedly update the 2.8M+ rows in the salaries table of the MySQL sample employees database (which creates Slave lag, at least according to the unreliable Seconds_Behind_Master), the mk-heartbeat --monitor output never changes from:

0s [  0.00s,  0.00s,  0.00s ]

Maybe I haven't produced enough lag. Per the mk-heartbeat docs, if the replication events are propagating in less than half a second, I can expect to see zero seconds of delay:

mk-heartbeat has a one-second resolution. It depends on the clocks on the master and slave servers being closely synchronized via NTP. --update checks happen on the edge of the second, and --monitor checks happen halfway between seconds. As long as the servers' clocks aren't skewed much and the replication events are propagating in less than half a second, mk-heartbeat will report zero seconds of delay.

(My servers' clocks are using NTP and are in sync.)
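To make the timing in that excerpt concrete, here is a small Python model of it. This is only an illustration of the arithmetic the docs describe, not mk-heartbeat's actual code; the function name `reported_delay` and its parameters are made up for this sketch, and it assumes perfectly synchronized clocks and a constant propagation time.

```python
import math

def reported_delay(check_time: float, propagation: float) -> int:
    """Whole-second delay a monitor would report in this simplified model.

    Assumptions (hypothetical, for illustration only):
      - the master writes a heartbeat at every whole second 0, 1, 2, ...
      - a heartbeat written at second t becomes visible on the slave
        at t + propagation
      - both clocks agree exactly
    """
    # Newest heartbeat that has already arrived on the slave at check_time.
    latest_visible = math.floor(check_time - propagation)
    # The monitor compares the current whole second with that heartbeat.
    return math.floor(check_time) - latest_visible

# Monitor checks happen halfway between seconds, e.g. at t = 10.5.
print(reported_delay(10.5, 0.2))    # 0
print(reported_delay(10.5, 0.7))    # 1
print(reported_delay(10.5, 300.0))  # 300
```

In this model, anything that propagates in under half a second reports 0, and a slave that is genuinely hundreds of seconds behind should report hundreds of seconds, not 0.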

But Seconds_Behind_Master reports hundreds of seconds of lag, so I would not expect the events to be propagating in less than half a second. I'm still unsure whether mk-heartbeat is giving me an accurate picture.

Would love to hear from anyone that has deployed this tool for monitoring their MySQL replication.

Thanks in advance.

Cheers

Best Answer

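You're close, but your problem is that you have both instances pointing at the master. The monitor then reads the heartbeat row directly from the server where it was just written, so it always reports 0s no matter how far behind the slave is. What you want is one instance updating the master every second, and the second instance reading the slave every second.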
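A toy simulation shows why the target of the monitor matters. This is a minimal sketch, not how mk-heartbeat is implemented: the `Server` class and the `monitor` and `simulate` functions are hypothetical names, and the slave is assumed to apply each heartbeat a fixed number of seconds late.

```python
from dataclasses import dataclass

@dataclass
class Server:
    # Value of the single heartbeat row as seen on this server.
    heartbeat_ts: int = 0

def monitor(server: Server, now: int) -> int:
    """Delay as seen by a monitor reading the heartbeat row from `server`."""
    return now - server.heartbeat_ts

def simulate(lag_seconds: int, duration: int) -> tuple[int, int]:
    """Run `duration` one-second ticks; return (delay via master, delay via slave)."""
    master, slave = Server(), Server()
    for now in range(1, duration + 1):
        # The updating instance writes the current second to the master.
        master.heartbeat_ts = now
        # The slave only applies the heartbeat written lag_seconds ago.
        if now - lag_seconds >= 1:
            slave.heartbeat_ts = now - lag_seconds
    return monitor(master, duration), monitor(slave, duration)

via_master, via_slave = simulate(lag_seconds=300, duration=600)
print(via_master)  # 0
print(via_slave)   # 300
```

Reading from the master reports 0 regardless of the lag; only the copy of the row that has travelled through replication carries the delay.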

Also note that it does not need to run on the actual database servers at all, since it uses a regular MySQL client connection. I run mine from my cacti server. Here's my sanitized /etc/rc.local as an example:

/usr/bin/mk-heartbeat -D maatkit -u maatkit -paardvark --update -h sql-master.fake.net --daemonize
/usr/bin/mk-heartbeat -D maatkit -u maatkit -paardvark -h sql-slave.fake.net --monitor --file /tmp/sql-slave.heartbeat --daemonize