Munin-run thesql_ plugin returns data but munin server gets ‘Bad exit’ error

mariadbmunin

I've a Munin server with several Munin nodes, all configured using Ansible so configuration it's pretty the same. Systems are Debian Jessie.

The mysql_* graphs from two of the servers are present but empty while the other three servers have full graphs.

Running munin-run --debug mysql_commands in one of the faulty servers returns correct output:

# munin-run --debug mysql_commands
# Processing plugin configuration from /etc/munin/plugin-conf.d/ansible.conf
# Processing plugin configuration from /etc/munin/plugin-conf.d/munin-node
# Setting /rgid/ruid/ to /119/0/
# Setting /egid/euid/ to /119 119/0/
# Setting up environment
# Environment mysqluser = debian-sys-maint
# Environment mysqlconnection = DBI:mysql:mysql;mysql_read_default_file=/etc/mysql/debian.cnf
# Environment mysqlopts = --defaults-file=/etc/mysql/debian.cnf
# About to run '/etc/munin/plugins/mysql_commands'
Com_delete.value 4546376
Com_insert.value 2804559
Com_insert_select.value 341479
Com_load.value 0
Com_replace.value 0
Com_replace_select.value 0
Com_select.value 236967004
Com_update.value 7069348
Com_update_multi.value 0

So it seems Munin node is working ok.

But when I execute munin-cron in the munin server the plugin returnsbad exit when fetched:

# sudo -u munin munin-cron --host server.com --debug
...
2018/01/09 08:35:01 [DEBUG] for my mysql_commands (irqstats mysql_innodb_tnx diskstats apache_accesses smart_sda df ntp_offset vmstat fw_packets mysql_select_types netstat users mysql_qcache mysql_table_locks swap mysql_slow mysql_myisam_indexes if_err_eth0 apt hddtemp_smartctl ntp_kernel_err ntp_kernel_pll_freq mysql_innodb_rows mysql_innodb_bpool forks ntp_kernel_pll_off cpu load postfix_mailvolume postfix_mailqueue mysql_qcache_mem open_inodes processes http_loadtime open_files uptime mysql_commands mysql_innodb_semaphores apt_all mysql_innodb_io smart_sdb memory mysql_connections mysql_files_tables mysql_network_traffic df_inode mysql_sorts if_eth0 entropy proc_pri mysql_innodb_bpool_act apache_processes threads mysql_innodb_log mysql_innodb_insert_buf mysql_tmp_tables mysql_innodb_io_pend interrupts apache_volume)
2018/01/09 08:35:01 [DEBUG] Fetching service configuration for 'mysql_commands'
2018/01/09 08:35:01 [DEBUG] Writing to socket: "config mysql_commands
".
2018/01/09 08:35:02 [DEBUG] Reading from socket: "graph_vlabel Commands per ${graph_period}\ngraph_total Questions\ngraph_args --base 1000\ngraph_title Command Counters\ngraph_category mysql2\nCom_delete.type DERIVE\nCom_delete.label Delete\nCom_delete.draw STACK\nCom_delete.min 0\nCom_insert.draw STACK\nCom_insert.min 0\nCom_insert.type DERIVE\nCom_insert.label Insert\nCom_insert_select.type DERIVE\nCom_insert_select.label Insert select\nCom_insert_select.draw STACK\nCom_insert_select.min 0\nCom_load.label Load Data\nCom_load.type DERIVE\nCom_load.min 0\nCom_load.draw STACK\nCom_replace.type DERIVE\nCom_replace.label Replace\nCom_replace.draw STACK\nCom_replace.min 0\nCom_replace_select.draw STACK\nCom_replace_select.min 0\nCom_replace_select.type DERIVE\nCom_replace_select.label Replace select\nCom_select.type DERIVE\nCom_select.label Select\nCom_select.draw STACK\nCom_select.min 0\nCom_update.type DERIVE\nCom_update.label Update\nCom_update.draw STACK\nCom_update.min 0\nCom_update_multi.label Update multi\nCom_update_multi.type DERIVE\nCom_update_multi.min 0\nCom_update_multi.draw STACK".
2018/01/09 08:35:02 [DEBUG] config: 0.151266 sec for 'mysql_commands' on server.com/1.2.3.4/4949
2018/01/09 08:35:02 [DEBUG] Now parsing config output from plugin mysql_commands on server.com
2018/01/09 08:35:02 [DEBUG] update_rate 0 for mysql_commands on server.com/1.2.3.4:4949
2018/01/09 08:35:02 [DEBUG] No service data for mysql_commands, fetching it
2018/01/09 08:35:02 [DEBUG] Writing to socket: "fetch mysql_commands
".
2018/01/09 08:35:02 [DEBUG] data: 0.175058 sec for 'mysql_commands' on server.com/1.2.3.4/4949
2018/01/09 08:35:02 [DEBUG] Now parsing fetch output from plugin mysql_commands on server.com/1.2.3.4:4949
2018/01/09 08:35:02 [FETCH from mysql_commands] # Bad exit
...

This happens with all mysql_* plugins in two servers, but they work ok in other three servers. Other Munin plugin work ok in all servers, so general configuration seems ok.

Configuration for mysql_* plugins:

[mysql*]
user root
env.mysqlopts --defaults-file=/etc/mysql/debian.cnf
env.mysqluser debian-sys-maint
env.mysqlconnection DBI:mysql:mysql;mysql_read_default_file=/etc/mysql/debian.cnf

No new entries are added to Munin node log (/var/log/munin/munin-node.log) nor MySQL when sudo -u munin munin-cron is executed.

Munin node service reports normal operation:

# systemctl status munin-node
● munin-node.service - Munin Node
   Loaded: loaded (/lib/systemd/system/munin-node.service; enabled)
   Active: active (running) since mié 2018-01-17 14:20:10 CET; 19s ago
     Docs: man:munin-node(1)
           http://munin.readthedocs.org/en/stable-2.0/reference/munin-node.html
  Process: 4515 ExecStart=/usr/sbin/munin-node $DAEMON_ARGS (code=exited, status=0/SUCCESS)
 Main PID: 4541 (munin-node)
   CGroup: /system.slice/munin-node.service
           └─4541 /usr/bin/perl -wT /usr/sbin/munin-node

ene 17 14:20:10 example.com systemd[1]: Started Munin Node.
  • Munin version (nodes and servers): Debian package 2.0.25-1+deb8u3.

  • Database: mariadb-server Debian Package 10.0.26-0+deb8u1.

What can be wrong with those nodes? Or how can I debug the problem?

Best Answer

Shame on me.

Two munin-node were configured with the same IP, thus one server was never contacted and the other was asked for the stats twice. That double request made mysql_ plugin to behave badly.

This issue explains other minor problemas I'd detected but queued for resolving after this one (for example, strange disk usage).

Interesting the way to debug plugins from master:

$ nc example.com 4949
# munin node at example.com
fetch mysql_commands
Com_delete.value 5481602
Com_insert.value 3468782
Com_insert_select.value 437696
Com_load.value 0
Com_replace.value 0
Com_replace_select.value 0
Com_select.value 295884041
Com_update.value 8498783
Com_update_multi.value 6
.

See Debugging Munin Plugins.