Ubuntu – Unable to add new munin node to existing munin master

Tags: graph, munin, ubuntu, ubuntu-10.04

I'm trying to add a node to an existing munin master (which I didn't set up, but which seems to be working fine as it shows graphs for 8 existing nodes) and I'm having some trouble. Here are the steps I followed:

Master

Added the node to /etc/munin/munin.conf

[server.example.org]
   address private.server.example.org

The master's HTML directory, which matches the Apache configuration, is:

htmldir /opt/munin

That directory contains the following files and folders:

ls -lh /opt/munin/
drwxr-xr-x 20 munin munin 4.0K 2011-11-07 16:15 example.org <= FOLDER NAMED AFTER OUR DOMAIN
-rw-r--r--  1 munin munin 2.5K 2010-08-03 14:11 definitions.html
-rw-r--r--  1 munin munin 3.0K 2010-08-03 14:11 favicon.ico
-rw-r--r--  1 munin munin  15K 2011-11-07 16:21 index.html  <= MAIN MUNIN PAGE
-rw-r--r--  1 munin munin 1.8K 2010-08-03 14:11 logo-h.png
-rw-r--r--  1 munin munin  473 2010-08-03 14:11 logo.png
-rw-r--r--  1 munin munin 5.6K 2010-11-03 14:07 style.css

The footer of index.html indicates that this file is generated dynamically by munin so I know I don't have to touch this file.

This page was generated by <a href='http://munin-monitoring.org/'>Munin</a> version 1.4.4 at 2011-11-07 16:21:30+0000 (UTC)

The domain directory contains folders for all the nodes. I ended up creating one for the new node, hoping it would help, but it made no difference:

mkdir /opt/munin/example.org/server.example.org
chown munin:munin -R /opt/munin/example.org/server.example.org

I killed munin-cron and restarted it, but that made no difference either.

$ sudo su munin munin-cron start
$ sudo ps aux | grep munin-cron
munin    26566  0.0  0.2   4092   584 ?        Ss   16:35   0:00 /bin/sh -c if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi
munin    26567  0.0  0.2   4092   576 ?        S    16:35   0:00 /bin/sh /usr/bin/munin-cron
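Rather than waiting for the next cron cycle, a single update pass for just the new host can be run in the foreground. The following is a minimal sketch, assuming the Debian/Ubuntu packaging path /usr/share/munin/munin-update and a working sudo. The function name and the MUNIN_DRY_RUN switch are hypothetical helpers for illustration, not part of munin.

```shell
#!/bin/sh
# Run one munin-update pass for a single host, in the foreground, as the munin user.
# Assumption: munin-update lives at /usr/share/munin/munin-update (Debian/Ubuntu packaging).
# MUNIN_DRY_RUN=1 is a hypothetical switch that prints the command instead of running it.
munin_update_host() {
    host="$1"
    # Reuse the positional parameters to hold the full command line.
    set -- sudo -u munin /usr/share/munin/munin-update --debug --nofork --host "$host"
    if [ "${MUNIN_DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Calling munin_update_host server.example.org on the master should print the update's debug output to the terminal, which makes it easier to see whether the master actually reaches the node.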

Munin node

Installed munin-node package

apt-get install munin-node

Modified the /etc/munin/munin-node.conf file to allow access from the munin master

host *
allow ^A\.B\.C\.D$  # master IP address
port 4949
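The allow value is a regular expression matched against the connecting address, so the dots need escaping and both ends need anchoring, as in the line above. A small sketch that builds such a line from a plain IPv4 address follows. The function name is hypothetical, and 192.0.2.10 in the usage note is a documentation address, not the real master.

```shell
#!/bin/sh
# Turn a plain IPv4 address into an anchored munin-node "allow" line.
# ip_to_allow_line is a hypothetical helper, not a munin tool.
ip_to_allow_line() {
    # Escape each dot so it matches a literal dot rather than any character.
    escaped=$(printf '%s' "$1" | sed 's/\./\\./g')
    # Anchor with ^ and $ so only the exact address matches.
    printf 'allow ^%s$\n' "$escaped"
}
```

For example, ip_to_allow_line 192.0.2.10 prints allow ^192\.0\.2\.10$, which can be pasted into munin-node.conf before restarting the node.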

Restarted munin node

service munin-node start

If I run a tcpdump on the new node I can see some data being exchanged with the master so I believe at this point the issue is with configuring the master.

Any idea as to what I'm missing or how I can troubleshoot this further?

Additional troubleshooting

As advised I checked the logs

$ grep server.example.org /var/log/munin/munin-update.log

2011/11/08 08:40:03 [WARNING] Config node server.example.org listed no services for server.example.org.  Please see http://munin-monitoring.org/wiki/FAQ_no_graphs for further information.
2011/11/08 09:10:02 [INFO] Reaping Munin::Master::UpdateWorker<example.org;server.example.org>.  Exit value/signal: 0/0

The warning brought me to this page: http://munin-monitoring.org/wiki/FAQ_no_graphs. I followed the advice given there step by step. Although the symlinks seemed to be properly created, I did run the command munin-node-configure --shell | sh -x, which I believe fixed the issue. The aforementioned page also recommended setting host_name, which I did (although I don't believe it helped, since the other working nodes don't have it configured).
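Since the "listed no services" warning points at the node's plugin list, one quick sanity check is to look for dangling symlinks in the node's plugin directory. This is a minimal sketch, assuming the usual /etc/munin/plugins location. The function name is hypothetical.

```shell
#!/bin/sh
# Print every symlink in the plugin directory whose target no longer exists.
# Assumption: plugins are enabled as symlinks under /etc/munin/plugins.
list_broken_plugin_links() {
    dir="${1:-/etc/munin/plugins}"
    for link in "$dir"/*; do
        # -L: the entry is a symlink; ! -e: following it leads nowhere.
        if [ -L "$link" ] && [ ! -e "$link" ]; then
            echo "$link"
        fi
    done
}
```

Running list_broken_plugin_links on the node with no argument checks the default directory. Empty output means every link resolves, though it does not prove that the plugins themselves run correctly.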

The telnet troubleshooting was successful by the time I got to it

$ telnet private.server.example.org 4949
Trying A.B.C.D...
Connected to private.server.example.org.
Escape character is '^]'.
# munin node at server.example.org

> nodes
server.example.org
.

> list server.example.org
cpu df df_inode entropy forks fw_conntrack fw_forwarded_local fw_packets if_err_eth0 if_err_eth1 if_eth0 if_eth1 interrupts iostat iostat_ios ip_A.B.C.D irqstats load memory open_files open_inodes postfix_mailqueue postfix_mailvolume proc_pri processes swap threads uptime users vmstat

> fetch df
_dev_sda1.value 23.1295909196156
_dev.value 1.2890625
_dev_shm.value 0
_var_run.value 0.00782368542525642
_var_lock.value 0
_lib_init_rw.value 0
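The same session can be scripted so it is easy to repeat from the master after each change. A rough sketch, assuming nc (netcat) is installed and the node listens on port 4949 as configured above. Both function names are hypothetical.

```shell
#!/bin/sh
# Build the command stream for a munin-node session:
# list the plugins, fetch each plugin named on the command line, then quit.
munin_session_commands() {
    echo "list"
    for plugin in "$@"; do
        echo "fetch $plugin"
    done
    echo "quit"
}

# Send that stream to a node and print whatever it answers.
# Assumption: nc is available and the node uses port 4949.
munin_query() {
    node="$1"
    shift
    munin_session_commands "$@" | nc "$node" 4949
}
```

For example, munin_query private.server.example.org df should print the banner, the plugin list, and the df values. Depending on the netcat variant, the connection may close before all output arrives, in which case the interactive telnet session remains the more reliable test.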

Best Answer

I can't see anything obviously wrong with your setup. I suggest two things:

  • Read the logs on the munin master. /var/log/munin/munin-update.log is the place to start. If it has entries confirming that an update succeeded, and the RRD files exist in /var/lib/munin/, continue to munin-graph.log and munin-html.log.

  • Verify that the master is able to connect to the address of the munin node. Please test with netcat or similar: nc private.server.example.org 4949. The expected output is: # munin node at hostname. Possible errors are packets being dropped by a firewall (in which case nc will hang at connect(), visible if you use strace), or a failure to resolve the name (in which case netcat outputs nc: getaddrinfo: Name or service not known).
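For the log check, it helps to narrow the update log down to the non-routine lines for one host. A small sketch, assuming the log format shown in the question (a bracketed level such as [INFO] or [WARNING] on each line). The function name is hypothetical.

```shell
#!/bin/sh
# Show log lines that mention a host, dropping the routine [INFO] entries.
# Assumption: the default log path is /var/log/munin/munin-update.log.
munin_log_problems() {
    host="$1"
    log="${2:-/var/log/munin/munin-update.log}"
    # -F treats the hostname as a fixed string, so its dots are not wildcards.
    grep -F -- "$host" "$log" | grep -v -F '[INFO]'
}
```

Running munin_log_problems server.example.org on the master prints whatever is left, such as [WARNING] lines. No output means nothing other than [INFO] was logged for that host.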

If you can't find anything after trying the above, please paste a complete munin.conf from the master (anonymize numeric IP addresses with numbers, and hostnames with some bogus text, if you have to).

One fairly common error: the cron job may have been run as root at some point, leaving some files owned by root. The munin user cannot update those files, and it usually needs write access to everything in /var/lib/munin and the HTML directory.
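A quick way to spot such files is to list everything under those directories that is not owned by the munin user. A minimal sketch follows. The function name is hypothetical, and the directories in the usage note are the ones from this question.

```shell
#!/bin/sh
# List every file or directory under the given paths that is NOT owned by the given user.
files_not_owned_by() {
    owner="$1"
    shift
    # "! -user" inverts the ownership test; -print lists each offender.
    find "$@" ! -user "$owner" -print
}
```

For example, files_not_owned_by munin /var/lib/munin /opt/munin on the master should print nothing on a healthy setup. Anything it lists can be handed back with chown munin:munin, as was done for the new node directory in the question.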
