Nagios: executing plugin with nrpe produces a different result from running locally

nagios

I'm trying to add an NRPE check to monitor the Puppet agent, but I can't get the plugin to return the correct result when it is executed through NRPE.

I'm using this plugin:

When I execute the script locally on the Nagios client the result is correct, but executing it through NRPE returns CRITICAL. I assume I've missed something in my config. Other NRPE plugins are executing successfully.

I restarted nrpe (and checked, while it was down, that no nrpe process was running).
Permissions, owner and group for the check_puppet file are the same as for my other checks.

[root@puppet-master]# /usr/lib64/nagios/plugins/check_nrpe -H server.addr -c check_puppet
CRITICAL: Puppet daemon not running or something wrong with process

[root@git nrpe.d]# /usr/lib64/nagios/plugins/check_puppet
OK: Puppet agent "3.4.3" running catalogversion 1398787991, and executed at Tue 29 Apr 2014 04:13:25 PM UTC for last time

nagios_commands.cfg:

define command {
    command_line                   $USER1$/check_nrpe -H $HOSTADDRESS$ -t 15 -c check_puppet
    command_name                   check_nrpe_puppet
}

nagios_service.cfg:

define service {
    ## --PUPPET_NAME-- (called '_naginator_name' in the manifest)                    check_puppet
    check_command                  check_nrpe_puppet
    host_name                      server.addr
    service_description            check_puppet
    use                            generic-service
}

/etc/nrpe.d/nrpe-check_puppet

# Configuration for check_puppet (from the generic template)
command[check_puppet]=/usr/lib64/nagios/plugins/check_puppet

For reference here is a working config of mine

define command {
    command_line                   $USER1$/check_nrpe -H $HOSTADDRESS$ -t 15 -c check_ram
    command_name                   check_nrpe_ram

}

define service {
    ## --PUPPET_NAME-- (called '_naginator_name' in the manifest)                check_ram_server.addr
    check_command                  check_nrpe_ram
    host_name                      server.addr
    service_description            ram
    use                            generic-service
}

/etc/nrpe.d/nrpe-check_ram

# Configuration for check_ram (from the generic template)
command[check_ram]=/usr/lib64/nagios/plugins/check_ram -w 10% -c 5%

Update:

I had added the nagios user to sudoers as instructed in the readme, but had not tested running the check as that user. This failed because the path allowed in sudoers was incorrect (my plugins are in lib64). Also, NRPE runs as the nrpe user on my systems.

I corrected sudoers to grant NOPASSWD sudo for the correct folder to the nrpe user, and gave nrpe a login shell so I can test as that user (it was set to nologin).

bash-4.1$ whoami
nrpe
bash-4.1$ /usr/lib64/nagios/plugins/check_puppet 
UNKNOWN: last_run_summary.yaml not found, not readable or incomplete
bash-4.1$ exit
exit
[root@ip-10-185-165-196 plugins]# ps auxww | grep nrpe 
nrpe     16353  0.0  0.0  41320  1364 ?        Ss   23:33   0:00 /usr/sbin/nrpe -c   /etc/nagios/nrpe.cfg -d
root     16814  0.0  0.0 103236   856 pts/0    S+   23:53   0:00 grep nrpe
[root@ip-10-185-165-196 plugins]# 

On the nagios server:

[root@puppet-master plugins]# ./check_nrpe -H <myserver> -t 15 -c check_puppet
CRITICAL: Puppet daemon not running or something wrong with process

I'm running a minimal install of CentOS 6.5

I disabled requiretty with:

Defaults:nrpe    !requiretty

UPDATE 3:

It looks like SELinux is to blame. Running setenforce 0 solved the issue:
$ setenforce 0
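
Since setenforce 0 switches SELinux to permissive mode for the whole system, it can help to see which denials were actually hitting the check before deciding what to do next. The following is only a rough sketch: the function name nrpe_denials is made up for illustration, and the process names in the pattern (nrpe, sudo, check_puppet...) are assumptions that should be adjusted to whatever your audit log really shows.

```shell
# Filter SELinux AVC denials down to the processes involved in the NRPE check.
# ASSUMPTION: the comm= names below are guesses; adjust them to match your log.
nrpe_denials() {
    grep -E 'avc: +denied' | grep -E 'comm="(nrpe|sudo|check_puppet[^"]*)"'
}

# Typical use on the Nagios client, as root.
# Skipped silently when the audit log is absent or not readable.
if [ -r /var/log/audit/audit.log ]; then
    nrpe_denials < /var/log/audit/audit.log | tail -n 20
fi
```

If denials for the check show up there, one common route is to feed them to audit2allow -M to build a local policy module, load it with semodule -i, and then switch back with setenforce 1 rather than leaving the host permissive.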

Best Answer

As yoonix points out, the plugin itself is pretty clear, on lines 36-38: it's just a wrapper around a core plugin, and that core plugin needs to run as root. That's why it worked fine when you ran it as root. The wrapper will escalate privilege via sudo; it's set up to execute the sudo itself, but you will need to provide appropriate sudo privileges.

Assuming your nrpe runs as the user nagios, the plugin says you'll need the following line in your sudoers file:

nagios ALL=NOPASSWD:/usr/bin/puppet,/usr/lib/nagios/plugins/check_puppet_agent,/bin/kill

(I'm not sure why it needs /bin/kill, but it says it does, so you'd probably better grant it or risk the plugin failing in interesting and under-documented ways.)

You don't tell us what your OS (and, if Linux, distro) is; if it were CentOS and you were using the RPMforge nrpe, it would run as user nagios. You will need to find out what user your nrpe runs as, and substitute that user for the leading nagios in the sudoers line above.
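
To make the substitution concrete, here is a small sketch that looks up the user owning the running nrpe process and prints the matching sudoers line. The helper names nrpe_user and sudoers_line are hypothetical, the lib64 plugin directory is taken from the paths in the question, and the fallback to nagios is just the assumption made above; treat the output as a starting point, not as the plugin's documented requirement.

```shell
# Read "user command" pairs on stdin and print the user of the first nrpe process.
nrpe_user() {
    awk '$2 == "nrpe" { print $1; exit }'
}

# Print a sudoers line for user $1 with the plugin directory $2.
sudoers_line() {
    printf '%s ALL=NOPASSWD:/usr/bin/puppet,%s/check_puppet_agent,/bin/kill\n' "$1" "$2"
}

# Separate -o options with empty headers suppress the ps header line.
user=$(ps -e -o user= -o comm= 2>/dev/null | nrpe_user)

# ASSUMPTION: fall back to "nagios" when no nrpe process is found;
# the plugin directory is the lib64 path used in the question.
sudoers_line "${user:-nagios}" /usr/lib64/nagios/plugins
```

Add the resulting line with visudo rather than editing the file directly, then run the check once as that user (for example with sudo -u followed by the user name and the plugin path) before testing it through check_nrpe.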