So I'm trying to set up a Nagios check_load service on a Red Hat server. I followed the documentation from Red Hat to install the NRPE client: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/3/html/Installation_and_Configuration_Guide/Installing_and_Configuring_NRPE.html
On the Nagios GUI, I get this:
CHECK_NRPE: Error receiving data from daemon.
But when I log in to the Nagios server to debug, it looks like I can run this fine from the terminal:
root@portalmon:/Nagios# /usr/local/nagios/libexec/check_nrpe -H 10.0.XX.XXX -c check_load -t 30
OK - load average: 0.15, 0.10, 0.04|load1=0.150;15.000;30.000;0; load5=0.100;10.000;25.000;0; load15=0.040;5.000;20.000;0;
So I don't think it's a permissions problem. I think it's some kind of path issue, but I can't figure out why this is not working. Can anybody help?
Here is my service description:
define service{
use dev-service
host_name [DEV] Luminis Admin DEV Portal
service_description CPU-Load
check_command check_nrpe!"check_load"
}
I'm wondering if I'm even calling the same service when I run the terminal command above.
EDIT: I figured out the issue and am now working on the resolution. I put Nagios into debug mode, and /usr/local/nagios/var/nagios.debug showed that the command actually being run was
/usr/local/nagios/libexec/check_nrpe -n -t 60 -H <hostname> -c check_load
instead of
/usr/local/nagios/libexec/check_nrpe -H 10.0.XX.XXX -c check_load -t 30
So something is appending the "-n" and it's causing issues. The second command returns what I want, but the first returns
CHECK_NRPE: Error receiving data from daemon.
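For anyone who wants to reproduce the debugging step: debug logging is controlled by a handful of directives in nagios.cfg. This is a minimal sketch, assuming the source-install paths used in this post. The level, verbosity, and size values are illustrative choices, not necessarily the ones used here.

```cfg
# nagios.cfg (debug-related directives only)

# -1 logs everything; 0 turns debugging off again.
# Illustrative value, narrow it down once you know what you are looking for.
debug_level=-1

# 0 = brief, 1 = more detailed, 2 = very detailed
debug_verbosity=1

# Path taken from this post; adjust to your install
debug_file=/usr/local/nagios/var/nagios.debug

# Size cap in bytes so the debug log does not fill the disk
max_debug_file_size=1000000
```

Nagios has to be restarted for the change to take effect, and debug_level is best set back to 0 afterwards because the file grows quickly.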
EDIT 2: Figured it out but I don't have enough reputation to submit it as a valid answer, so below is what I typed.
I was able to resolve my issue. This is what I did.
- In nagios.cfg, I turned on debug mode and noted the location of the debug file.
- Replicated the issue while tailing the debug log. The command Nagios was executing was different from what I expected: it included the no-SSL flag (-n), which would explain the error if the NRPE daemon on the client expects SSL.
- In commands.cfg, I searched for check_nrpe. The check_nrpe command is defined there with -n. There is another command, check_secure_nrpe, which runs check_nrpe without the -n flag.
- Edited my service definition to use check_secure_nrpe instead of check_nrpe.
- Restarted Nagios.
The service now works as expected.
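The relevant configuration can be sketched roughly as follows. The two command definitions are assumptions: the check_nrpe line is reconstructed from the debug output above, and the check_secure_nrpe line is a guess at what "the same command without -n" looks like, since the actual commands.cfg contents are not shown here. Only the service definition is taken from the question.

```cfg
# commands.cfg (assumed definitions, not copied from the real file)

# Plain NRPE: -n disables SSL. This matches the command seen in nagios.debug.
define command{
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -n -t 60 -H $HOSTADDRESS$ -c $ARG1$
}

# SSL-enabled NRPE: same plugin without -n (timeout value assumed)
define command{
    command_name    check_secure_nrpe
    command_line    $USER1$/check_nrpe -t 60 -H $HOSTADDRESS$ -c $ARG1$
}

# Service definition from the question, switched to the SSL-enabled command
define service{
    use                     dev-service
    host_name               [DEV] Luminis Admin DEV Portal
    service_description     CPU-Load
    check_command           check_secure_nrpe!check_load
}
```

Running the exact command line from the debug log by hand, with and without -n, is a quick way to confirm which variant the remote NRPE daemon accepts before editing any config.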