Nagios CHECK_NRPE: Error receiving data from daemon. RHEL 6

Tags: nagios, rhel6

So I'm trying to set up a Nagios check_load service on a Red Hat server. I followed the documentation from Red Hat to install the NRPE client: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/3/html/Installation_and_Configuration_Guide/Installing_and_Configuring_NRPE.html

On the Nagios GUI, I get this:

CHECK_NRPE: Error receiving data from daemon.

But when I log in to the Nagios server to debug, it looks like I can run this fine from the terminal:

root@portalmon:/Nagios# /usr/local/nagios/libexec/check_nrpe -H 10.0.XX.XXX -c check_load -t 30
OK - load average: 0.15, 0.10, 0.04|load1=0.150;15.000;30.000;0; load5=0.100;10.000;25.000;0; load15=0.040;5.000;20.000;0;

So I don't think it's a permissions problem. I think it's some kind of path issue, but I can't figure out why this is not working. Can anybody help?

Here is my service description:

define service{
  use                 dev-service
  host_name           [DEV] Luminis Admin DEV Portal
  service_description CPU-Load
  check_command       check_nrpe!"check_load"
}

I'm wondering if I'm even calling the same service when I run the terminal command above.

EDIT: I figured out the issue and am now working on the resolution. I switched Nagios to debug mode and, looking at /usr/local/nagios/var/nagios.debug, saw that the command actually being run was

/usr/local/nagios/libexec/check_nrpe -n -t 60 -H <hostname> -c check_load

instead of

/usr/local/nagios/libexec/check_nrpe -H 10.0.XX.XXX -c check_load -t 30

So something is adding the "-n" flag, and that is what breaks the check. The second command returns what I want, but the first returns

CHECK_NRPE: Error receiving data from daemon.
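To see why the GUI and the terminal disagree, it can help to expand the command definition by hand. The sketch below substitutes the standard Nagios macros $USER1$, $HOSTADDRESS$ and $ARG1$ into a command_line template. The template is an assumption reconstructed from the debug log above, and the address 192.0.2.10 and the function name expand_command are placeholders for illustration, not values from my setup.

```shell
#!/bin/sh
# Sketch: expand a Nagios command_line template the way a service with
# check_command "check_nrpe!check_load" would have it expanded.
# The template passed in is an ASSUMPTION based on the debug log output.

expand_command() {
    template=$1   # command_line as written in commands.cfg (assumed)
    user1=$2      # value of $USER1$ (the plugin directory)
    host=$3       # value of $HOSTADDRESS$
    arg1=$4       # value of $ARG1$ (text after the "!" in check_command)
    printf '%s\n' "$template" |
        sed -e 's|\$USER1\$|'"$user1"'|g' \
            -e 's|\$HOSTADDRESS\$|'"$host"'|g' \
            -e 's|\$ARG1\$|'"$arg1"'|g'
}

# Hypothetical values; 192.0.2.10 is a documentation-only address.
expand_command '$USER1$/check_nrpe -n -t 60 -H $HOSTADDRESS$ -c $ARG1$' \
    /usr/local/nagios/libexec 192.0.2.10 check_load
```

The expanded line contains -n even though the service definition never mentions it, which is why running check_nrpe by hand without that flag is not the same test.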

EDIT 2: Figured it out but I don't have enough reputation to submit it as a valid answer, so below is what I typed.

I was able to resolve my issue. This is what I did.

  1. In nagios.cfg, I turned on debug mode and noted the location of the debug file.
  2. Reproduced the issue while tailing the debug log. The command that Nagios was executing was different from what I expected: it included the no-SSL flag (-n), which the command I had run successfully from the terminal did not.
  3. In commands.cfg, I searched for check_nrpe and found that its definition runs the plugin with -n. There was another command, check_secure_nrpe, which runs check_nrpe without the -n flag.
  4. Edited my service description to use check_secure_nrpe instead of check_nrpe.
  5. Restarted Nagios.

The service now works as expected.
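Step 3 can be sketched as a small script rather than a manual search. The function below (the name list_nossl_nrpe_commands is made up for this example) prints the command_name of every definition whose command_line calls check_nrpe with -n. The sample file is a hypothetical reconstruction of the two definitions described above, not a copy of my commands.cfg, and the script assumes command_name appears before command_line inside each definition.

```shell
#!/bin/sh
# Sketch: list command definitions that invoke check_nrpe with -n (no SSL).
# Assumes command_name comes before command_line in each "define command" block.

list_nossl_nrpe_commands() {
    awk '
        $1 == "command_name" { name = $2 }
        $1 == "command_line" && /check_nrpe/ {
            # Look for a standalone -n argument on the command line.
            for (i = 2; i <= NF; i++)
                if ($i == "-n") print name
        }
    ' "$1"
}

# Hypothetical sample modelled on the definitions described in step 3.
sample=$(mktemp)
cat > "$sample" <<'EOF'
define command{
  command_name check_nrpe
  command_line $USER1$/check_nrpe -n -t 60 -H $HOSTADDRESS$ -c $ARG1$
}
define command{
  command_name check_secure_nrpe
  command_line $USER1$/check_nrpe -t 60 -H $HOSTADDRESS$ -c $ARG1$
}
EOF

list_nossl_nrpe_commands "$sample"   # prints: check_nrpe
rm -f "$sample"
```

Pointing the function at the real commands.cfg should show which command names to avoid when the remote NRPE daemon only accepts SSL connections.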
