Nagios – Resolving CHECK_NRPE: Socket Timeout After 30 Seconds

debian nagios

I have Nagios on my server, and it is alerting me with:

CHECK_NRPE: Socket timeout after 30 seconds. 

But my service is running:

● nagios-nrpe-server.service - Nagios Remote Plugin Executor
   Loaded: loaded (/lib/systemd/system/nagios-nrpe-server.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2020-04-18 00:31:56 CEST; 6min ago
     Docs: http://www.nagios.org/documentation
  Process: 4841 ExecStopPost=/bin/rm -f /var/run/nagios/nrpe.pid (code=exited, status=0/SUCCESS)
 Main PID: 4845 (nrpe)
    Tasks: 5 (limit: 4915)
   CGroup: /system.slice/nagios-nrpe-server.service
           ├─4845 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -f
           ├─6346 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -f
           ├─6347 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -f
           ├─6348 sh -c /usr/lib/nagios/plugins/check_disk -e -w 5% -W 3% -c 2% -K 2% -X tmpfs
           └─6349 /usr/lib/nagios/plugins/check_disk -e -w 5% -W 3% -c 2% -K 2% -X tmpfs

I tried killing it, restarting it, and restarting the Nagios server that processes all the alerts, but nothing worked. It started out of nowhere, and I don't know where the catch is, since all my other servers in Nagios are working fine.

Best Answer

The timeout comes from the client plugin, check_nrpe, which terminates the connection after 30 seconds.

I don't think the NRPE server itself enforces a proper timeout, but most plugins should implement their own timeout behavior.
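One way to check whether the plugin itself is the slow part is to run it by hand on the monitored host under the same 30-second limit. The following is only a minimal sketch: the helper name run_plugin and the constant CHECK_DISK_CMD are hypothetical, and the command line is copied from the systemctl output in the question, so it is an assumption that it matches what nrpe.cfg actually defines.

```python
import subprocess

# Command line copied from the process list in the question (assumption:
# this is the check that NRPE is running when the timeout occurs).
CHECK_DISK_CMD = [
    "/usr/lib/nagios/plugins/check_disk",
    "-e", "-w", "5%", "-W", "3%", "-c", "2%", "-K", "2%", "-X", "tmpfs",
]


def run_plugin(argv, timeout_s=30):
    """Run a plugin command with a wall-clock limit.

    Returns (exit_code, stdout) if the command finishes in time,
    or ("timeout", "") if it is still running after timeout_s seconds.
    """
    try:
        proc = subprocess.run(
            argv,
            capture_output=True,  # collect stdout/stderr instead of printing
            text=True,            # decode output as str
            timeout=timeout_s,    # child is killed when the limit is exceeded
        )
    except subprocess.TimeoutExpired:
        return ("timeout", "")
    return (proc.returncode, proc.stdout.strip())
```

Calling run_plugin(CHECK_DISK_CMD) on the affected host would return ("timeout", "") if check_disk does not finish within 30 seconds, which would be consistent with check_nrpe giving up on the Nagios side. If the plugin is merely slow rather than hung, check_nrpe also accepts a -t option to raise its own timeout.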