I have Nagios on my server, and it is alerting me with:
CHECK_NRPE: Socket timeout after 30 seconds.
But my service is running:
● nagios-nrpe-server.service - Nagios Remote Plugin Executor
Loaded: loaded (/lib/systemd/system/nagios-nrpe-server.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2020-04-18 00:31:56 CEST; 6min ago
Docs: http://www.nagios.org/documentation
Process: 4841 ExecStopPost=/bin/rm -f /var/run/nagios/nrpe.pid (code=exited, status=0/SUCCESS)
Main PID: 4845 (nrpe)
Tasks: 5 (limit: 4915)
CGroup: /system.slice/nagios-nrpe-server.service
├─4845 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -f
├─6346 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -f
├─6347 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -f
├─6348 sh -c /usr/lib/nagios/plugins/check_disk -e -w 5% -W 3% -c 2% -K 2% -X tmpfs
└─6349 /usr/lib/nagios/plugins/check_disk -e -w 5% -W 3% -c 2% -K 2% -X tmpfs
I tried killing it, restarting it, and restarting the Nagios server that processes all the alerts, but nothing worked. The problem started out of nowhere, and I don't know where the catch is, since all the other servers monitored by Nagios are working.
Best Answer
The timeout is coming from the client plugin check_nrpe, terminating the connection after 30 seconds. I don't think there is a proper timeout in the NRPE server, but most plugins should implement timeout behavior.
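One way to find out whether a plugin itself is hanging is to run its command line by hand on the monitored host under a time limit. The following is only a minimal sketch: probe_plugin is a hypothetical helper name, and the 30-second limit simply mirrors the check_nrpe timeout from the alert. The check_disk command line is copied from the status output in the question.

```python
import subprocess
import sys
import time


def probe_plugin(argv, timeout_s):
    """Run a plugin command line and report whether it finished in time.

    Hypothetical helper, not part of Nagios or NRPE.
    Returns (status, exit_code, elapsed_seconds); status is "exited" or
    "timeout", and exit_code is None when the time limit was hit.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "timeout", None, time.monotonic() - start
    return "exited", result.returncode, time.monotonic() - start


if __name__ == "__main__":
    # Command line taken from the systemctl status output in the question.
    check_disk = [
        "/usr/lib/nagios/plugins/check_disk",
        "-e", "-w", "5%", "-W", "3%", "-c", "2%", "-K", "2%", "-X", "tmpfs",
    ]
    status, code, elapsed = probe_plugin(check_disk, timeout_s=30)
    print(f"{status} (exit code {code}) after {elapsed:.1f}s")
    sys.exit(0 if status == "exited" else 1)
```

If this reports "timeout", the plugin never returns within the window check_nrpe allows, so the socket timeout is a symptom rather than the cause. If it reports "exited", the exit code follows the usual plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the delay is probably somewhere else.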