Nagios3 “Return code of 127 is out of bounds”

nagios ping

Without any changes to the nagios3 config or to the OS (Debian) filesystem, adding some extra devices (to the 12000+ already on it) suddenly produces:

[1508925621] Warning: Return code of 127 for check of service 'PING' on host 'SOME-HOST.CISCO' was out of bounds. Make sure the plugin you're trying to run actually exists.
[1508925621] SERVICE ALERT: SOME-HOST.CISCO;PING;CRITICAL;HARD;3;(Return code of 127 is out of bounds - plugin may be missing)

All the binaries are readable and executable; none of that has changed since setup.

It happens for ALL hosts of that type. Bear in mind this setup has worked non-stop for years. The only explanation I can think of is that some kind of OS limit is hit when running the checks, since the only thing that changes is the number of hosts.
I've had max_concurrent_checks=1500 for a long time. (It's a 16-core, 24GB RAM physical server.)

Apart from the concurrent checks setting, I run:

check_result_reaper_frequency=25
max_check_result_reaper_time=20

The hosts in the large group are configured like this:

define host{
        use                     generic-cisco
        host_name               SOME_HOST.CISCO
        alias                   SOME_HOST.CISCO
        address                 xxx.xxx.xxx.xxx
        check_command           check-host-alive
        hostgroups              cisco_devices
        }

define service{
        use                     generic-service
        host_name               SOME_HOST.CISCO
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        normal_check_interval   10
        retry_check_interval    5
        }

The only way to return it to a working state is to remove some of the most recently added hosts, stop and start Nagios, and hope it runs fine. Any suggestions?

Best Answer

What fixed it: although I had followed many other performance recommendations, I hadn't disabled enable_environment_macros. With it disabled, there's not a dent in performance. Apparently the problem was that the OS was struggling to make those environment variables available to the checks with that number of hosts. Found through here
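
For anyone hitting the same thing, the change is a one-line edit to the main Nagios config file, roughly as sketched below. The path /etc/nagios3/nagios.cfg is an assumption based on the usual location for the Debian nagios3 package; adjust it if your install differs. Setting the option to 0 stops Nagios from exporting its macros as environment variables to every check, so plugins or scripts that read NAGIOS_* variables from the environment would need to take those values as command arguments instead.

# /etc/nagios3/nagios.cfg (path assumed, see above)
# 0 = do not export macros as environment variables to checks,
#     notifications and event handlers
enable_environment_macros=0

Nagios needs a restart to pick up the change.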

I like a good nagios facepalm.