I have Nagios monitoring an Oracle installation on a different server. Sometimes one particular test (check oracle tablespace can allocate next extent) will fail with "CRITICAL – Plugin timed out after 10 seconds".
The first thing I want to do is figure out how long it actually takes to complete. If it's 11 seconds, maybe I don't care, and I just want to set the timeout a little higher.
I tried setting the timeout for check_by_ssh, which is used to run the actual command, like so:
define command {
    command_name check_ssh_oracle_health
    command_line $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/export/home/nagios/libexec/check_oracle_health --mode=$ARG1$ --environment ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1 --connect=nagios/<pwd>@<SID> --timeout=15"
}
This had no effect; the test still errors out, and still says it happened after 10 seconds (and yes, I did restart Nagios :).
The only other place I can see to set a timeout is in nagios.cfg; that seems too high-level (it would affect all tests), and besides, none of the timeouts there are currently set to 10 seconds, so I doubt this is the right place.
Any pointers?
Best Answer
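I think it is check_by_ssh that's timing out (10 seconds is the default timeout for this plugin), not the check_oracle_health running inside it. The --timeout=15 in your command sits inside the quoted -C string, so it only applies to check_oracle_health. Try setting the timeout of check_by_ssh itself to a higher value with its -t (--timeout) option and see if it still happens.
Hope this helps!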
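As a rough sketch of that change, the command definition from the question could pass -t to check_by_ssh, outside the quoted -C string. The value 20 is only an illustrative assumption, chosen to be larger than the inner --timeout=15 so that check_oracle_health gets the chance to report its own timeout first; everything else is copied from the question.

```
define command {
    command_name check_ssh_oracle_health
    # -t 20 applies to check_by_ssh itself (assumed value, adjust as needed);
    # --timeout=15 inside the quotes applies only to check_oracle_health.
    command_line $USER1$/check_by_ssh -H $HOSTADDRESS$ -t 20 -C "/export/home/nagios/libexec/check_oracle_health --mode=$ARG1$ --environment ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1 --connect=nagios/<pwd>@<SID> --timeout=15"
}
```

After restarting Nagios, the timeout message should show which layer is responsible: if it still says 10 seconds, the new -t value is probably not being picked up.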