Nagios escalations, prematurely critical escalation after warning

monitoring nagios

In Nagios 3, I would like a service to be escalated after it has been CRITICAL for XX minutes. This works well for services that go straight from OK to CRITICAL. However, if the service has been in WARNING for more than XX minutes (say, disk usage that is slowly climbing) and then goes CRITICAL, the very first CRITICAL result triggers an escalation. Nagios is counting the warnings toward the escalation count, whereas we want it to escalate after 3 CRITICAL alarms, not after 3 warnings and one critical.

Is there a way to keep the warnings from counting toward the service escalation?

Here's an example of another user with the same problem and very similar configs: http://copilotco.com/mail-archives/nagios-users.2009/msg00310.html

Best Answer

As I don't use escalations in my Nagios implementation, I am speaking blind here, going only by the documentation for the service escalation definition.

You may want to consider the first_notification directive:

first_notification: This directive is a number that identifies the first notification for which this escalation is effective. For instance, if you set this value to 3, this escalation will only be used if the service is in a non-OK state long enough for a third notification to go out.

Also consider the escalation_options directive:

escalation_options: This directive is used to define the criteria that determine when this service escalation is used. The escalation is used only if the service is in one of the states specified in this directive. If this directive is not specified in a service escalation, the escalation is considered to be valid during all service states. Valid options are a combination of one or more of the following: r = escalate on an OK (recovery) state, w = escalate on a WARNING state, u = escalate on an UNKNOWN state, and c = escalate on a CRITICAL state. Example: If you specify w in this field, the escalation will only be used if the service is in a WARNING state.

So, to achieve what you want (escalation after 3 CRITICAL alarms), I would try a definition like this:

define serviceescalation{
    host_name              myhost
    service_description    Disk Usage
    first_notification     3
    last_notification      0
    notification_interval  10
    contact_groups         admins
    escalation_options     c,r
    }
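
To make the timing concrete, here is a minimal sketch of a service definition that the escalation above could attach to. The generic-service template, the check_disk command name, and the interval and option values are assumptions for illustration, not taken from the question, and the arithmetic assumes the default interval_length of 60 seconds.

define service{
    use                    generic-service   ; assumed template name
    host_name              myhost
    service_description    Disk Usage
    check_command          check_disk        ; hypothetical command name
    notification_interval  10                ; assumed: re-notify every 10 minutes while non-OK
    notification_options   w,c,r             ; assumed: warnings still send notifications
    }

# Notification 1 goes out when the service enters a hard non-OK state,
# notification 2 about 10 minutes later, notification 3 about 20 minutes later.
# With first_notification 3, the escalation can first apply roughly 20 minutes
# after the first notification, provided the service is CRITICAL at that moment.

While the escalation is in effect, its own notification_interval is used in place of the service's. Note that escalation_options c,r limits the states in which the escalation applies; whether it also changes how earlier WARNING notifications are counted is worth testing, since the question reports that they do count.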

Hope it will help...and work...!

Related Solutions

Nagios Notification Escalation

I think it would work if you specified the time period on the contact. Define the contact twice: once with notifications at night, and again with notifications only during the day.

define service{
    use                             generic-service
    host_name                       mercury
    service_description             ROB_TEST2
    check_command                   check_pop
    contacts                        rob_daytime, rob_nighttime
    }

define serviceescalation{
    host_name                       mercury
    service_description             ROB_TEST2
    first_notification              3
    last_notification               5
    notification_interval           30
    contacts                        rob_daytime, rob_nighttime
    }

define serviceescalation{
    host_name                       mercury
    service_description             ROB_TEST2
    first_notification              6
    last_notification               9999
    notification_interval           60
    contacts                        rob_daytime
    }

define contact{
    contact_name                    rob_daytime
    service_notification_period     daytime
    ...
    }

define contact{
    contact_name                    rob_nighttime
    service_notification_period     nighttime
    ...
    }

This should give you a good night's sleep even though an escalation has been running for a few days.

Note: I haven't tested this myself ;-)
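
The two contacts refer to time periods named daytime and nighttime, which are not shown above. A minimal sketch of what they might look like follows; the aliases and the 08:00 to 20:00 split are assumptions, so adjust the hours to your own schedule.

define timeperiod{
    timeperiod_name  daytime
    alias            Daytime hours (assumed 08:00-20:00)
    sunday           08:00-20:00
    monday           08:00-20:00
    tuesday          08:00-20:00
    wednesday        08:00-20:00
    thursday         08:00-20:00
    friday           08:00-20:00
    saturday         08:00-20:00
    }

define timeperiod{
    timeperiod_name  nighttime
    alias            Night hours (assumed complement of daytime)
    sunday           00:00-08:00,20:00-24:00
    monday           00:00-08:00,20:00-24:00
    tuesday          00:00-08:00,20:00-24:00
    wednesday        00:00-08:00,20:00-24:00
    thursday         00:00-08:00,20:00-24:00
    friday           00:00-08:00,20:00-24:00
    saturday         00:00-08:00,20:00-24:00
    }

With periods like these, only rob_daytime is contacted from notification 6 onward, so nothing should be sent outside the daytime period once the second escalation takes over.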

Nagios escalations with no contacts defined directly on the service checks

I always look at these warnings this way: if you are new to Nagios, it is suggesting that something may not be quite right and that you should take a look. Once you know what you are doing, ignoring these warnings is an informed decision. If it works for you, stick with it.
