Icinga: max_attempts vs max_check_attempts


A service definition in Icinga or Nagios config can have parameters called max_attempts or max_check_attempts.

The docs describe max_attempts as:

If you've configured the max_attempts option of the service definition to be something greater than 1, Icinga will recheck the service before deciding that a real problem exists. While the service is being rechecked (up to max_attempts times) it is considered to be in a "soft" state (as described here) and the service checks are rescheduled at a frequency determined by the retry_interval option.

If Icinga rechecks the service max_attempts times and it is still in a non-OK state, Icinga will put the service into a "hard" state, send out notifications to contacts (if applicable), and start rescheduling future checks of the service at a frequency determined by the check_interval option.

And max_check_attempts as:

When a service first changes from an OK state to a non-OK state, Icinga gives you the ability to temporarily slow down or speed up the interval at which subsequent checks of that service will occur. When the service first changes state, Icinga will perform up to max_check_attempts-1 retries of the service check before it decides it's a real problem. While the service is being retried, it is scheduled according to the retry_interval option, which might be faster or slower than the normal check_interval option. While the service is being rechecked (up to max_check_attempts-1 times), the service is in a soft state. If the service is rechecked max_check_attempts-1 times and it is still in a non-OK state, the service turns into a hard state and is subsequently rescheduled at the normal rate specified by the check_interval option.

On a side note, if you specify a value of 1 for the max_check_attempts option, the service will not ever be checked at the interval specified by the retry_interval option. Instead, it immediately turns into a hard state and is subsequently rescheduled at the rate specified by the check_interval option.

Those sound like the same thing to me. What's the difference between them, and when should each be used?

Best Answer

max_attempts is an old directive for services and hosts, and it is no longer used in Nagios Core 4. For the directives currently available for your objects, see: Objects definition

With Icinga 2:

# icinga2 -V
icinga2 - The Icinga 2 network monitoring daemon (version: r2.6.3-1)

Using max_attempts in a service triggers an error:

Service declaration:

object Service "Intel(R) 82574L Gigabit Network Connection" {
  import "generic-service"
  host_name = "server"
  check_command = "check_netint"
  vars.interface = "Intel(R) 82574L Gigabit Network Connection"
  vars.warning= "650000"
  vars.critical ="800000"
  max_attempts=1
}

The config check:

# service icinga2 checkconfig
[....] checking Icinga2 configuration
information/cli: Icinga application loader (version: r2.6.3-1)
information/cli: Loading configuration file(s).
information/ConfigItem: Committing config item(s).
critical/config: Error: Attribute 'max_attempts' does not exist.
Location: in /etc/icinga2/conf.d/1.conf: 32:3-32:16
/etc/icinga2/conf.d/1.conf(30):   vars.warning= "650000"
/etc/icinga2/conf.d/1.conf(31):   vars.critical ="800000"
/etc/icinga2/conf.d/1.conf(32):   max_attempts=1
                                   ^^^^^^^^^^^^^^
/etc/icinga2/conf.d/1.conf(33): }
/etc/icinga2/conf.d/1.conf(34): /*

critical/config: 1 error
[FAIL] checking Icinga2 configuration. Check '/var/log/icinga2/startup.log' for details. ... failed!
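
As a sketch of the likely fix, assuming the intent was for the service to go straight to a hard state on the first failed check: Icinga 2 services accept max_check_attempts, so the declaration above can use that attribute in place of max_attempts. I have not re-run the config check with this change on r2.6.3-1, so treat it as an untested illustration; everything else is copied from the declaration above.

```
object Service "Intel(R) 82574L Gigabit Network Connection" {
  import "generic-service"
  host_name = "server"
  check_command = "check_netint"
  vars.interface = "Intel(R) 82574L Gigabit Network Connection"
  vars.warning = "650000"
  vars.critical = "800000"
  // max_check_attempts replaces the rejected max_attempts attribute.
  // With a value of 1, the first non-OK result is treated as a hard state,
  // so the service is never rechecked at retry_interval.
  max_check_attempts = 1
}
```

With a value greater than 1, the service would stay in a soft state and be rechecked at retry_interval until the attempts are used up, as the quoted documentation describes.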