Dynamically setting check_interval parameter based on Service_State in Icinga2

icinga, icinga2, system-monitoring

I have a requirement where the check interval is 180 minutes while the notification interval is 10 minutes. The service owner is worried about missing an alert that would otherwise arrive only every 180 minutes, so if the service is critical, Icinga should keep checking it and notifying them every 10 minutes until the service returns to normal.

I tried the interval = 0 parameter in notification.conf, but that does not fulfil the requirement.

It sends an alert every 10 minutes while the service is not OK, but it does not re-check the service.

For example, if the service returns to normal before the next check (i.e. within the 180 minutes), it keeps alerting until that next check.

I found a similar question here,
but it is for Nagios and I have not been able to adapt it to Icinga2.

I'm fairly sure it can be done using the CHANGE_NORMAL_SVC_CHECK_INTERVAL external command, but I don't know how to implement it.

I also found the following Icinga page:

Icinga external commands link

Kindly help.

Best Answer

This is what I did to resolve my issue.

1. Created script /icinga/plugins/change_check_interval.sh

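#!/bin/bash
# Event handler: adjust the check interval according to the service state ($1).

now=$(date +%s)
commandfile='/var/run/icinga2/cmd/icinga2.cmd'

case "$1" in
    OK)
        # Service is healthy: back to the normal 180-minute interval.
        /usr/bin/printf "[%lu] CHANGE_NORMAL_SVC_CHECK_INTERVAL;servername;servicename;180\n" "$now" >> "$commandfile"
        ;;
    WARNING)
        ;;
    UNKNOWN)
        ;;
    CRITICAL)
        # Service is critical: re-check every 10 minutes.
        /usr/bin/printf "[%lu] CHANGE_NORMAL_SVC_CHECK_INTERVAL;servername;servicename;10\n" "$now" >> "$commandfile"
        ;;
esac

exit 0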

2. Then used this script to define event_command in commands.conf

object EventCommand "change_check_interval" {
  import "plugin-event-command"
  command = [ "/icinga/plugins/change_check_interval.sh", "$service.state$" ]
}

3. And used event_command in services.conf

apply Service "Service-Name" {
  import "template"
  check_command = "nrpe-arg"
  vars.remote_nrpe_command = "nrpe command"
  vars.remote_nrpe_arguments = "arg1"
  event_command = "change_check_interval"
  assign where host.name == "servername"
}

With this event handler in place, the service is checked every 180 minutes while it is OK and every 10 minutes while it is critical.
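
The script in step 1 hard-codes servername and servicename, so it only works for one service. Below is a rough sketch of a more general variant that takes the host and service names as arguments. The function names (interval_for_state, build_command, change_check_interval) and the COMMANDFILE override are my own additions for illustration, not part of Icinga; the command pipe path and the 180/10 values are the ones from the script above. I have not run this variant against a live Icinga2 instance, so treat it as a starting point.

```shell
#!/bin/bash
# Hypothetical generalisation of change_check_interval.sh: host and service
# names come from arguments instead of being hard-coded.

# Map a service state to a check interval (same values as the original script).
# Prints nothing for states that should leave the interval unchanged.
interval_for_state() {
    case "$1" in
        OK)       echo 180 ;;
        CRITICAL) echo 10 ;;
        *)        ;;
    esac
}

# Build one external command line: [timestamp] COMMAND;host;service;interval
build_command() {
    printf '[%s] CHANGE_NORMAL_SVC_CHECK_INTERVAL;%s;%s;%s\n' "$1" "$2" "$3" "$4"
}

# Append the command to the command pipe if the state maps to an interval.
# COMMANDFILE (an assumption of this sketch) can point at a scratch file
# to try the script out without touching Icinga.
change_check_interval() {
    state="$1"; host="$2"; service="$3"
    commandfile="${COMMANDFILE:-/var/run/icinga2/cmd/icinga2.cmd}"
    interval="$(interval_for_state "$state")"
    if [ -n "$interval" ]; then
        build_command "$(date +%s)" "$host" "$service" "$interval" >> "$commandfile"
    fi
    return 0
}

# Run only when called with three arguments: state, host name, service name.
if [ "$#" -eq 3 ]; then
    change_check_interval "$1" "$2" "$3"
fi
```

To use it, the command array in the EventCommand would need to pass the two extra values after "$service.state$", for example "$host.name$" and "$service.name$". Setting COMMANDFILE to a temporary file and running the script by hand with CRITICAL, a host name and a service name should show the line that would be written.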