Monitoring ISC DHCP Failover Status Using Nagios

dhcp-server failover isc-dhcp nagios

I recently implemented ISC DHCP Failover and it's working beautifully, but I'd like to monitor the current status of the failover using Nagios.

Ultimately, I would like my Nagios check to:

  • report a WARNING when the Secondary DHCP server kicks in (starts serving addresses due to an issue with the Primary)
  • report a CRITICAL when neither the Primary nor the Secondary is active.

Unfortunately, monitoring whether the dhcpd process is running is not a real solution: a failover state can activate even while dhcpd is still running.

From what I've researched, it appears dhcpd cannot be queried for its current status. Aside from parsing log files, does anyone know of a clean way to determine whether a dhcpd server is currently in a failover state?

    Best Answer

    For monitoring a failover setup, I would use OMAPI commands (e.g. via omshell) to check the status of the failover-state object.

    Assuming you have configured your dhcpd server to allow OMAPI access, you could use something like:

    $ cat > check-failover.cmd <<EOF
    server localhost
    port 7911
    key omapi_key <KEY>
    connect
    new failover-state
    set name = "<FAILOVER NAME>"
    open
    EOF
    

    Then use it like:

    $ omshell < check-failover.cmd | grep state
    partner-state = 00:00:00:02
    local-state = 00:00:00:02
    

    Each state is an integer, printed as four colon-separated hex bytes, whose values are described in the dhcpd(8) man page (1 is "startup", 2 is "normal", etc.).

    It should be quite easy to write a Nagios/Shinken probe from this.
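
As a starting point, here is a minimal sketch of such a probe built on the omshell output shown above. The function names and the mapping to Nagios exit codes are my own assumptions, not part of dhcpd or Nagios: it reports OK only when both peers are in state 2 ("normal"), WARNING for any other state, and CRITICAL when no state can be read at all (for example because dhcpd or OMAPI is unreachable). Adjust the mapping after checking the full state list in dhcpd(8).

```shell
#!/bin/sh
# Hypothetical helper functions for a Nagios-style failover check.
# Exit codes follow the usual plugin convention: 0 OK, 1 WARNING, 2 CRITICAL.

# Read omshell output on stdin and print the decimal value of one field,
# e.g. "local-state = 00:00:00:02" -> 2. Returns non-zero if the field is absent.
get_state() {
    field=$1
    # Tolerate a leading "> " prompt, then strip the colons between the hex bytes.
    hex=$(sed -n "s/^[> ]*${field} = //p" | head -n 1 | tr -d ':[:space:]')
    [ -n "$hex" ] || return 1
    printf '%d\n' "0x${hex}"
}

# Take the full omshell output as $1, print a status line and
# return the matching exit code.
check_failover_output() {
    output=$1
    local_state=$(printf '%s\n' "$output" | get_state local-state) || {
        echo "CRITICAL - could not read local-state (dhcpd or OMAPI unreachable?)"
        return 2
    }
    partner_state=$(printf '%s\n' "$output" | get_state partner-state) || {
        echo "CRITICAL - could not read partner-state"
        return 2
    }
    # Assumption: only state 2 ("normal") on both sides counts as healthy.
    if [ "$local_state" -eq 2 ] && [ "$partner_state" -eq 2 ]; then
        echo "OK - failover normal (local=$local_state partner=$partner_state)"
        return 0
    fi
    echo "WARNING - failover not normal (local=$local_state partner=$partner_state)"
    return 1
}

# Typical invocation, left commented out so the functions can be sourced:
# check_failover_output "$(omshell < check-failover.cmd 2>&1)"; exit $?
```

Running the check on each peer (or pointing the `server` line of check-failover.cmd at each peer in turn) lets Nagios see a CRITICAL from both hosts when neither dhcpd answers, while a single peer leaving the normal state shows up as a WARNING.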
