.bat file – Nagios v3.2 service check and start if stopped

Tags: batch, batch-file, nagios, scripting

I'm only just getting into programming, so I apologize for my ignorance. I'm trying to create a .bat file that checks whether a service is running on XP Pro.

If the service is running, exit 0.
If the service is stopped, start the service.
Wait 10 seconds (via ping, I'm guessing).
Check if the service is running.
If the service is running, exit 0.
If the service is stopped, start the service.
Wait 10 seconds.

Do this check a total of 3 times. If the service does not come up within that time:
exit 2

Exit 0 = OK
Exit 1 = warning
Exit 2 = critical (and this will alert)

I need to do this for 3 different services, but I expect it would be better to create one script per service. That way you get notified about the specific service that is not coming back up.

The goal is that if the service stops it will start it. If after 30 seconds it is unable to start the service then it will send an alert.

The reason I'm trying to do it with a .bat is that this is consistent with all our other scripts, and I did not want to complicate things further by adding different kinds of code. Yay for consistency!

Again, I apologize for my ignorance; I've been thrown into this project at the last minute.

Thank you for the help and reading my question!

Best Answer

Checking whether a service is running is actually built into NSClient++. I restart the service when it's down using an event handler that goes through NRPE. Basically, if the service is stopped, Nagios calls NRPE, which runs a script on the Windows host.

The script (scripts\runcmd.bat) is:

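@echo off
rem %1 is the name of the service to start, passed in from nsclient.ini.
net start %1
rem Always exit 0 so the handler itself is reported as OK.
@exit 0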

and I've defined the event handler in nsclient.ini as, for example:

restartwsus=scripts\runcmd.bat wsusservice

This goes in the section whose comment header reads "; A list of scripts available to run from the CheckExternalScripts module. Syntax is: <command>=<script> <arguments>".

(Restart NSClient++ afterwards so the change takes effect.)

On the Nagios server, I've defined the command in commands.cfg as:

define command{
 command_name restartwsus
 command_line /usr/lib/nagios/plugins/check_nrpe -H '$HOSTADDRESS$' -c restartwsus
}

and have defined the service as:

define service{
        use                     generic-service
        host_name               wsusserver
        service_description     WSUS
        contacts                me
        notification_options    w,c,r
        notification_period     24x7
        notification_interval   0
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l WsusService
        event_handler           restartwsus
        }
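
If you would still rather have the standalone retry loop described in the question, a minimal sketch could look like the one below. It is untested against your setup and makes a few assumptions: the service's short name is passed as the first argument, `sc query` reports the state in English as RUNNING, and the file name check_and_start.bat is only a placeholder. It tries to start the service up to 3 times, waiting roughly 10 seconds after each attempt, and uses the standard Nagios return codes (0 = OK, 2 = critical, 3 = unknown).

```bat
@echo off
rem check_and_start.bat (hypothetical name) - sketch only.
rem Usage: check_and_start.bat <ServiceName>
setlocal
set "SERVICE=%~1"
if "%SERVICE%"=="" (
    echo UNKNOWN - no service name given
    exit 3
)
set TRIES=0

:check
rem sc query prints a STATE line containing RUNNING when the service is up.
sc query "%SERVICE%" | find "RUNNING" >nul
if not errorlevel 1 (
    echo OK - %SERVICE% is running
    exit 0
)
rem Give up after three start attempts, about 30 seconds in total.
if %TRIES% GEQ 3 (
    echo CRITICAL - %SERVICE% did not start after %TRIES% attempts
    exit 2
)
set /a TRIES+=1
net start "%SERVICE%" >nul 2>&1
rem XP has no timeout command; 11 pings to localhost take roughly 10 seconds.
ping -n 11 127.0.0.1 >nul
goto check
```

Because the service name is an argument, the same file can be registered once per service in nsclient.ini, so each of your 3 services still gets its own alert. Note that a plain `exit` closes the console window if you run the script by hand; start it with `cmd /c` when testing.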

I hope that helps.