Windows – How does the Windows service "Reset fail count after" setting work?


For testing purposes I have a Windows service that I'd like to restart 1 s after the first failure and 5 s after the second. Any subsequent failure within 2 min should leave the service stopped. To meet these criteria I tried the "Reset fail count after" setting, but without luck. On my machine (Windows 7 Enterprise x64) I created a simple service that unexpectedly exits 5 s after it starts. With the configuration:

sc failure FAIL1 reset= 120 actions= restart/1000/restart/5000//

I expected that after those 2 min the service failure counter would be reset and the restart sequence repeated. But maybe a service that is left stopped after the second failure never gets a chance to reset its counter. So I set up another service with this configuration:

sc failure FAIL2 reset= 120 actions= restart/1000/restart/5000

It turns out that after its second failure the FAIL2 service keeps restarting every 5 s, and the failure count reported in the event log error messages keeps growing.

With a third service (FAIL3) I tried to configure similar options, using the resolution available in the Recovery tab of the service properties:

  1. First failure: restart
  2. Second failure: restart
  3. Subsequent failures: take no action
  4. Reset fail count after: 1 (day)
  5. Restart service after: 0 (minute)

This one behaves like FAIL1: it restarts twice and then remains stopped, even after the one-day reset period has elapsed.

It looks like the counter reset timeout does not work on my machine. Maybe the configuration I have in mind cannot be achieved with this technique. Or maybe my understanding of the reset counter is incorrect.

Best Answer

My understanding of the Windows service failure counter was completely wrong. The most important thing here is:

The "Reset fail count after" setting is the amount of time, measured from service start, after which the fail counter is reset.

In the sample configurations above, each service exits unexpectedly 5 s after starting, so none of them ever runs long enough to reach the reset period. For the counter to be reset, the timeout must be less than 5 s, e.g.

sc failure FAIL1 reset= 3 actions= restart/1000/restart/5000/restart/10000

The result is a service that constantly restarts after 1 s, and every event log error message contains:

It has done this 1 time(s)
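To make the rule easier to reason about, here is a small Python sketch of the counter as I described it above. It is only a model of the observed behaviour, not how the Service Control Manager is actually implemented. The function name `simulate` and its parameters are made up for illustration, and it assumes that the counter is cleared whenever the service has run for at least the reset period before failing, and that the last configured action is repeated for all later failures (which is what FAIL2 showed).

```python
def simulate(reset_after_s, actions, uptime_s, failures):
    """Model the failure counter as described in this answer.

    Assumption: the counter is cleared when the service has been running
    for at least reset_after_s seconds before it fails.
    actions: restart delays in ms, or None for "take no action".
    Returns one (failure_count, action) pair per handled failure.
    """
    count = 0
    log = []
    for _ in range(failures):
        if uptime_s >= reset_after_s:
            count = 0  # ran long enough, so the counter starts over
        count += 1
        # Assumed: the last configured action repeats for later failures.
        action = actions[min(count, len(actions)) - 1] if actions else None
        log.append((count, action))
        if action is None:
            break  # no action: the service stays stopped
    return log


# reset= 3, service dies after 5 s: the count is always 1
print(simulate(3, [1000, 5000, 10000], uptime_s=5, failures=4))
# FAIL1: reset= 120, third action empty: stops on the third failure
print(simulate(120, [1000, 5000, None], uptime_s=5, failures=4))
# FAIL2: reset= 120, two actions: keeps restarting after 5000 ms
print(simulate(120, [1000, 5000], uptime_s=5, failures=4))
```

Under these assumptions the first call reports a count of 1 for every failure, which matches the "It has done this 1 time(s)" message, while the other two reproduce what FAIL1 and FAIL2 did in the question.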

The idea that 2 service failures within 2 min must leave the service stopped is impossible to achieve this way. All we can do is set the reset period longer than 5 s and make every subsequent failure restart only after a 2 min delay:

sc failure FAIL1 reset= 10 actions= restart/1000/restart/5000/restart/120000
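Since `reset=` is given in seconds but the restart delays in `actions=` are in milliseconds, it is easy to slip a digit (2 min is 120000 ms). A tiny helper like the one below can build the command line from delays written in seconds. `sc_failure_command` is a hypothetical name, not part of any Windows tool, and the sketch only formats the string without running `sc`.

```python
def sc_failure_command(service, reset_s, restart_delays_s):
    """Build an `sc failure` command line from restart delays in seconds."""
    # sc expects each restart delay in milliseconds.
    actions = "/".join(f"restart/{int(d * 1000)}" for d in restart_delays_s)
    return f"sc failure {service} reset= {reset_s} actions= {actions}"


print(sc_failure_command("FAIL1", 10, [1, 5, 120]))
# sc failure FAIL1 reset= 10 actions= restart/1000/restart/5000/restart/120000
```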