Windows AD – How to Find the Cause of Locked User Account

Tags: active-directory, windows

After a recent incident with Outlook, I was wondering how I would most efficiently resolve the following problem:

Assume a fairly typical small to medium-sized AD infrastructure: several DCs, a number of internal servers and Windows clients, several services using AD and LDAP for user authentication from within the DMZ (SMTP relay, VPN, Citrix, etc.) and several internal services all relying on AD for authentication (Exchange, SQL Server, file and print servers, terminal services servers). You have full access to all systems, but they are a bit too numerous (counting the clients) to check individually.

Now assume that, for some unknown reason, one or more user accounts get locked out by the account lockout policy every few minutes.

  • What would be the best way to find the service/machine responsible for this?
  • Assuming the infrastructure is pure, standard Windows with no additional management tool and few changes from default, is there any way the process of finding the cause of such a lockout could be accelerated or improved?
  • What could be done to improve the resilience of the system against such an account lockout DoS? Disabling account lockout is an obvious answer, but then you run into the issue of users having way too easily exploitable passwords, even with complexity enforced.

Best Answer

Adding something I don't see in the answers given.

What would be the best way to find the service/machine responsible for this?

You can't just look at the Security log on the PDCe. The PDCe does have the most up-to-date information about account lockouts for the entire domain, but if the failed logon attempts were validated by another DC, it does not know which client (IP address or hostname) they came from. The PDCe will say that "Account xyz was locked out," but it won't say from where. Only the DC that actually validated the logon records the logon failure, including the client's address. (I'm also leaving RODCs out of this discussion.)

There are two good ways to find out where failed logon attempts are coming from when you have several domain controllers: event forwarding and Microsoft's Account Lockout Tools.

I prefer event forwarding to a central location. Forward failed logon attempts from all your domain controllers to a central logging server. Then you only have one place to look for failed logons in your entire domain. In fact, I personally don't really love Microsoft's Account Lockout Tools, so now there's one good way.

Event forwarding. You'll love it.
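Once the failed logons are in one place, finding the offender is mostly a matter of grouping them by source. The following is only a minimal sketch in Python, assuming the forwarded events have been exported from the collector to CSV. The column names (`EventID`, `DomainController`, `TargetUserName`, `IpAddress`), the sample rows and the function name are hypothetical and will depend on how you export. It only looks at event IDs 4625 (an account failed to log on) and 4771 (Kerberos pre-authentication failed).

```python
import csv
import io
from collections import Counter

# Failed-logon event IDs as they appear in the Security log:
# 4625 = an account failed to log on, 4771 = Kerberos pre-authentication failed.
FAILED_LOGON_IDS = {"4625", "4771"}


def failed_logon_sources(rows, account):
    """Count failed logons for `account`, grouped by (source address, validating DC).

    `rows` is any iterable of dicts using the hypothetical column names
    EventID, DomainController, TargetUserName and IpAddress.
    """
    counts = Counter()
    wanted = account.lower()
    for row in rows:
        if row["EventID"] not in FAILED_LOGON_IDS:
            continue  # not a failed logon
        if row["TargetUserName"].lower() != wanted:
            continue  # a different account
        counts[(row["IpAddress"], row["DomainController"])] += 1
    return counts


# Made-up sample data standing in for an export from the central collector.
SAMPLE = """EventID,DomainController,TargetUserName,IpAddress
4771,DC02,jdoe,10.0.5.17
4771,DC02,jdoe,10.0.5.17
4625,DC03,JDoe,10.0.5.17
4771,DC01,jdoe,10.0.9.40
4624,DC01,jdoe,10.0.9.40
4771,DC02,asmith,10.0.5.17
"""

if __name__ == "__main__":
    rows = csv.DictReader(io.StringIO(SAMPLE))
    for (ip, dc), n in failed_logon_sources(rows, "jdoe").most_common():
        print(f"{n:3d} failures from {ip} (validated by {dc})")
```

The address with the highest count is usually the first machine worth checking for a stale cached password, a service or a forgotten session.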

Assuming the infrastructure is pure, standard Windows with no additional management tool and few changes from default, is there any way the process of finding the cause of such a lockout could be accelerated or improved?

See above. You can then have your monitoring system (SCOM, Nagios or whatever you use) comb that single event log and blow up your cell phone with text messages. It doesn't get more accelerated than that.
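As an illustration of the kind of check such a monitoring job might run against the central log, here is a rough Python sketch that flags accounts locked out repeatedly within a short window. It assumes the lockout events (event ID 4740, a user account was locked out) have already been pulled out as `(timestamp, account)` pairs. The function name, the 30-minute window and the threshold of three are arbitrary choices for the example, not recommended values.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def repeated_lockouts(events, window=timedelta(minutes=30), threshold=3):
    """Return the accounts locked out at least `threshold` times within `window`.

    `events` is an iterable of (timestamp, account) pairs, one per lockout event.
    """
    by_account = defaultdict(list)
    for when, account in events:
        by_account[account.lower()].append(when)

    flagged = []
    for account, times in by_account.items():
        times.sort()
        # Look for any run of `threshold` consecutive lockouts that fits in one window.
        for i in range(len(times) - threshold + 1):
            if times[i + threshold - 1] - times[i] <= window:
                flagged.append(account)
                break
    return sorted(flagged)


if __name__ == "__main__":
    # Made-up lockout events for two hypothetical accounts.
    sample = [
        (datetime(2024, 1, 8, 9, 0), "jdoe"),
        (datetime(2024, 1, 8, 9, 5), "jdoe"),
        (datetime(2024, 1, 8, 9, 12), "JDoe"),
        (datetime(2024, 1, 8, 9, 0), "asmith"),
        (datetime(2024, 1, 8, 11, 0), "asmith"),
    ]
    for account in repeated_lockouts(sample):
        print(f"ALERT: {account} keeps getting locked out")
```

A single lockout is usually a user mistyping a password. The same account locking out again right after being unlocked is the pattern worth paging someone for.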

What could be done to improve the resilience of the system against such an account lockout DoS?

  1. Educate your users. Tell them to stop setting up Windows services to run under their domain user accounts and to log off RDP sessions when they're done, and teach them how to clear cached passwords (for Outlook, etc.) from the Windows Credential Vault.
  2. Use Managed Service Accounts where you can, so users no longer have to manage passwords for those accounts. Users muck up everything. If you give a user a choice, he or she will always make the wrong choice. So don't give them a choice.
  3. Enforce remote session timeouts via GPO. If a user is idle in an RDP session for 6 hours, kick them off.