gMSA Account – Authentication Failure During Password Rotation

active-directory sql-server windows

When the passwords for our gMSA accounts are automatically rotated, we see login failures for roughly 1-10 minutes. This is most apparent for gMSA client accounts that connect to MS SQL Server, but I think it happens for other gMSA accounts as well. SQL Server itself is not running as a gMSA account; our application uses a gMSA to make a client connection to SQL Server. ManagedPasswordIntervalInDays defaults to 30 days, so we see this every month at the same time.

When I check the domain controller logs, I don't see any login failures for the gMSA user, but the SQL Server logs the following error:

SSPI handshake failed with error code 0x8009030c, state 14 while establishing a connection with integrated security; the connection has been closed. Reason: AcceptSecurityContext failed. The operating system error code indicates the cause of failure. The logon attempt failed [CLIENT: x.x.x.x]

From what I have found, this error usually indicates the wrong username/password combination.

This occurs on multiple clients, and each eventually starts connecting again after anywhere from 1-10 minutes. The clients don't all recover at the same time; each recovers at a seemingly random point within that window.

Initially I thought it might be related to AD replication of the changed password, so we changed the default inter-site replication interval to USE_NOTIFY to replicate immediately. If replication were the issue, though, I would expect to see logon failures on the DCs, and I'm not seeing any.
I had also thought that maybe the SQL Server is caching the authentication token, but if that were the case, I would expect all clients to recover at the same time (i.e., when the SQL Server refreshed).
Since each client starts working again at a different time, the cause doesn't appear to be on the SQL Server side but more likely something on the client side: maybe caching of the gMSA password, or maybe something related to timeouts and retry backoff.

Best Answer

We were getting the same error because of an SPN issue that caused the gMSA to authenticate to SQL Server via NTLM instead of Kerberos. If you log into SQL Server and check the sessions via sys.dm_exec_connections, you should see a list of sessions with NTLM in the auth_scheme column.

[Screenshot: NTLM sessions]

(You can also use klist sessions from the command line to view the sessions.)
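As a rough sketch of that check (not taken from our environment): the query below reads auth_scheme from sys.dm_exec_connections and joins sys.dm_exec_sessions to get the login name. The Python helper names, the sample rows, and the CONTOSO account names are made up for illustration; actually running the query needs a database driver, which is not shown here.

```python
from collections import Counter

# T-SQL listing each connection's authentication scheme
# (auth_scheme is NTLM, KERBEROS or SQL).
AUTH_SCHEME_QUERY = """
SELECT c.session_id, s.login_name, s.host_name,
       c.client_net_address, c.auth_scheme
FROM sys.dm_exec_connections AS c
JOIN sys.dm_exec_sessions AS s ON s.session_id = c.session_id
ORDER BY s.login_name;
"""


def summarize_auth_schemes(rows):
    """Count connections per (login_name, auth_scheme).

    `rows` is an iterable of 5-tuples in the column order of
    AUTH_SCHEME_QUERY, e.g. the result of a cursor.fetchall().
    """
    counts = Counter()
    for _session_id, login, _host, _addr, scheme in rows:
        counts[(login, scheme.upper())] += 1
    return counts


def logins_using_ntlm(rows):
    """Return the sorted login names that have at least one NTLM connection."""
    counts = summarize_auth_schemes(rows)
    return sorted({login for (login, scheme) in counts if scheme == "NTLM"})


if __name__ == "__main__":
    # Hypothetical sample rows, standing in for real query results.
    sample = [
        (51, "CONTOSO\\svc-app$", "APP01", "10.0.0.11", "NTLM"),
        (52, "CONTOSO\\svc-app$", "APP02", "10.0.0.12", "NTLM"),
        (53, "CONTOSO\\svc-report$", "RPT01", "10.0.0.21", "KERBEROS"),
    ]
    for (login, scheme), n in sorted(summarize_auth_schemes(sample).items()):
        print(f"{login}: {n} x {scheme}")
    print("NTLM logins:", logins_using_ntlm(sample))
```

If the gMSA shows up in the NTLM list, that points at the same kind of Kerberos fallback we had.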

We were able to correlate our errors with the password changes using log analytics tools, so we knew that was the culprit. I do not know how often the SCM refreshes its copy of the password, but if the service is authenticating to SQL Server using Kerberos, then I believe password changes should be independent of the Kerberos session lifetime/renewal. The generated error is therefore a solid clue that the client is authenticating to the SQL Server host via NTLM. Once we fixed our SPN issue (which was due to an additional DNS A record), the sessions switched over to Kerberos authentication.
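For anyone chasing a similar SPN problem, here is a minimal sketch of the comparison, assuming the extra A record gives the server a second name that clients may connect with. It builds the MSSQLSvc SPNs a client would typically request for each name (with and without the port) and reports any that are missing from the list you get from setspn -L for the SQL Server service account. The function names and host names are hypothetical, and the registered SPNs are pasted in by hand rather than queried.

```python
def expected_spns(hostname, port=1433):
    """Return the MSSQLSvc SPN forms for one host name.

    Assumes a default instance on a TCP port; a named instance or a
    different port would need different values.
    """
    host = hostname.strip().rstrip(".")
    return [f"MSSQLSvc/{host}:{port}", f"MSSQLSvc/{host}"]


def missing_spns(hostnames, registered_spns, port=1433):
    """List expected SPNs that are not registered (case-insensitive compare)."""
    registered = {spn.strip().lower() for spn in registered_spns}
    missing = []
    for name in hostnames:
        for spn in expected_spns(name, port):
            if spn.lower() not in registered:
                missing.append(spn)
    return missing


if __name__ == "__main__":
    # Hypothetical names: the real host plus the name from the extra A record.
    names = ["sql01.corp.example.com", "sqlalias.corp.example.com"]
    # Hypothetical output of `setspn -L <service account>`, pasted in by hand.
    registered = [
        "MSSQLSvc/SQL01.corp.example.com:1433",
        "MSSQLSvc/sql01.corp.example.com",
    ]
    for spn in missing_spns(names, registered):
        print("not registered:", spn)
```

Any name that shows up as not registered is a candidate for the NTLM fallback; whether to remove the extra record or register the SPN depends on your setup.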