SQL Failover Cluster Clients not updating DNS

domain-name-system, failovercluster, sql, ssrs, subnet

I have a multi-subnet SQL Server 2012 failover cluster running on two Windows Server 2012 R2 nodes. The cluster answers on 192.168.1.10 when it runs on NODE A and on 192.168.5.10 when it runs on NODE B. When I look at DNS on the domain, I see two A records:

SQLCLUSTER – 192.168.1.10

SQLCLUSTER – 192.168.5.10

From what I've read, when there are two A records with the same name, Windows clients should be smart enough to determine the live IP within 10 seconds, but that's not what I'm seeing. If I fail the cluster over from NODE A to NODE B, many of the client machines (web and SSRS) do not pick up the new IP address.

So for example, if I'm on the SSRS machine and ping SQLCLUSTER, I'll see it reply from 192.168.1.10. If I fail the cluster over to NODE B and ping again, it's still trying to ping 192.168.1.10, not .5.10 as it should.
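To see which of the registered addresses is actually answering, rather than relying on whichever one ping happens to pick, a small probe along the following lines may help. This is only a sketch: the function names are made up for illustration, and port 1433 is an assumption (the SQL Server default for a default instance; a named instance or custom port would need a different value).

```python
import socket


def resolve_ipv4(hostname, port, getaddrinfo=socket.getaddrinfo):
    """Return every distinct IPv4 address the resolver hands back, in order."""
    infos = getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def is_listening(ip, port, timeout=2.0):
    """True if a TCP connection to ip:port succeeds within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_live_addresses(hostname, port=1433,
                        getaddrinfo=socket.getaddrinfo, probe=is_listening):
    """Resolve hostname and keep only the addresses that accept a connection."""
    return [ip for ip in resolve_ipv4(hostname, port, getaddrinfo)
            if probe(ip, port)]
```

Running `find_live_addresses("SQLCLUSTER")` on the SSRS machine before and after a failover should show whether the client resolves both records and which one is reachable. Note that it goes through the operating system's resolver, so a stale local DNS cache will affect the result just as it affects ping.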

It seems the only way to make it work properly is to delete the DNS record of the offline node and then run ipconfig /flushdns and ipconfig /registerdns on the clients. Is there something I missed here? Could the problem be that the DNS server is on the 192.168.1.x subnet, which might give precedence to the .1.x addresses?

I've looked at the Event Viewer and don't see any errors about writing or reading the DNS records, which leads me to believe I just have something configured improperly.

Best Answer

Ok, I think I found a workaround - it's not exactly pretty, but it does work.

From information found here: https://blogs.msdn.microsoft.com/sambetts/2014/02/04/multi-subnet-clustered-sql-registerallprovidersip-sharepoint-2013/

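I set RegisterAllProvidersIP to 0, set HostRecordTTL to 300, and restarted the role.

This forces the cluster to register only one address in DNS (the one that is currently online) and sets that record's TTL to 300 seconds (5 minutes). The trade-off is that after a failover, clients may keep resolving the old address until their cached record expires.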
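One way to confirm the change took effect is to check how many A records a client now sees for the cluster name. The sketch below is an assumption-laden illustration (the function names are hypothetical, and it queries through the local resolver, so flush the client's DNS cache first if it may still hold the old records):

```python
import socket


def registered_addresses(hostname, getaddrinfo=socket.getaddrinfo):
    """Return the sorted set of IPv4 addresses the resolver reports for hostname."""
    infos = getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def describe_registration(hostname, getaddrinfo=socket.getaddrinfo):
    """Summarise whether the name resolves to one address or several."""
    addresses = registered_addresses(hostname, getaddrinfo)
    if len(addresses) == 1:
        return f"{hostname}: single A record ({addresses[0]})"
    return f"{hostname}: {len(addresses)} A records ({', '.join(addresses)})"
```

With RegisterAllProvidersIP set to 0, `describe_registration("SQLCLUSTER")` would be expected to report a single record; if it still lists both addresses, the old record may not have been removed or the client cache may be stale.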
