DHCP failover didn’t work when one server was offline

dhcp windows-server-2012

I'm running two DHCP servers, both on Windows Server 2012. They are set up for high availability using "load balanced, split 50/50." They are named DC1 and DC2.

This morning DC1 was frozen: the screen was black and the computer was completely unresponsive. Nothing in the event log suggests what went wrong, and I ended up having to hard reset it. It seemed that DC2 was not picking up the slack while DC1 was down. I noticed because a new computer I plugged into the network was not assigned an IP address. Address renewal had also failed on another computer, which logged this in its event log:

Your computer was not able to renew its address from the network (from the DHCP Server) for the Network Card with network address XXX

Your computer has lost the lease to its IP address 192.168.1.XXX on the Network Card with network address XXX.

If DC1 is not working, shouldn't DC2 automatically start issuing IPs and addressing renewal requests? Or is this something that has to be configured somewhere? I'm not very familiar with how this works, so any assistance would be appreciated.

I'm wondering if this was because DC1 wasn't actually unplugged or turned off, just frozen?

Best Answer

If DC1 is not working, shouldn't DC2 automatically start issuing IPs and addressing renewal requests?

Yes. In a Load Balanced DHCP Failover relationship the DHCP partner server will assign a DHCP lease for its unavailable partner.

How it works:

A DHCP failover partnership hashes the client's MAC address when a lease request is received. Each server knows which hashes it is responsible for, and both servers should be able to see every request. If a request goes unanswered after a handful of tries from the client, the partner server responds with a short-term lease from its own IP pool, with a duration equal to the MCLT (Maximum Client Lead Time), so that the lease can expire quickly once the unavailable server is back up. Server 2012 Reference and Obscure DHCP Failover IETF Reference

DHCP requests come about 1 second apart and this entire process usually isn't noticeably longer (to a human) than receiving a standard DHCP lease.
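To make the hash-based split concrete, here is a minimal Python sketch. It is an illustration only: the `bucket`, `owner`, and `should_respond` names are hypothetical, and the SHA-256-based hash is a stand-in, not the hash that Windows Server or the IETF load-balancing scheme actually uses. It also reduces the takeover decision to a single `partner_reachable` flag.

```python
import hashlib


def bucket(mac: str) -> int:
    """Map a MAC address to one of 256 buckets.

    Assumption: SHA-256 is used purely for illustration; real servers
    use their own hash function.
    """
    normalized = mac.lower().replace("-", ":")
    return hashlib.sha256(normalized.encode("ascii")).digest()[0]


def owner(mac: str, dc1_percent: int = 50) -> str:
    """Return the partner that normally answers this client."""
    threshold = 256 * dc1_percent // 100
    return "DC1" if bucket(mac) < threshold else "DC2"


def should_respond(server: str, mac: str, partner_reachable: bool) -> bool:
    """A server answers its own clients, and every client once it has
    lost contact with its partner."""
    return owner(mac) == server or not partner_reachable


if __name__ == "__main__":
    mac = "00-11-22-33-44-55"  # made-up client address
    print("normal owner:", owner(mac))
    print("DC2 answers while DC1 is down:", should_respond("DC2", mac, False))
```

With both partners healthy, exactly one of them answers a given client. Once a server can no longer reach its partner, it answers every client, which is the behavior DC2 should have shown here.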

is this something that has to be configured somewhere?

If you are not familiar with your configuration, you should of course check it, and walking through a good how-to is a reasonable way to do that. You can also do a quick check on a scope from the DHCP snap-in by right-clicking the scope and selecting Display Statistics. This shows how many IP addresses are assigned and available from each failover partner.

DHCP Failover Scope Statistics

You can likewise view your partnership state by opening the scope's properties and selecting the Failover tab.

DHCP Failover Scope Properties

I'm wondering if this was because DC1 wasn't actually unplugged or turned off, just frozen?

All of these states will trigger the same Communication-Interrupted state on the partner server. Your secondary partner should take over unanswered leasing regardless of why it cannot communicate with its partner.

Further Troubleshooting:

If your secondary server still cannot serve leases while in the Communication-Interrupted state, your best bet is to work with your network administrator. If that is you, I would suggest capturing DHCP traffic and checking that DHCP helper addresses for both DHCP servers are configured on all intermediary switches.
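When reading such a capture, three fields are worth checking: the relay agent address (`giaddr`, filled in by a helper/relay), the message type (option 53), and the server identifier (option 54), which shows which server actually answered. The sketch below decodes them from the raw UDP payload of a DHCP packet using only the standard library. The `parse_dhcp` function is a hypothetical helper, and extracting the payload from a capture file is assumed to be done with whatever tool you prefer.

```python
import socket

MAGIC_COOKIE = bytes([0x63, 0x82, 0x53, 0x63])
MESSAGE_TYPES = {
    1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 4: "DECLINE",
    5: "ACK", 6: "NAK", 7: "RELEASE", 8: "INFORM",
}


def parse_dhcp(payload: bytes) -> dict:
    """Extract a few troubleshooting fields from a DHCP UDP payload."""
    if len(payload) < 240 or payload[236:240] != MAGIC_COOKIE:
        raise ValueError("not a DHCP message")
    hlen = min(payload[2], 16)  # hardware address length
    info = {
        "secs": int.from_bytes(payload[8:10], "big"),
        "giaddr": socket.inet_ntoa(payload[24:28]),  # 0.0.0.0 if not relayed
        "client_mac": ":".join(f"{b:02x}" for b in payload[28:28 + hlen]),
        "message_type": None,
        "server_id": None,
    }
    i = 240  # options start right after the magic cookie
    while i < len(payload):
        code = payload[i]
        if code == 255:  # end option
            break
        if code == 0:  # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):  # truncated option
            break
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == 53 and len(value) == 1:
            info["message_type"] = MESSAGE_TYPES.get(value[0], str(value[0]))
        elif code == 54 and len(value) == 4:
            info["server_id"] = socket.inet_ntoa(value)
        i += 2 + length
    return info
```

If DISCOVER messages from a remote subnet never show a `giaddr`, or OFFER messages only ever carry one server's address in `server_id`, that points to a relay that is forwarding to only one of the two partners.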
