Some DHCP clients end up with wrong DNS server

dhcp domain-name-system windows-server-2008-r2

The scenario:

  • DC running Windows Server 2008 R2 providing DNS + DHCP
  • Cisco 1811 Router as the gateway
  • 30 Windows XP DHCP clients on the LAN

The problem:

  • Some workstations are spontaneously switching to an incorrect DNS server. Specifically, ipconfig /all shows that they start using the gateway as a DNS server.
  • This happens about 5-10 times a day to various computers, sometimes more than once per day.
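
To catch affected machines without eyeballing every ipconfig /all dump, the check described above (a DNS server entry that equals the default gateway) could be scripted. This is only a rough sketch: it assumes English-language, IPv4-only ipconfig /all output saved as text, and the function names are made up for illustration.

```python
import re

# Matches dotted-quad IPv4 addresses; assumes IPv4-only output (typical for XP).
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def parse_ipconfig(text):
    """Collect the gateway and DNS addresses from `ipconfig /all` text."""
    fields = {"Default Gateway": [], "DNS Servers": []}
    current = None  # field that continuation lines belong to, if any
    for line in text.splitlines():
        if ":" in line:
            label, value = line.split(":", 1)
            # "DNS Servers . . . . ." -> "DNS Servers"
            label = label.replace(".", "").strip()
            current = label if label in fields else None
        else:
            # Extra DNS servers appear on their own lines with no label.
            value = line
        if current:
            fields[current].extend(IPV4.findall(value))
    return fields


def dns_points_at_gateway(text):
    """True when any listed DNS server is also the default gateway."""
    fields = parse_ipconfig(text)
    return bool(set(fields["Default Gateway"]) & set(fields["DNS Servers"]))
```

Feeding it the saved output from each workstation (for example, collected by a logon script) would give a timestamped list of which clients have flipped to the gateway.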

The workaround:

  • Repairing the connection on the XP client always fixes the problem, and the correct DNS server address is obtained.

We lost our main DNS/DHCP machine a week ago, and had to bring this one online as a spare. We've been having this issue since then. DHCP leases on the old and new servers are configured for "wired" (8 day) duration. There are definitely no other DHCP servers active on the LAN. So far there is no discernible pattern about which clients will show this problem, or when.

When I ran DCDIAG /test:DNS it came back clean. Manual inspection of the DNS zone shows that all the records are appearing as expected, with no traces of the previous machine in there.

Update Feb 27: Added screenshots.

Here is a screenshot of the DHCP scope options on the 2008 R2 server.
http://nicwaller.com/screens/dhcpscope.png

And here is a screenshot of ipconfig /all running on a healthy host. I don't have any ailing hosts at the moment, but will grab a screencap next time it happens.
http://nicwaller.com/screens/ipconfigall.png

Update Feb 28: More screenshots.

Here's a screenshot of DHCP and DNS traffic from a healthy client when repairing the local area connection. There's definitely only one server responding, but it does seem strange that the negotiation takes place twice. I'll try to get a similar capture from a sick machine this coming week.
http://nicwaller.com/screens/dhcprenew_screen.png

Update Mar 01: Caught a bad ipconfig.

Here's a screenshot of ipconfig /all from a client that had this issue. It says the lease was issued this morning, but it doesn't even have an entry for the secondary DNS I set up yesterday. Both DNS servers were discovered properly when repairing the connection.
http://nicwaller.com/screens/bad_dns.png

Update Mar 01: It even got the sysadmin!

This issue finally affected my personal workstation this morning. Unfortunately I had just rebooted and wasn't running a packet dump at the time. I set up a secondary server yesterday, and was logging all DNS traffic to it. My machine had not contacted the secondary DNS in over half an hour, so that says to me that it's just spontaneously reverting to the gateway without even failing over to secondary DNS first.

Today I swapped the order of the DNS servers in DHCP, so the secondary is primary and vice versa. I will update again once I know how that goes.

Best Answer

I would run a packet capture on a few of these boxes until it happens, and see whether anything network-related shows up. Even if the cause turns out to be something else, the packets may give you an idea of where to look.
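
If you end up with raw DHCP payloads from such a capture, the fields worth comparing are the server identifier (option 54) and the DNS server list (option 6) in each OFFER/ACK. A minimal sketch of pulling those out follows. It assumes you already have the UDP payload bytes of a DHCP message, it ignores option overload (option 52), and the function names are my own, not from any library.

```python
import socket

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the DHCP options
BOOTP_HEADER_LEN = 236              # fixed BOOTP header before the cookie


def parse_dhcp_options(payload):
    """Return {option_code: raw_bytes} for one DHCP message payload."""
    start = BOOTP_HEADER_LEN
    if payload[start:start + 4] != MAGIC_COOKIE:
        raise ValueError("not a DHCP message (magic cookie missing)")
    options = {}
    i = start + 4
    while i < len(payload):
        code = payload[i]
        if code == 255:              # End option
            break
        if code == 0:                # Pad option, single byte
            i += 1
            continue
        if i + 1 >= len(payload):    # truncated option, stop parsing
            break
        length = payload[i + 1]
        options[code] = payload[i + 2:i + 2 + length]
        i += 2 + length
    return options


def ipv4_list(raw):
    """Split a byte string into dotted-quad addresses, 4 bytes each."""
    usable = len(raw) - len(raw) % 4
    return [socket.inet_ntoa(raw[j:j + 4]) for j in range(0, usable, 4)]


def summarize(payload):
    """Pick out the fields relevant to this problem."""
    opts = parse_dhcp_options(payload)
    return {
        "message_type": opts.get(53, b"\x00")[0],  # 2 = OFFER, 5 = ACK
        "server_id": ipv4_list(opts.get(54, b"")),
        "dns_servers": ipv4_list(opts.get(6, b"")),
    }
```

An ACK whose server_id is not your DC, or whose dns_servers list contains the gateway, would point to a second device answering DHCP.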

Can a group policy in Windows set the DNS server? Maybe a strange GP has somehow been applied on the domain?

Update:
I have never done this, but since it seems like you are getting a little desperate, what about blowing away the current DHCP database? These instructions say how to back up the mdb file, so maybe moving it somewhere else will make DHCP create a new one after restarting. That might fix the problem...

The thing that doesn't jibe in my mind :-) is why clients would be getting new information if their lease hasn't expired yet and they haven't rebooted... is this what is happening?
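
For reference on that last point: under the RFC 2131 defaults, a client normally tries to renew well before the lease expires, at T1 (50% of the lease) and again at T2 (87.5%). The sketch below just works out those times for the 8-day lease mentioned in the question. It assumes the server does not override the timers with options 58/59, which I can't tell from here.

```python
from datetime import timedelta

# RFC 2131 default fractions of the lease time.
T1_FRACTION = 0.5     # client starts renewing with the original server
T2_FRACTION = 0.875   # client starts rebinding with any server


def renewal_timers(lease):
    """Return (T1, T2) as offsets from the moment the lease was obtained."""
    return lease * T1_FRACTION, lease * T2_FRACTION


t1, t2 = renewal_timers(timedelta(days=8))
print("T1 (renew): ", t1)
print("T2 (rebind):", t2)
```

So with an 8-day lease, a client that stays up would first contact the server again after 4 days, not after 8.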