Windows 2003 Terminal Server not responding after reboots

terminal-server windows-server-2003

We have 5 Windows 2003 R2 SP2 Std x64 terminal servers that are set to reboot each night, all within 45 minutes of each other. Frequently at least one of them will not respond to RDP requests after the reboot. If I connect to the console I can log in just fine, and netstat shows TS listening on 3389. The only way I am able to get them to respond again is to reboot manually.

All terminal servers show the following errors in the event log after reboots (however, not all of them stop responding; most work fine post reboot):

Event ID 5719 - Error - Netlogon - This computer was not able to set up a secure session with a domain controller in domain DOMAIN due to the following: There are no logon servers available.

Event ID 4321 - Error - NetBT - The name "DOMAIN :1d" could not be registered on the interface with IP address [IP address]. The machine with the IP address [IP address of domain controller] did not allow the name to be claimed by this machine.

However, those events show up on the machines that reboot successfully as well. Can someone please assist me with troubleshooting this problem? Like I said, it doesn't happen every time or on every server; only sometimes, on one or two servers. Very frustrating.

Thanks for any assistance!

Best Answer

Sounds like a problem with TS services on the impacted servers. Maybe they're hung, or waiting on a response from the DC that got lost or garbled on the network, or failed to start correctly when the OS booted, etc.

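  1. First thing I'd do is set the TS services to a delayed startup, in case it's an OS or machine boot issue. That makes the service start after most everything else, so its dependencies should be fully started and it won't conflict with whatever else is starting at the same time. (Note that the built-in "Automatic (Delayed Start)" startup type only arrived with Vista/Server 2008, so on 2003 you'd have to get the delay some other way, e.g. a service dependency.)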
  2. Failing that, I'd use a scheduled task to restart the service a couple minutes after the OS boots up. (Would take a little bit of guesswork to schedule it right, based on reboot time, machine boot speed and OS load speed.)
  3. Investigate the NICs on the machines. Is it possible that outdated drivers or firmware are conflicting from time to time with updated software (like Windows Updates and any other patches you've [hopefully] applied)?
  4. Failing that (and maybe anyway, to try to resolve the root cause rather than just alleviate the symptom), I'd do a reinstall (uninstall, install fresh) of Terminal Services on the impacted servers. I've had this kind of issue, minus Event ID 4321, and a reinstall usually resolves it, at least when the problem is with the TS services on the server and not caused by a networking or domain controller issue.
  5. (Maybe do this before #4) Troubleshoot this from the Domain Controller. There is a reason the event log is telling you the server can't contact a logon server and that the Domain Controller isn't allowing the NetBIOS name to be registered on the indicated interface. This can be caused by domain or Domain Controller settings. Look on the DC to see if there are any indications of that. (Don't forget to look for GPO settings, startup scripts and the like too.)
  6. (Maybe do this before #4 too) Troubleshoot this from a networking perspective. Is it possible the network is occasionally mangling the traffic between these servers and the DC, causing the authentication and name registration problems you're seeing in the server event logs?
  7. (Maybe do this before anything) Try to convince your bosses (or whoever does "control" the nightly reboots) that the nightly reboots are what's causing this, and/or that this is "expected behavior" when engaging in the dumbassed practice of nightly server reboots. Or, if you do fix it, that the fix will stop working unless the reboots stop or decrease in frequency. You'll get the added benefit of not having to replace your servers in a couple of years after the added stress of booting causes a hardware fault. :/
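
While working through the list above, it helps to know each morning which servers actually came back, without waiting for users to complain. Since the port can be listening while RDP still doesn't answer, a plain TCP connect isn't enough. Here is a rough Python sketch (not part of the original answer) that opens port 3389 and sends what I understand to be a minimal X.224 connection request, then checks whether the reply looks like a connection confirm. The byte values, the function names and the hostnames in the comment are my own assumptions for illustration, so treat it as a starting point rather than a finished monitor.

```python
import socket

# Assumed minimal TPKT + X.224 Connection Request (no RDP negotiation data).
# TPKT: version 3, reserved 0, total length 11; X.224: length 6, type 0xE0 (CR).
X224_CONNECTION_REQUEST = bytes(
    [0x03, 0x00, 0x00, 0x0B, 0x06, 0xE0, 0x00, 0x00, 0x00, 0x00, 0x00]
)
X224_CONNECTION_CONFIRM = 0xD0  # expected PDU type in the server's reply


def probe_rdp(host, port=3389, timeout=5.0):
    """Return 'ok', 'no-tcp' or 'no-rdp-reply' for one server."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, DNS failure or connect timeout.
        return "no-tcp"
    with sock:
        try:
            sock.sendall(X224_CONNECTION_REQUEST)
            reply = sock.recv(64)
        except OSError:
            # Port accepted the connection but never answered in time.
            return "no-rdp-reply"
    # Byte 0 is the TPKT version; the high nibble of byte 5 is the PDU type.
    if len(reply) >= 6 and reply[0] == 0x03 and (reply[5] & 0xF0) == X224_CONNECTION_CONFIRM:
        return "ok"
    return "no-rdp-reply"


def report(hosts, port=3389, timeout=5.0):
    """Probe every host and return a {host: status} dictionary."""
    return {host: probe_rdp(host, port, timeout) for host in hosts}


# Example (hypothetical hostnames):
#   for host, status in report(["ts01", "ts02", "ts03", "ts04", "ts05"]).items():
#       print(host, status)
```

Run it from another machine a few minutes after the reboot window. A server that reports `no-rdp-reply` matches the symptom in the question (listening, but not answering), while `no-tcp` points more toward the boot or network side of things.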