We set up a new server here a few weeks ago that I am informally responsible for managing.
Almost everything works perfectly except for one thing: Every so often it hangs without warning.
Some facts about this hang:
- It is not a single application or service; the entire system is non-responsive.
- Nothing is displayed (monitor acts as though there's no VGA signal).
- The power LED is on and the fans are running.
- Pressing the power button does nothing (normally it would shut the machine down).
- Pings generally time out; once it did respond, another time I got "destination host unreachable".
- Event logs show nothing (literally nothing at all) from before the hang until the hard reboot.
- There are no performance problems, strange errors, or other obvious signs of impending doom leading up to the eventual hang.
- The machine is generally not heavily loaded (it's for development, not production), and the hangs appear to be occurring at non-peak times of day (between midnight and 6 AM).
Some additional facts about the machine/environment:
- Windows Server 2008 R2
- Running SQL Server 2008 and IIS (not much else)
- All drivers up to date, patches installed, etc.
- No vendor-supplied diagnostics (not "top tier").
- The machine is completely new, not merely reformatted or repurposed. No recent changes although the machine is less than a month old to start with.
I don't expect any easy answers here. What I'd like to know is how I can methodically determine the root cause of this problem, be it a misbehaving service, defective hardware, or something else.
Is there any kind of logging I can set up that will help me get to the bottom of this? Any hardware diagnostics or remote monitoring? Anything else I can do to help me discover what's actually happening, or at least be able to eliminate what isn't wrong?
Just to reiterate, I really don't want to start speculating about possible causes and take a trial-and-error approach, because it's going to be at least several days at a time before I would have conclusive results. I'm looking for solutions to reliably trace the problem to its source.
Best Answer
A good place to start:
http://blogs.technet.com/b/askperf/archive/2007/09/25/troubleshooting-server-hangs-part-one.aspx
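Since the event logs show nothing before the hang, it may also help to pin down exactly when the machine stops responding. Below is a minimal sketch of a heartbeat logger, assuming Python is available on the server; the file name `heartbeat.log`, the 60-second interval, and the function names are all hypothetical choices for illustration, not part of any Windows tooling. It appends a UTC timestamp to a file and forces it to disk, so after a hard reboot the last line gives a rough lower bound on when the hang occurred.

```python
import os
import time
from datetime import datetime, timezone

HEARTBEAT_FILE = "heartbeat.log"  # hypothetical path; pick any local disk location
INTERVAL_SECONDS = 60             # hypothetical interval; shorter gives a tighter bound


def write_heartbeat(path, now=None):
    """Append one UTC timestamp to the file and force it to disk."""
    now = now or datetime.now(timezone.utc)
    with open(path, "a", encoding="utf-8") as f:
        f.write(now.isoformat() + "\n")
        f.flush()
        # fsync asks the OS to write through its cache, so the entry
        # is more likely to survive a hard power cycle.
        os.fsync(f.fileno())


def last_heartbeat(path):
    """Return the last recorded timestamp, or None if nothing was logged."""
    try:
        with open(path, encoding="utf-8") as f:
            lines = [line.strip() for line in f if line.strip()]
    except FileNotFoundError:
        return None
    return datetime.fromisoformat(lines[-1]) if lines else None


def run_forever(path=HEARTBEAT_FILE, interval=INTERVAL_SECONDS):
    """Write a heartbeat every `interval` seconds until the process dies."""
    while True:
        write_heartbeat(path)
        time.sleep(interval)
```

Calling `run_forever()` from a script started at boot, and checking `last_heartbeat("heartbeat.log")` after the next hard reboot, should show whether the hangs cluster around a particular time (for example a scheduled job in the midnight to 6 AM window). This only narrows down the timing; it does not identify the cause by itself.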