For the last few days, my entire server has become unresponsive every 10-15 minutes, shutting out all TCP connections for about 3 minutes.
I finally found that the connections are shut out because all 16 cores spike to a solid, stable 100% CPU for the duration of those 3 minutes.
I am actively trying to find out what is maxing out the CPU. However, because everything on the server freezes completely (even the console), I can't check fast enough to see what it is.
This is obviously a big problem and I need to get it handled right away. Is there some way to log this CPU spike and identify which process is responsible?
Best Answer
The only answer I can currently think of is a little bit hacky, but it might get you there. The first step is to capture the process causing the issue. Schedule something like this to run every minute in a command window:
Alternatively, schedule it as a task and change the command line to append its output to a file with >>.
That will give you the CPU usage of all running processes. Once you know which process is responsible, you could use a tool like ProcDump (http://technet.microsoft.com/en-us/sysinternals/dd996900) to monitor it and write a dump when its CPU usage crosses a threshold you choose.
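As a rough illustration of the "log per-process CPU every minute" idea, here is a minimal Python sketch. It is not necessarily the command originally suggested: it assumes the built-in Windows `typeperf` tool with the `\Process(*)\% Processor Time` counter, and the log file name `cpu_log.txt` and the helper names are made up for the example. It takes one sample, picks the busiest processes, and appends them to the log with a timestamp.

```python
import csv
import io
import re
import shutil
import subprocess
from datetime import datetime

# Assumption: per-process CPU counter as exposed by Windows Performance Counters.
COUNTER = r"\Process(*)\% Processor Time"
LOG_PATH = "cpu_log.txt"  # hypothetical log file name


def parse_typeperf(output):
    """Return {process instance: percent} from one typeperf CSV sample."""
    # Keep only rows with more than one field; blank lines are dropped.
    rows = [r for r in csv.reader(io.StringIO(output)) if len(r) > 1]
    if len(rows) < 2:
        return {}
    header, values = rows[0], rows[1]
    usage = {}
    # The first column is the timestamp, so skip it on both rows.
    for column, value in zip(header[1:], values[1:]):
        match = re.search(r"\\Process\((.+)\)\\", column)
        if not match:
            continue
        try:
            usage[match.group(1)] = float(value)
        except ValueError:
            continue  # non-numeric cell, e.g. a trailing status line
    return usage


def top_consumers(usage, n=5):
    """Return the n busiest processes, ignoring the Idle and _Total instances."""
    busy = {k: v for k, v in usage.items() if k not in ("_Total", "Idle")}
    return sorted(busy.items(), key=lambda kv: kv[1], reverse=True)[:n]


def log_sample(path=LOG_PATH):
    """Take one sample with typeperf and append the top consumers to the log."""
    if shutil.which("typeperf") is None:
        print("typeperf not found; this sketch assumes a Windows host")
        return
    result = subprocess.run(
        ["typeperf", COUNTER, "-sc", "1"],  # -sc 1: collect a single sample
        capture_output=True,
        text=True,
    )
    stamp = datetime.now().isoformat(timespec="seconds")
    with open(path, "a", encoding="utf-8") as log:
        for name, percent in top_consumers(parse_typeperf(result.stdout)):
            # Values are summed across cores, so they may exceed 100.
            log.write(f"{stamp}\t{name}\t{percent:.1f}\n")


if __name__ == "__main__":
    log_sample()
```

Run it every minute from Task Scheduler. After the next freeze, the entries around the spike should show which process was at the top, and that is the one to point ProcDump at.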
Hope this helps some.