Linux – Determining Cause of System Reboot

fedora linux

I have a Fedora 13 virtual machine running on top of VMware's Hypervisor. Around 6am, it mysteriously rebooted, screwing up some long-running data import processes I was running. I've been checking the logs, but I'm having a hard time finding out why it rebooted. What's the best approach to find out the cause of a reboot?

The machine is in a locked server rack, so it's unlikely someone manually rebooted the hardware. I'm the only one with SSH access to the machine, so it's unlikely someone rebooted the VM remotely. And there are other VMs running on Hypervisor that weren't rebooted, so it's unlikely to be caused by a power failure or reboot of the entire hardware.

Running who -b tells me when the reboot occurred:

~$ who -b
system boot  2011-12-22 06:02

Running crontab -l shows no cron jobs that would trigger a reboot.

Reviewing the historical resource usage graphs in Hypervisor's vSphere client shows that the machine had at most 5% CPU usage for several hours before the reboot, so it wasn't under any unusual load.

Unfortunately, checking /var/log/messages around the time of the reboot only shows:

Dec 22 03:50:01 myserver pcscd: winscard.c:309:SCardConnect() Reader E-Gate 0 0 Not Found
Dec 22 03:50:01 myserver pcscd: winscard.c:309:SCardConnect() Reader E-Gate 0 0 Not Found
Dec 22 03:50:01 myserver pcscd: winscard.c:309:SCardConnect() Reader E-Gate 0 0 Not Found
Dec 22 03:50:01 myserver pcscd: winscard.c:309:SCardConnect() Reader E-Gate 0 0 Not Found
Dec 22 06:02:38 myserver kernel: imklog 4.4.2, log source = /proc/kmsg started.
Dec 22 06:02:38 myserver rsyslogd: [origin software="rsyslogd" swVersion="4.4.2" x-pid="1138" x-info="http://www.rsyslog.com"] (re)start
Dec 22 06:02:38 myserver kernel: Initializing cgroup subsys cpuset
Dec 22 06:02:38 myserver kernel: Initializing cgroup subsys cpu
Dec 22 06:02:38 myserver kernel: Linux version 2.6.34.7-56.fc13.x86_64 (mockbuild@x86-03.phx2.fedoraproject.org) (gcc version 4.4.4 20100630 (Red Hat 4.4.4-10) (GCC) ) #1 SMP Wed Sep 15 03:36:55 UTC 2010
Dec 22 06:02:38 myserver kernel: Command line: ro root=/dev/mapper/VolGroup-lv_root rd_LVM_LV=VolGroup/lv_root rd_LVM_LV=VolGroup/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rhgb quiet
Dec 22 06:02:38 myserver kernel: BIOS-provided physical RAM map:

So basically, nothing happened for about 2 hours, and then the machine suddenly rebooted.

Does this mean the kernel crashed? How would I confirm this? Are there any other logs I should be looking at?

Best Answer

If this was a kernel issue and you have kernel dumps set up, then there should be a dump file somewhere. Of course, this needed to be set up before the crash! You're probably aware by now that ESX/i is structurally similar to Linux, so its log files will be in roughly the same locations. A good overview is here: http://www.vmwarewolf.com/which-esx-log-file/ There are also various methods of parsing and viewing ESX/i log files: http://www.simonlong.co.uk/blog/2010/06/03/vmware-esxi-4-log-files/
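Before digging through dumps or ESX logs, it may help to check from inside the guest whether the shutdown was orderly at all. Below is a rough sketch, not a definitive tool. The function names classify_last_boot and list_vmcores are made up for illustration. It assumes rsyslogd writes an "exiting on signal" line on a clean stop and an "imklog ... started" line at boot (as in the log excerpt in the question), and that kdump, if it was configured, writes its vmcore files under a directory such as /var/crash. Other messages can land between the two markers, so treat "unclean" as a hint rather than proof.

```shell
#!/bin/sh

# Classify the most recent boot recorded in a syslog file.
# Prints "clean" if the line right before the boot marker is a syslog
# shutdown marker, "unclean" if it is anything else, and "no-boot" if the
# file contains no boot marker.
# Assumption: the patterns match rsyslogd's messages; adjust for other daemons.
classify_last_boot() {
    awk '
        /imklog .* started/ {
            result = (prev ~ /exiting on signal/) ? "clean" : "unclean"
        }
        { prev = $0 }
        END { print (result == "" ? "no-boot" : result) }
    ' "$1"
}

# List kernel crash dumps below a directory (hypothetical helper).
# Assumption: kdump was set up before the crash and names its dumps vmcore*.
list_vmcores() {
    [ -d "$1" ] || return 0
    find "$1" -type f -name 'vmcore*'
}
```

You would run these as classify_last_boot /var/log/messages and list_vmcores /var/crash. Running last -x is also worth a look, since it lists shutdown and runlevel entries alongside reboots. An "unclean" result with no shutdown entry and no vmcore points away from an orderly reboot and towards a kernel crash or a reset from the hypervisor side, which is where the ESX/i logs come in.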