Linux – how to know if the server runs out of RAM before crashing down

centoslinuxmemoryramdisk

I have a server that keeps on crashing.
I know there are several causes for a server to crashes down.
But if the cause is that the system is running out of RAM before it crash down;
how should I confirm that is cause? What log files should I look? And what line/error mes should I look for?
I am running CentOS. With heavy usage of php parsing xml files over 2 gigabytes at most.
The server has 16GB RAM.

EDIT 1

[root@61540 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         16035       1526      14509          0         40       1002
-/+ buffers/cache:        483      15552
Swap:         8197          0       8197

EDIT 2
/var/log/messages

Feb 17 20:38:26 61540 syslogd 1.4.1: restart.
Feb 17 20:38:26 61540 proftpd[3896]: 66.90.101.85 - received SIGHUP -- master server reparsing configuration file
Feb 17 22:23:06 61540 avahi-daemon[3984]: recvmsg(): Resource temporarily unavailable
Feb 17 23:07:37 61540 proftpd[10620] - (Several lines of ftp session)
Feb 18 23:03:48 61540 syslogd 1.4.1: restart.
Feb 18 23:03:48 61540 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Feb 18 23:03:48 61540 kernel: Linux version 2.6.18-308.el5 (mockbuild@builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-52)) #1 SMP Tue Feb 21 20:06:06 EST 2012
Feb 18 23:03:48 61540 kernel: Command line: ro root=LABEL=/
Feb 18 23:03:48 61540 kernel: BIOS-provided physical RAM map:
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 0000000000010000 - 000000000009a000 (usable)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 0000000000100000 - 00000000cfda0000 (usable)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 00000000cfda0000 - 00000000cfdd1000 (ACPI NVS)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 00000000cfdd1000 - 00000000cfe00000 (ACPI data)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 00000000cfe00000 - 00000000cff00000 (reserved)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
Feb 18 23:03:48 61540 kernel:  BIOS-e820: 0000000100000000 - 000000042f000000 (usable)
Feb 18 23:03:48 61540 kernel: DMI 2.4 present.
Feb 18 23:03:48 61540 kernel: No NUMA configuration found
Feb 18 23:03:48 61540 kernel: Faking a node at 0000000000000000-000000042f000000
Feb 18 23:03:48 61540 kernel: Bootmem setup node 0 0000000000000000-000000042f000000
Feb 18 23:03:48 61540 kernel: Memory for crash kernel (0x0 to 0x0) notwithin permissible range
Feb 18 23:03:48 61540 kernel: disabling kdump
Feb 18 23:03:48 61540 kernel: ACPI: PM-Timer IO Port: 0x808
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Feb 18 23:03:48 61540 kernel: Processor #0 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Feb 18 23:03:48 61540 kernel: Processor #1 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
Feb 18 23:03:48 61540 kernel: Processor #2 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
Feb 18 23:03:48 61540 kernel: Processor #3 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] enabled)
Feb 18 23:03:48 61540 kernel: Processor #4 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] enabled)
Feb 18 23:03:48 61540 kernel: Processor #5 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] enabled)
Feb 18 23:03:48 61540 kernel: Processor #6 5:1 APIC version 16
Feb 18 23:03:48 61540 kernel: ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] enabled)
Feb 18 23:03:48 61540 kernel: Processor #7 5:1 APIC version 16

Best Answer

You should check /var/log/messages The dmesg command will not be useful in this case because it only shows you the kernel messages since last boot.

"Running out of memory" is not usually enough to completely crash Linux. Linux will start killing processes when it runs out of memory (OOM killer). So you would probably look for some kernel panic. If you're using less to read the logs, you can search pressing the / key.

But the bottom line is: you should first read /var/log/messages. It is ordered by time, so it's easy to find the moment when the server last booted. Check what happened before that, which caused your server to crash.

Related Topic