Linux server stops responding to TCP connections after running for some time. How to analyse?

linux networking tcp ubuntu

My Ubuntu 11.04 server on the internet has been behaving strangely for a few days. It runs some Java web applications perfectly well, then suddenly stops accepting connections. When I try to connect over SSH or HTTP, I get no response until the connection times out. Ping still works perfectly, and nmap also works:

Starting Nmap 5.21 ( http://nmap.org ) at 2011-08-29 10:52 CEST
Nmap scan report for ...
Host is up (0.020s latency).
Not shown: 994 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
25/tcp   open  smtp
53/tcp   open  domain
443/tcp  open  https
3000/tcp open  ppp
3128/tcp open  squid-http

After a reboot, everything works again for a few hours.

What could this be, and how can I analyse the problem?

Best Answer

This really does look like you are running out of memory, with no swap on the system. If a Linux system runs out of memory, it can no longer accept TCP connections, because establishing a connection requires memory. ICMP might not need any, since there is no state to maintain.
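To check this hypothesis while the server is still reachable, you could look at the memory and swap counters the kernel exposes. The following is only a minimal sketch: it assumes a standard /proc/meminfo, and meminfo_field is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Print one value (in kB) from a meminfo-style file.
# meminfo_field is a hypothetical helper, not a standard command.
# $1 = field name (e.g. MemTotal), $2 = optional file (default: /proc/meminfo)
meminfo_field() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# A SwapTotal of 0 means no swap is active on the system.
swap_total_kb=$(meminfo_field SwapTotal)
if [ "${swap_total_kb:-0}" -eq 0 ]; then
    echo "no swap configured"
else
    echo "swap configured: ${swap_total_kb} kB"
fi

echo "MemTotal: $(meminfo_field MemTotal) kB, MemFree: $(meminfo_field MemFree) kB"
```

If it reports no swap and MemFree keeps shrinking over the hours before a hang, that would be consistent with the explanation above.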

Check your memory settings everywhere, and make sure you do not allocate more than 70% of the total memory to the JVM (-Xms and -Xmx options).
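As a rough illustration of the 70% rule, the sketch below derives an upper bound for the heap from MemTotal and lists the running Java processes so you can compare their -Xms/-Xmx options against it. heap_cap_mb is a hypothetical helper name, and if you run several JVMs the assumption here is that their combined heaps should stay under the bound.

```shell
#!/bin/sh
# 70% of the given memory size, converted from kB to MB (integer arithmetic).
# heap_cap_mb is a hypothetical helper; $1 = total memory in kB.
heap_cap_mb() {
    echo $(( $1 * 70 / 100 / 1024 ))
}

# Read the total physical memory from the kernel (falls back to 0 if unavailable).
total_kb=$(awk '$1 == "MemTotal:" { print $2 }' /proc/meminfo)
echo "Upper bound for all JVM heaps combined: $(heap_cap_mb "${total_kb:-0}") MB"

# Show PID, resident memory (kB) and command line of running Java processes,
# so the -Xms/-Xmx values actually in use can be compared with the bound.
ps -eo pid,rss,args | grep '[j]ava' || echo "no java process found"
```

Keep in mind that a JVM uses more than its heap (thread stacks, native buffers, class metadata), so the resident size shown by ps can exceed -Xmx.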

Activate swap if you have not done so already. You can create a basic swap file somewhere on the disk:

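# Run as root: create a 10 GiB file, restrict it to root, then format and enable it as swap
dd if=/dev/zero of=/mnt/swapfile bs=1M count=10240
chmod 600 /mnt/swapfile
mkswap /mnt/swapfile
swapon /mnt/swapfile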

If your system still hangs after that, it's time for some low-level monitoring.
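One possible starting point for that monitoring is to record a snapshot of memory and TCP socket counters at regular intervals, so the last entries before a hang are still there after the reboot. This is a sketch under assumptions: snapshot and the LOG_FILE variable are hypothetical names, and the counters come from /proc/meminfo and /proc/net/sockstat.

```shell
#!/bin/sh
# Print one timestamped line with free memory, free swap and TCP socket counters.
# snapshot is a hypothetical helper.
# $1 = meminfo file (default: /proc/meminfo), $2 = sockstat file (default: /proc/net/sockstat)
snapshot() {
    mem=$(awk '$1 == "MemFree:" || $1 == "SwapFree:" { printf "%s%s kB ", $1, $2 }' "${1:-/proc/meminfo}")
    tcp=$(awk '$1 == "TCP:" { print }' "${2:-/proc/net/sockstat}")
    echo "$(date '+%Y-%m-%d %H:%M:%S') ${mem}| ${tcp}"
}

# Append one snapshot per run. Set LOG_FILE to a location that survives a reboot;
# the default below is only a placeholder.
snapshot >> "${LOG_FILE:-${HOME:-/tmp}/memsnap.log}"
```

Run it every minute from cron or a similar scheduler. After the next hang and reboot, the final lines of the log should show whether free memory and swap were close to zero and how many TCP sockets were allocated at the time.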