I have a problem with an EC2 instance. In the past, the instance would suddenly become unreachable over SSH and HTTP (Apache and the other services appeared dead), and it did not even respond to ping. However, the AWS console showed the instance as running and connected, without any problems. I never found out why this was happening, and the only fix I found was to stop the instance from the AWS console and start it again a few minutes later. After that, I could connect to the instance as if nothing had happened.

To avoid these problems, I created a new instance from scratch and migrated all my projects to it. At first everything worked fine. Then I started getting MySQL errors, and after some research I found that they occurred because the instance had no swap space. I configured swap manually, which fixed the MySQL errors, but this morning the new instance failed in the same way as the original one: it was completely unreachable while the AWS console showed no problems. I stopped it for a few minutes and started it again, and that solved the problem.

The commands I ran to configure the swap were:

dd if=/dev/zero of=/swapfile bs=1M count=1024
mkswap /swapfile
swapon /swapfile

After that, I added this line to /etc/fstab:

/swapfile swap swap defaults 0 0

Finally, I ran these commands:

swapon -s
free -m
swapoff -a
swapon -a

Was this problem caused by the swap changes? It had never appeared on this new instance before. What should I do to solve these problems?

Best Answer

Given that your instance has no swap by default, I'm going to assume it is of type 't1.micro'. Micro instances can burst up to 2 ECUs (EC2 Compute Units) for very short periods, but are allocated a low average amount of CPU. After bursting to high CPU usage for a short period, the instance's CPU is throttled to a low baseline level for a time afterward. Depending on your workload during this period, it's not unusual for the instance to become unreachable. Given time, the instance should become reachable again. As you've observed, stopping and starting the instance can also help.
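
One way to look for this while the instance is still reachable is to watch CPU steal time, which is where hypervisor throttling is commonly reported to show up on a Linux guest (the `%st` column in top). The following is a minimal sketch, assuming a Linux kernel that exposes the steal counter as the eighth value on the aggregate `cpu` line of /proc/stat; `steal_percent` is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: given two samples of the aggregate "cpu" line from
# /proc/stat, print the percentage of CPU time stolen by the hypervisor
# between them.
steal_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Fields 2..9 are user, nice, system, idle, iowait, irq,
            # softirq and steal (guest time is already counted in user).
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            steal = $9
            if (NR == 1) { t0 = total; s0 = steal }
            else         { dt = total - t0; ds = steal - s0 }
        }
        END {
            if (dt > 0) printf "%.1f\n", 100 * ds / dt
            else print "0.0"
        }'
}

# Take two samples one second apart and report the steal share.
first=$(grep '^cpu ' /proc/stat)
sleep 1
second=$(grep '^cpu ' /proc/stat)
echo "steal: $(steal_percent "$first" "$second")%"
```

A steal figure that stays high while the instance feels sluggish would be consistent with throttling; a figure near zero suggests looking elsewhere, for example at memory pressure.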

More information on t1.micro instance resources is available here:

I suggest changing your instance type to m1.small for a while to verify this diagnosis. If that does turn out to be the issue, you will need to either upgrade your instance type permanently or consider running a more lightweight stack, such as lighttpd and SQLite.
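
If you prefer the command line to the console, the instance type change can be sketched with the AWS CLI as below. This assumes an EBS-backed instance (which yours must be, since you can stop and start it) and a configured `aws` command. `resize_instance` and the `AWS_CMD` variable are hypothetical names introduced for illustration, and the instance ID in the usage line is a placeholder.

```shell
#!/bin/sh
# AWS_CMD is a hypothetical override hook; it defaults to the real AWS CLI.
AWS_CMD="${AWS_CMD:-aws}"

# Hypothetical helper: stop an instance, change its type, start it again.
# Usage: resize_instance <instance-id> <new-instance-type>
resize_instance() {
    instance_id=$1
    new_type=$2
    # The instance type can only be changed while the instance is stopped.
    $AWS_CMD ec2 stop-instances --instance-ids "$instance_id" &&
    $AWS_CMD ec2 wait instance-stopped --instance-ids "$instance_id" &&
    $AWS_CMD ec2 modify-instance-attribute --instance-id "$instance_id" \
        --instance-type "{\"Value\": \"$new_type\"}" &&
    $AWS_CMD ec2 start-instances --instance-ids "$instance_id"
}

# Example (placeholder instance ID):
# resize_instance i-0123456789abcdef0 m1.small
```

Bear in mind that a stop and start can give the instance a new public IP address unless an Elastic IP is attached.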

