All of a sudden can’t connect to AWS EC2 instance (Server unexpectedly closed network connection)

amazon-ec2, amazon-web-services

Everything was working great with my instance, and all of a sudden I can no longer connect through SSH or SFTP. The error is "Server unexpectedly closed network connection".

My colleagues have the same issue!

Everything else seems to work normally (the web server works fine).

I launched another instance with the "Launch more like this" option, and everything works fine there. From my research, the problem might be related to a chmod on a key file, but I don't know how to check that.

I also tried to connect from another IP and another computer, with the same result. There is nothing notable in the logs (it looks like nothing has been logged since the day the instance started).

What can I do?

Thanks in advance

Best Answer

Unfortunately I cannot comment with my rep so I'll post this as an answer.

I personally would take the same approach as @prateek61.

It's very difficult to diagnose this issue without being able to log in to your server. Linux is very good at keeping people out once some configuration has changed or a protection has been triggered.

Since AWS gives you no console access, this is how I would investigate, assuming that you cannot shut down the server:

If you can shut down the server, you can jump to step 3 and attach the original volume itself; however, I assumed that this is prod and you can't take it down.

  1. Create a snapshot of the volume.
  2. Create a volume from the snapshot you have just taken.
  3. Attach and mount the new volume on a working instance that you can SSH into.
  4. Once mounted, go to the volume and investigate:
    • the file permissions of the SSH keys
    • the keys themselves: are they correct?
    • the sshd configuration, sudoers, etc.
    • the log files
    • whether a firewall is running on that server (because this is a mounted volume you can't check this directly, but you can look in /etc/rc3.d to see whether the symlink to your firewall is there)
    • whether fail2ban or similar software is running and blocking you from the server; again, check what starts with the system in /etc/rc3.d.
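The permission and startup checks in step 4 can be partly automated. Below is a minimal Python sketch, not a definitive tool. It assumes the rescued volume is mounted at a path you pass in, that user homes live under home/ and root/ on that volume, and that the host keys follow the usual etc/ssh/ssh_host_*_key naming. The function names and the keyword list are made up for illustration. It flags private host keys that are accessible to group or others, and home directories, .ssh directories, and authorized_keys files that are writable by group or others, which sshd typically rejects when StrictModes is on. It also lists the start links in etc/rc3.d whose names look like a firewall or fail2ban.

```python
import stat
from pathlib import Path


def find_loose_permissions(mount_point):
    """Return (path, mode) pairs under the mounted volume that sshd would likely reject."""
    root = Path(mount_point)
    findings = []

    def check(path, forbidden_bits):
        # Report the path if it exists and has any of the forbidden bits set.
        if path.exists():
            mode = stat.S_IMODE(path.stat().st_mode)
            if mode & forbidden_bits:
                findings.append((str(path), oct(mode)))

    # Private host keys should not be accessible to group or others.
    for key in sorted((root / "etc/ssh").glob("ssh_host_*_key")):
        check(key, 0o077)

    # With StrictModes, the home directory, ~/.ssh and authorized_keys
    # must not be writable by group or others.
    homes = sorted((root / "home").glob("*")) + [root / "root"]
    for home in homes:
        check(home, 0o022)
        check(home / ".ssh", 0o022)
        check(home / ".ssh" / "authorized_keys", 0o022)

    return findings


def startup_links(mount_point, keywords=("iptables", "ufw", "firewalld", "fail2ban")):
    """Return names of start links (S*) in etc/rc3.d that match a keyword.

    The keyword list is an assumption; adjust it to the software you use.
    """
    rc_dir = Path(mount_point) / "etc/rc3.d"
    return sorted(
        link.name
        for link in rc_dir.glob("S*")
        if any(keyword in link.name.lower() for keyword in keywords)
    )
```

For example, with the volume mounted at a hypothetical /mnt/rescue, call find_loose_permissions("/mnt/rescue") and startup_links("/mnt/rescue") and review whatever they return. File ownership is not checked here, and absolute symlinks on the volume would resolve against the rescue instance, so treat the result as a starting point rather than a verdict.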

After you find what is wrong, you will need to figure out how to apply the fix to your running server, e.g. swap volumes, create a new instance, or repoint the traffic. This depends on the purpose of the server and how much downtime you can afford, and is a matter for another question.

Once again, I'm posting this as an answer because I was not able to comment. If you want to downvote this, please be kind enough to provide a reason.
