Ssh – How to debug SSH “connection timeout”

ssh tcpdump timeout

I sometimes get the following error when connecting from host1 to host2:

Errno::ETIMEDOUT: Connection timed out - connect(2)

TCP dump on host1 while trying to connect (tcpdump -vv -i eth0 -s 0 'port 22 and host host2'):

19:13:47.510774 IP (tos 0x0, ttl 64, id 44238, offset 0, flags [DF], proto TCP (6), length 60)
    host1.50274 > host2.ssh: Flags [S], cksum 0x1409 (correct), seq 2693070134, win 5840, options [mss 1460,sackOK,TS val 867914232 ecr 0,nop,wscale 7], length 0
19:13:50.508713 IP (tos 0x0, ttl 64, id 44239, offset 0, flags [DF], proto TCP (6), length 60)
    host1.50274 > host2.ssh: Flags [S], cksum 0x111b (correct), seq 2693070134, win 5840, options [mss 1460,sackOK,TS val 867914982 ecr 0,nop,wscale 7], length 0
19:13:56.508707 IP (tos 0x0, ttl 64, id 44240, offset 0, flags [DF], proto TCP (6), length 60)
    host1.50274 > host2.ssh: Flags [S], cksum 0x0b3f (correct), seq 2693070134, win 5840, options [mss 1460,sackOK,TS val 867916482 ecr 0,nop,wscale 7], length 0

On host2 at the same time (tcpdump -vv -i eth0 -s 0 'port 22 and host host1'):

19:13:47.510512 IP (tos 0x0, ttl 62, id 44238, offset 0, flags [DF], proto TCP (6), length 60)
    host1.50274 > host2.ssh: Flags [S], cksum 0x1409 (correct), seq 2693070134, win 5840, options [mss 1460,sackOK,TS val 867914232 ecr 0,nop,wscale 7], length 0
19:13:50.508453 IP (tos 0x0, ttl 62, id 44239, offset 0, flags [DF], proto TCP (6), length 60)
    host1.50274 > host2.ssh: Flags [S], cksum 0x111b (correct), seq 2693070134, win 5840, options [mss 1460,sackOK,TS val 867914982 ecr 0,nop,wscale 7], length 0
19:13:56.508447 IP (tos 0x0, ttl 62, id 44240, offset 0, flags [DF], proto TCP (6), length 60)
    host1.50274 > host2.ssh: Flags [S], cksum 0x0b3f (correct), seq 2693070134, win 5840, options [mss 1460,sackOK,TS val 867916482 ecr 0,nop,wscale 7], length 0

Where could the problem be? How can I debug it?

Best Answer

I suggest testing the network with 'ping -s 1450' (-s sets the size of the probe packet; mtr does not seem to have a similar option). If the first packets of a connection get through but the later ones do not, the problem is often linked to packet size: on a troubled medium, such as an outdoor radio link under heavy rain, the loss rate can be almost zero for small packets and close to 1 for large packets. Most protocols start with small negotiation packets and then switch to large ones.
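To compare loss across packet sizes in one go, the ping test can be scripted. The following is only a minimal sketch, assuming a Linux iputils-style ping whose summary line contains "N% packet loss". The function names and the list of sizes are made up for illustration, and "host2" stands for the real target.

```python
import re
import subprocess


def build_ping_command(host, payload_size, count=5):
    """Build the argv for ping: -c sets the probe count, -s the payload size."""
    return ["ping", "-c", str(count), "-s", str(payload_size), host]


def parse_loss_percent(ping_output):
    """Return the loss percentage from ping's summary line, or None if absent.

    Assumes the iputils wording '<number>% packet loss'.
    """
    match = re.search(r"(\d+(?:\.\d+)?)% packet loss", ping_output)
    return float(match.group(1)) if match else None


def probe(host, sizes=(56, 512, 1000, 1450)):
    """Ping the host once per payload size and map each size to its loss rate."""
    results = {}
    for size in sizes:
        proc = subprocess.run(
            build_ping_command(host, size),
            capture_output=True,
            text=True,
        )
        results[size] = parse_loss_percent(proc.stdout)
    return results
```

Calling `probe("host2")` returns a dictionary keyed by payload size. If loss is near zero at 56 bytes and climbs sharply toward 1450 bytes, that points at a size-dependent problem on the path.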

I noticed that you say that HTTP is fine, which would disprove my hypothesis, but I have no other idea right now.
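Since the failure is intermittent, it may help to measure how often a plain TCP connect fails on the SSH port compared with the HTTP port. The sketch below is one possible way to do that with Python's standard socket module. The function names, attempt count and timeout are illustrative assumptions, not part of the original setup.

```python
import socket
import time


def try_connect(host, port, timeout=10.0):
    """Attempt one TCP connection; return (succeeded, seconds_elapsed)."""
    start = time.monotonic()
    try:
        # create_connection performs the full TCP handshake or raises OSError
        # (timeouts and refused connections are both OSError subclasses).
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        return False, time.monotonic() - start


def failure_rate(host, port, attempts=20, timeout=10.0):
    """Return the fraction of connection attempts that failed."""
    failures = sum(
        1 for _ in range(attempts) if not try_connect(host, port, timeout)[0]
    )
    return failures / attempts
```

Running `failure_rate("host2", 22)` and `failure_rate("host2", 80)` from host1 gives two numbers to compare. A clearly higher rate on port 22 would suggest something specific to that port (a firewall rule or the listener on host2) rather than general packet loss.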