Ubuntu – very slow connection to ssh server from client (but not other servers)


I have an Ubuntu 12.04 laptop that takes so long to connect over SSH to various servers (in different data centres) that it's a bit of a lottery whether I get a connection at all. Connections between the servers themselves are instantaneous, and I've set

UseDNS no
AddressFamily inet

on the servers I'm connecting to (and rebooted them for good measure). I also added the reverse DNS name and IP address of the cable connection I'm connecting from. If I connect from the laptop via telnet:

telnet my.server 22

Then the connection is also instantaneous, so it doesn't appear to be a problem with an intervening firewall. The behaviour is the same whether I connect with the IP, a short name in my hosts file or the FQDN. I'm on a 50 Mbps (cable, sync) connection, so bandwidth doesn't appear to be the problem, and when I do finally get a connection it's a good, quick, stable one. I have tried having sshd listen on another port (8000) and that makes no difference. Web and other connections from the laptop to the machine are also very good.

If I increase logging then I get the following before it hangs:

$ ssh -vvv flip
OpenSSH_5.9p1 Debian-5ubuntu1.1, OpenSSL 1.0.1 14 Mar 2012
debug1: Reading configuration data /home/anton/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug2: ssh_connect: needpriv 0
debug1: Connecting to flip [xxx.xxx.xxx.xxx] port 22.
debug1: Connection established.
debug3: Incorrect RSA1 identifier
debug3: Could not load "/home/anton/.ssh/id_rsa" as a RSA1 public key
debug1: identity file /home/anton/.ssh/id_rsa type 1
debug1: Checking blacklist file /usr/share/ssh/blacklist.RSA-2048
debug1: Checking blacklist file /etc/ssh/blacklist.RSA-2048
debug1: identity file /home/anton/.ssh/id_rsa-cert type -1
debug1: identity file /home/anton/.ssh/id_dsa type -1
debug1: identity file /home/anton/.ssh/id_dsa-cert type -1
debug1: identity file /home/anton/.ssh/id_ecdsa type -1
debug1: identity file /home/anton/.ssh/id_ecdsa-cert type -1

On the server, it hangs between the following two log lines:

Nov  6 13:51:57 srv sshd[18472]: Connection from XXX.XXX.XXX.XXX port 51099
Nov  6 13:53:03 srv sshd[18472]: debug1: Client protocol version 2.0; client software version OpenSSH_5.9p1 Debian-5ubuntu1.1

It's at least quicker than yesterday though!

Does anyone have any ideas here?

Best Answer

Those symptoms are what you would expect from broken path MTU (PMTU) discovery. The client can connect and exchange version information without problems because all of those packets are small.

But once the key exchange starts, larger packets are sent. If those larger packets are silently dropped by some intermediate router that does not send back the ICMP error message required by the standard ("fragmentation needed"), the sender never learns that the data has to be sent in smaller segments. Hence the connection stalls on the first large packet.
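
One way to check for this from the laptop is to send pings of increasing size with the "don't fragment" bit set and see where they stop getting through. The following is only a rough sketch, assuming Linux iputils ping (where -M do prohibits fragmentation); the function name probe_pmtu, the PING override variable and the default sizes are made up for this example. A timeout can also mean that ICMP is filtered or that a packet was simply lost, so treat the result as a hint rather than proof.

```shell
#!/bin/sh
# Sketch: binary-search the largest ICMP payload that gets a reply when
# fragmentation is prohibited. Assumes Linux iputils ping.
# PING is a hypothetical override hook (e.g. to substitute a stub).
PING=${PING:-ping}

probe_pmtu() {
    host=$1
    lo=${2:-500}    # payload size assumed to get through
    hi=${3:-1472}   # 1472 + 8 (ICMP) + 20 (IPv4) = 1500-byte packet

    # Bail out if even the small probe gets no reply.
    if ! $PING -c 1 -W 2 -M do -s "$lo" "$host" >/dev/null 2>&1; then
        echo "no reply even with a $lo byte payload" >&2
        return 1
    fi

    # Invariant: a payload of $lo bytes got a reply, anything above $hi did not.
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))
        if $PING -c 1 -W 2 -M do -s "$mid" "$host" >/dev/null 2>&1; then
            lo=$mid
        else
            hi=$(( mid - 1 ))
        fi
    done

    # Add the 28 bytes of ICMP and IPv4 headers to get the packet size.
    echo "largest payload: $lo bytes, path MTU about $(( lo + 28 )) bytes"
}
```

Running probe_pmtu my.server on the laptop and getting a value well below 1500 with no "message too long" error from the local machine would be consistent with a router silently dropping large packets.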

If this is indeed the problem, then lowering the MSS or the MTU can work around it. A first step could be to modify the routing table entry used at each end of the connection to include advmss 1220. Or, if you don't want to modify the default route, you could simply add a more specific route with the same gateway.
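
As a sketch of the more specific route, assuming Linux with iproute2: the helper below only prints the command so it can be checked before being run as root. The function name advmss_route_cmd is made up for this example, and 203.0.113.10 (server) and 192.0.2.1 (gateway) are placeholder addresses, not values from the question.

```shell
#!/bin/sh
# Sketch: build an iproute2 command that adds a host route to one server
# with a reduced advertised MSS. Nothing is changed on the system; the
# command is only printed.
advmss_route_cmd() {
    server=$1        # IP address of the remote end (placeholder in the example)
    gateway=$2       # same gateway your default route already uses
    mss=${3:-1220}   # 1220 + 20 (TCP) + 20 (IPv4) = 1260-byte packets
    echo "ip route add $server/32 via $gateway advmss $mss"
}

# Prints: ip route add 203.0.113.10/32 via 192.0.2.1 advmss 1220
advmss_route_cmd 203.0.113.10 192.0.2.1
```

The current gateway can be read from the output of ip route show, and the same kind of route would be needed on the server side pointing back at the laptop's public address, since each end advertises its own MSS.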

You mention that the problem disappeared by itself, which is not unusual for an MTU problem. It can disappear when BGP routes your packets along another path that doesn't cross the problematic router, or when the administrator responsible for that router notices and fixes the problem.