Ssh – Can’t complete SSH connection after successfully exchanging keys to Ubuntu from some networks

ssh

After flying from one country to another, I now can't ssh to several of my Digital Ocean Ubuntu servers. However, I can still log in via console and ssh from one box to another (they are all in the same physical data center).

When running ssh with -vvvv and running the time command with it, the last debug message is:

debug2: channel 0: open confirm rwindow 0 rmax 32768
Write failed: Broken pipe

It times out after 1 minute 37 seconds.

Here is the debug log from the point at which ssh key authentication succeeds:

debug1: Authentication succeeded (publickey).
Authenticated to 128.199.170.168 ([128.199.170.168]:22).
debug1: channel 0: new [client-session]
debug3: ssh_session2_open: channel_new: 0
debug2: channel 0: send open
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug2: callback start
debug2: fd 3 setting TCP_NODELAY
debug3: packet_set_tos: set IP_TOS 0x10
debug2: client_session2_setup: id 0
debug2: channel 0: request pty-req confirm 1
debug1: Sending environment.
debug3: Ignored env TERM_PROGRAM
debug3: Ignored env SHELL
debug3: Ignored env TERM
debug3: Ignored env TMPDIR
debug3: Ignored env Apple_PubSub_Socket_Render
debug3: Ignored env TERM_PROGRAM_VERSION
debug3: Ignored env TERM_SESSION_ID
debug3: Ignored env USER
debug3: Ignored env SSH_AUTH_SOCK
debug3: Ignored env __CF_USER_TEXT_ENCODING
debug3: Ignored env PATH
debug3: Ignored env MARKPATH
debug3: Ignored env PWD
debug1: Sending env LANG = en_US.UTF-8
debug2: channel 0: request env confirm 0
debug3: Ignored env XPC_FLAGS
debug3: Ignored env PS1
debug3: Ignored env XPC_SERVICE_NAME
debug3: Ignored env SHLVL
debug3: Ignored env HOME
debug3: Ignored env GREP_OPTIONS
debug3: Ignored env LOGNAME
debug3: Ignored env SCALA_HOME
debug3: Ignored env SECURITYSESSIONID
debug3: Ignored env _
debug2: channel 0: request shell confirm 1
debug2: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768
Write failed: Broken pipe

The connection isn't particularly slow, my shell is bash (and I can still login via console and other network ssh). Nothing seems to be blocking the ssh connection since I see public key auth taking place.

I don't know which pipe is being written to that is broken. FWIW I'm connecting from OSX but I had no problems until flying to the US.


Here is what auth.log shows when trying to login:

May 17 12:28:01 db1 CRON[24931]: pam_unix(cron:session): session opened for user root by (uid=0)
May 17 12:28:01 db1 CRON[24931]: pam_unix(cron:session): session closed for user root
May 17 12:28:02 db1 sshd[24955]: error: Could not load host key: /etc/ssh/ssh_host_ed25519_key
May 17 12:28:04 db1 sshd[24955]: Accepted publickey for tomo from 24.210.28.151 port 63202 ssh2: DSA 3a:[redacted]
May 17 12:28:04 db1 sshd[24955]: pam_unix(sshd:session): session opened for user tomo by (uid=0)

Tcpdump capture of port 22 traffic during a connection attempt:

    $ sudo tcpdump -i en0 port 22
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:00:40.917870 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [S], seq 3430788632, win 65535, options [mss 1460,nop,wscale 5,nop,nop,TS val 1286503697 ecr 0,sackOK,eol], length 0
19:00:41.211348 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [S.], seq 4135716624, ack 3430788633, win 28960, options [mss 1460,sackOK,TS val 898678531 ecr 1286503697,nop,wscale 8], length 0
19:00:41.211415 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 1, win 4117, options [nop,nop,TS val 1286503989 ecr 898678531], length 0
19:00:41.215051 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1:22, ack 1, win 4117, options [nop,nop,TS val 1286503992 ecr 898678531], length 21
19:00:41.484824 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [.], ack 22, win 114, options [nop,nop,TS val 898678606 ecr 1286503992], length 0
19:00:41.488532 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 1:42, ack 22, win 114, options [nop,nop,TS val 898678609 ecr 1286503992], length 41
19:00:41.488616 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 42, win 4116, options [nop,nop,TS val 1286504260 ecr 898678609], length 0
19:00:41.490182 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], seq 22:1470, ack 42, win 4116, options [nop,nop,TS val 1286504261 ecr 898678609], length 1448
19:00:41.490183 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1470:1614, ack 42, win 4116, options [nop,nop,TS val 1286504261 ecr 898678609], length 144
19:00:41.491254 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [.], seq 42:1490, ack 22, win 114, options [nop,nop,TS val 898678609 ecr 1286503992], length 1448
19:00:41.592287 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 1490, win 4096, options [nop,nop,TS val 1286504362 ecr 898678609], length 0
19:00:41.760341 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 1490:1674, ack 22, win 114, options [nop,nop,TS val 898678676 ecr 1286504260], length 184
19:00:41.760401 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 1674, win 4090, options [nop,nop,TS val 1286504527 ecr 898678676], length 0
19:00:41.762375 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [.], ack 1614, win 136, options [nop,nop,TS val 898678676 ecr 1286504261], length 0
19:00:41.762409 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1614:1638, ack 1674, win 4096, options [nop,nop,TS val 1286504529 ecr 898678676], length 24
19:00:42.027042 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 1674:1826, ack 1638, win 136, options [nop,nop,TS val 898678743 ecr 1286504529], length 152
19:00:42.027103 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 1826, win 4091, options [nop,nop,TS val 1286504789 ecr 898678743], length 0
19:00:42.028104 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1638:1782, ack 1826, win 4096, options [nop,nop,TS val 1286504790 ecr 898678743], length 144
19:00:42.300304 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 1826:2546, ack 1782, win 148, options [nop,nop,TS val 898678812 ecr 1286504790], length 720
19:00:42.300357 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 2546, win 4073, options [nop,nop,TS val 1286505053 ecr 898678812], length 0
19:00:42.302441 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1782:1798, ack 2546, win 4096, options [nop,nop,TS val 1286505055 ecr 898678812], length 16
19:00:42.600776 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [.], ack 1798, win 148, options [nop,nop,TS val 898678888 ecr 1286505055], length 0
19:00:42.600843 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1798:1850, ack 2546, win 4096, options [nop,nop,TS val 1286505349 ecr 898678888], length 52
19:00:42.857852 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [.], ack 1850, win 148, options [nop,nop,TS val 898678952 ecr 1286505349], length 0
19:00:42.858552 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 2546:2598, ack 1850, win 148, options [nop,nop,TS val 898678952 ecr 1286505349], length 52
19:00:42.858584 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 2598, win 4094, options [nop,nop,TS val 1286505604 ecr 898678952], length 0
19:00:42.859131 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1850:1918, ack 2598, win 4096, options [nop,nop,TS val 1286505605 ecr 898678952], length 68
19:00:43.124310 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 2598:2650, ack 1918, win 148, options [nop,nop,TS val 898679019 ecr 1286505605], length 52
19:00:43.124374 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 2650, win 4094, options [nop,nop,TS val 1286505867 ecr 898679019], length 0
19:00:43.124473 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 1918:2434, ack 2650, win 4096, options [nop,nop,TS val 1286505867 ecr 898679019], length 516
19:00:43.394690 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 2650:2702, ack 2434, win 159, options [nop,nop,TS val 898679086 ecr 1286505867], length 52
19:00:43.394774 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 2702, win 4094, options [nop,nop,TS val 1286506134 ecr 898679086], length 0
19:01:04.685580 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2434:2582, ack 2702, win 4096, options [nop,nop,TS val 1286527239 ecr 898679086], length 148
19:01:04.966270 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 2702:2738, ack 2582, win 170, options [nop,nop,TS val 898684479 ecr 1286527239], length 36
19:01:04.966378 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 2738, win 4094, options [nop,nop,TS val 1286527514 ecr 898684479], length 0
19:01:04.967018 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2582:2702, ack 2738, win 4096, options [nop,nop,TS val 1286527514 ecr 898684479], length 120
19:01:05.269214 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [.], ack 2702, win 170, options [nop,nop,TS val 898684555 ecr 1286527514], length 0
19:01:06.027067 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 2738:2790, ack 2702, win 170, options [nop,nop,TS val 898684744 ecr 1286527514], length 52
19:01:06.027144 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [.], ack 2790, win 4094, options [nop,nop,TS val 1286528563 ecr 898684744], length 0
19:01:06.027497 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286528563 ecr 898684744], length 460
19:01:06.603432 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286529135 ecr 898684744], length 460
19:01:07.552730 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286530077 ecr 898684744], length 460
19:01:09.250116 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286531762 ecr 898684744], length 460
19:01:12.442790 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286534930 ecr 898684744], length 460
19:01:18.634929 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286541067 ecr 898684744], length 460
19:01:24.068621 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286546451 ecr 898684744], length 460
19:01:34.714519 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286557019 ecr 898684744], length 460
19:01:45.384050 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286567587 ecr 898684744], length 460
19:01:56.051835 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286578155 ecr 898684744], length 460
19:02:06.715163 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286588723 ecr 898684744], length 460
19:02:17.355823 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286599291 ecr 898684744], length 460
19:02:28.042962 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096, options [nop,nop,TS val 1286609859 ecr 898684744], length 460
19:02:38.690971 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [R.], seq 3162, ack 2790, win 4096, length 0

Some other things I've tried:

  • reducing mtu on the server, could be failure in pmtu: sudo ip link set mtu 1280 dev eth0
  • reducing mtu to 1280 in OS X for my wifi interface
  • reducing ServerAliveInterval even lower, to 30, where the connection still times out but not with a broken pipe
  • running ssh with "cat" instead of "bash" or bash but with no profile/rc loaded
  • setting OS X wifi interface IP address manually instead of dhcp

Best Answer

In the packet trace we see that maximum sized packets are exchanged in both directions early on during the flow. That did not cause any problems, so there is nothing suggesting a MTU issue.

Later on during the connection we see that a packet from the client to the server with relative sequence numbers 2702:3162 never receive an ACK from the server.

My immediate thought is that this packet loss is caused by a faulty middlebox (i.e. NAT, firewall, or similar).

I have heard talk about NAT boxes which cannot handle change of TOS during a TCP connection. The problem in your case does happen after the client indicates that TOS has been changed. However since tcpdump doesn't show the TOS, I can't say for sure if that is the exact point where the problem happens.

For a test you can try using -o ProxyCommand='nc %h %p' such that the ssh client doesn't directly control the TCP connection. You can also try the IPQoS option. If the change of TOS is the issue, then specifying -o IPQoS=cs0 or -o IPQoS=0 should work but any other setting would fail. This is because ssh is using 0 as the QoS during authentication and then switch to the chosen QoS after authentication. By choosing QoS to be 0 there won't be any change of QoS value to confuse middleboxes.

Related Topic