Ubuntu – SSH through intermediate host fails only on theuser@themac but works elsewhere

bastionmac-osxsshUbuntu

I am not able to connect via ssh to one of my datacenter nodes using my user on my macbook. This is a recent problem, and it was perfectly funtional since ~ a couple of weeks ago.

Strangely, this only affects my user on my computer, but I am able to establish the connection from:

  • A different user in the same machine, using the same ssh keys and without any .ssh/config rule.
  • A different server, running either macos or ubuntu, with the same or different ssh keys.

Using my username in my computer, and same keys, I can:

  • Connect to the gateway host
  • Connect directly to the node using a VPN (unfortunately this is not a long-time solution)

I’m quite puzzled by this error. Can you help me locate the problem?

Looking at the logs, the connection with the gateway is established, but somehow fails at connecting to the node. On the client side:

⌘ ~ ❯ ssh -v -J gatekeeper@gateway ubuntu@node -i ~/.ssh/id_rsa 
OpenSSH_7.3p1, LibreSSL 2.4.1
[...]
debug1: Authentication succeeded (publickey).
Authenticated to gateway ([35.156.248.245]:22).
debug1: channel_connect_stdio_fwd node:22
debug1: channel 0: new [stdio-forward]
debug1: getpeername failed: Bad file descriptor
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: pledge: network
debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1
debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1
debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1
debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1
debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1
debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1
debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1
debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1
channel 0: open failed: connect failed: Connection timed out
stdio forwarding failed
ssh_exchange_identification: Connection closed by remote host

On the gateway side:

admin@gateway:~$ grep -e "\[7669\]" -e "\[7739\]" /var/log/auth.log
Mar 13 11:01:20 gateway sshd[7669]: Set /proc/self/oom_score_adj to 0
Mar 13 11:01:20 gateway sshd[7669]: rexec line 32: Deprecated option PermitBlacklistedKeys
Mar 13 11:01:20 gateway sshd[7669]: Connection from <laptop-out-ip> port 62113 on <gateway-ip> port 22
Mar 13 11:01:20 gateway sshd[7669]: Postponed publickey for gatekeeper from <laptop-out-ip> port 62113 ssh2 [preauth]
Mar 13 11:01:20 gateway sshd[7669]: Accepted publickey for gatekeeper from <laptop-out-ip> port 62113 ssh2: RSA 8d:7e:9c:53:11:c9:4d:b3:67:7b:ae:04:03:8f:e2:71
Mar 13 11:01:20 gateway sshd[7669]: pam_unix(sshd:session): session opened for user gatekeeper by (uid=0)
Mar 13 11:01:20 gateway sshd[7669]: User child is on pid 7739
Mar 13 11:03:27 gateway sshd[7739]: error: connect_to <node-ip> port 22: failed.
Mar 13 11:03:28 gateway sshd[7739]: Connection closed by <laptop-out-ip>
Mar 13 11:03:28 gateway sshd[7739]: Transferred: sent 2252, received 2864 bytes
Mar 13 11:03:28 gateway sshd[7739]: Closing connection to <laptop-out-ip> port 62113
Mar 13 11:03:28 gateway sshd[7669]: pam_unix(sshd:session): session closed for user gatekeeper

On the node side there is no entry in the logs.

The ssd_config at the gateway:

# ssh service configuration

AcceptEnv
AddressFamily inet
AllowAgentForwarding yes
AllowGroups
AllowTcpForwarding no
AllowUsers gatekeeper
AuthorizedKeysFile %h/.ssh/authorized_keys
ChallengeResponseAuthentication no
Ciphers aes128-ctr,aes192-ctr,aes256-ctr
ClientAliveCountMax 3
ClientAliveInterval 15
Compression delayed
DenyGroups
DenyUsers
GSSAPIAuthentication no
GatewayPorts no
HostKey /etc/ssh/ssh_host_dsa_key
HostKey /etc/ssh/ssh_host_rsa_key
HostKey /etc/ssh/ssh_host_ecdsa_key
HostbasedAuthentication no
KerberosAuthentication no
ListenAddress 0.0.0.0:22
LogLevel VERBOSE
LoginGraceTime 60
MaxAuthTries 6
MaxSessions 10
MaxStartups 30
PasswordAuthentication no
PermitBlacklistedKeys no
PermitRootLogin no
PermitTunnel no
PermitUserEnvironment no
PidFile /var/run/sshd.pid
PrintLastLog yes
PrintMotd no
Protocol 2
PubkeyAuthentication yes
RSAAuthentication no
RhostsRSAAuthentication no
StrictModes yes
SyslogFacility AUTH
TCPKeepAlive yes
UseDNS no
UseLogin no
UsePAM yes
UsePrivilegeSeparation yes
X11Forwarding no

Match User gatekeeper
AllowTcpForwarding yes
AllowAgentForwarding no
X11Forwarding no

Best Answer

Finally I have been able to corner and identify the source of the problem. I could make the problem disappear by not sourcing the iterm2 shell integration, or just updating it to the latest version. This may be somehow related to using the fish shell.

I did not dig further into the problem, if anyone's interested please let me know.