SSH connection lost after random time interval, both sides claim the other closed the connection

solarissshvirtual-machines

I'm connecting to a virtual solaris machine from the windows host system running virtual box. This works for a while but after some time the connection vanishes.

The stange thing is that sshd claims the connection is reset by peer, while the ssh session says the connection is closed by remote host.

I have managed to start the sshd manually (/usr/lib/ssh/sshd -d), so that I get the debug output shown below, but I'm completely at a loss how to proceed.

Things tried so far:

  • Check /var/log/authlog: it's empty
  • Check packages are up to date (pkgchk -n SUNWsshcu, pkgchk -n SUNWsshdr, pkgchk -n SUNWsshdu, pkgchk -n SUNWsshhr, pkgchk -n SUNWsshr, pkgchk -n SUNWsshu): all up to date
  • Allow password login in /etc/ssh/ssh_config PasswordAuthentication yes and use that: no change

Question: I'm stuck, how can I continue working on the problem?


More information:

Starting up ssh daemon:

bash-3.2# /usr/lib/ssh/sshd -d
debug1: sshd version Sun_SSH_1.1.5
debug1: read PEM private key done: type RSA
debug1: private host key: #0 type 1 RSA
debug1: read PEM private key done: type DSA
debug1: private host key: #1 type 2 DSA
debug1: Bind to port 22 on ::.
Server listening on :: port 22.

Connecting from remote:

debug1: Server will not fork when running in debugging mode.
Connection from 10.0.2.2 port 26688
debug1: Client protocol version 2.0; client software version OpenSSH_6.2
debug1: match: OpenSSH_6.2 pat OpenSSH*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-Sun_SSH_1.1.5
monitor debug1: list_hostkey_types: ssh-rsa,ssh-dss
debug1: use_engine is 'yes'
monitor debug1: reading the context from the child
debug1: pkcs11 engine initialized, now setting it as default for RSA, DSA, and symmetric ciphers
debug1: pkcs11 engine initialization complete
debug1: list_hostkey_types: ssh-rsa,ssh-dss
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: client->server aes128-ctr hmac-md5 zlib
debug1: kex: server->client aes128-ctr hmac-md5 zlib
debug1: Peer sent proposed langtags, ctos:
debug1: Peer sent proposed langtags, stoc:
debug1: We proposed langtags, ctos: i-default
debug1: We proposed langtags, stoc: i-default
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST received
debug1: SSH2_MSG_KEX_DH_GEX_GROUP sent
debug1: dh_gen_key: priv key bits set: 134/256
debug1: bits set: 526/1024
debug1: expecting SSH2_MSG_KEX_DH_GEX_INIT
debug1: bits set: 497/1024
debug1: SSH2_MSG_KEX_DH_GEX_REPLY sent
debug1: newkeys: mode 1
debug1: set_newkeys: setting new keys for 'out' mode
debug1: Enabling compression at level 6.
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: newkeys: mode 0
debug1: set_newkeys: setting new keys for 'in' mode
debug1: SSH2_MSG_NEWKEYS received
debug1: KEX done
debug1: userauth-request for user beginner service ssh-connection method none
debug1: attempt 0 initial attempt 0 failures 0 initial failures 0
Failed none for beginn from 10.0.2.2 port 26688 ssh2
debug1: userauth-request for user beginner service ssh-connection method passworddebug1: attempt 1 initial attempt 0 failures 1 initial failures 0
Accepted password for beginner from 10.0.2.2 port 26688 ssh2
debug1: permanently_set_uid: 54324/1
debug1: sending auth context to the monitor
debug1: will send 41 bytes of auth context to the monitor
monitor debug1: finished reading the context
monitor debug1: use_engine is 'yes'
monitor debug1: pkcs11 engine initialized, now setting it as default for RSA, DSA, and symmetric ciphers
monitor debug1: pkcs11 engine initialization complete
monitor debug1: Entering monitor loop.
monitor debug1: fd 9 setting O_NONBLOCK
monitor debug1: fd 10 setting O_NONBLOCK
debug1: Entering interactive session for SSH2.
debug1: fd 9 setting O_NONBLOCK
debug1: fd 10 setting O_NONBLOCK
debug1: server_init_dispatch_20
debug1: server_input_channel_open: ctype session rchan 0 win 1048576 max 16384
debug1: input_session_request
debug1: channel 0: new [server-session]
debug1: session_new: init
debug1: session_new: session 0
debug1: session_open: channel 0
debug1: session_open: session 0: link with channel 0
debug1: server_input_channel_open: confirm session
debug1: server_input_channel_req: channel 0 request x11-req reply 1
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req x11-req
debug1: bind port 6010: Address already in use; skipping this port
debug1: bind port 6011: Address already in use; skipping this port
debug1: bind port 6012: Address already in use; skipping this port
debug1: bind port 6013: Address already in use; skipping this port
debug1: fd 11 setting O_NONBLOCK
debug1: channel 1: new [X11 inet listener]
debug1: server_input_channel_req: channel 0 request pty-req reply 1
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req pty-req
debug1: Allocating pty.
debug1: session_pty_req: session 0 alloc /dev/pts/8
debug1: server_input_channel_req: channel 0 request shell reply 1
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req shell
debug1: Setting controlling tty using TIOCSCTTY.
debug1: fd 4 setting TCP_NODELAY
debug1: SSH receive window size: 198560 B
debug1: fd 13 setting O_NONBLOCK

Starting emacs and working for some time:

debug1: server_input_global_request: rtype keepalive@openssh.com want_reply 1
debug1: server_input_global_request: rtype keepalive@openssh.com want_reply 1
debug1: X11 connection requested.
debug1: fd 16 setting TCP_NODELAY
debug1: channel 2: new [X11 connection from 127.0.0.1 port 33079]
debug1: channel 2: open confirm rwindow 2097152 rmax 16384
debug1: channel 2: read<=0 rfd 16 len 0
debug1: channel 2: read failed
debug1: channel 2: close_read
debug1: channel 2: input open -> drain
debug1: channel 2: ibuf empty
debug1: channel 2: send eof
debug1: channel 2: input drain -> closed
debug1: channel 2: rcvd eof
debug1: channel 2: output open -> drain
debug1: channel 2: obuf empty
debug1: channel 2: close_write
debug1: channel 2: output drain -> closed
debug1: channel 2: rcvd close
debug1: channel 2: send close
debug1: channel 2: is dead
debug1: channel 2: garbage collecting
debug1: channel_free: channel 2: X11 connection from 127.0.0.1 port 33079, nchan nels 3
debug1: X11 connection requested.
debug1: fd 16 setting TCP_NODELAY
debug1: channel 2: new [X11 connection from 127.0.0.1 port 33080]
debug1: channel 2: open confirm rwindow 2097152 rmax 16384
debug1: channel 2: read<=0 rfd 16 len 0
debug1: channel 2: read failed
debug1: channel 2: close_read
debug1: channel 2: input open -> drain
debug1: channel 2: ibuf empty
debug1: channel 2: send eof
debug1: channel 2: input drain -> closed
debug1: X11 connection requested.
debug1: fd 17 setting TCP_NODELAY
debug1: channel 3: new [X11 connection from 127.0.0.1 port 33081]
debug1: channel 2: rcvd eof
debug1: channel 2: output open -> drain
debug1: channel 2: obuf empty
debug1: channel 2: close_write
debug1: channel 2: output drain -> closed
debug1: channel 2: rcvd close
debug1: channel 2: send close
debug1: channel 2: is dead
debug1: channel 2: garbage collecting
debug1: channel_free: channel 2: X11 connection from 127.0.0.1 port 33080, nchan nels 4
debug1: channel 3: open confirm rwindow 2097152 rmax 16384
debug1: channel 3: read<=0 rfd 17 len 0
debug1: channel 3: read failed
debug1: channel 3: close_read
debug1: channel 3: input open -> drain
debug1: channel 3: ibuf empty
debug1: channel 3: send eof
debug1: channel 3: input drain -> closed
debug1: channel 3: rcvd eof
debug1: channel 3: output open -> drain
debug1: channel 3: obuf empty
debug1: channel 3: close_write
debug1: channel 3: output drain -> closed
debug1: channel 3: send close
debug1: channel 3: rcvd close
debug1: channel 3: is dead
debug1: channel 3: garbage collecting
debug1: channel_free: channel 3: X11 connection from 127.0.0.1 port 33081, nchan nels 3
debug1: X11 connection requested.
debug1: fd 16 setting TCP_NODELAY
debug1: channel 2: new [X11 connection from 127.0.0.1 port 33084]
debug1: channel 2: open confirm rwindow 2097152 rmax 16384
debug1: X11 connection requested.
debug1: fd 17 setting TCP_NODELAY
debug1: channel 3: new [X11 connection from 127.0.0.1 port 33085]
debug1: channel 2: read<=0 rfd 16 len 0
debug1: channel 2: read failed
debug1: channel 2: close_read
debug1: channel 2: input open -> drain
debug1: channel 2: ibuf empty
debug1: channel 2: send eof
debug1: channel 2: input drain -> closed
debug1: channel 3: open confirm rwindow 2097152 rmax 16384
debug1: channel 2: rcvd eof
debug1: channel 2: output open -> drain
debug1: channel 2: obuf empty
debug1: channel 2: close_write
debug1: channel 2: output drain -> closed
debug1: channel 2: rcvd close
debug1: channel 2: send close
debug1: channel 2: is dead
debug1: channel 2: garbage collecting
debug1: channel_free: channel 2: X11 connection from 127.0.0.1 port 33084, nchan nels 4
debug1: X11 connection requested.
debug1: fd 16 setting TCP_NODELAY
debug1: channel 2: new [X11 connection from 127.0.0.1 port 33086]
debug1: channel 3: read<=0 rfd 17 len 0
debug1: channel 3: read failed
debug1: channel 3: close_read
debug1: channel 3: input open -> drain
debug1: channel 3: ibuf empty
debug1: channel 3: send eof
debug1: channel 3: input drain -> closed
debug1: channel 2: open confirm rwindow 2097152 rmax 16384
debug1: channel 3: rcvd eof
debug1: channel 3: output open -> drain
debug1: channel 3: obuf empty
debug1: channel 3: close_write
debug1: channel 3: output drain -> closed
debug1: channel 3: rcvd close
debug1: channel 3: send close
debug1: channel 3: is dead
debug1: channel 3: garbage collecting
debug1: channel_free: channel 3: X11 connection from 127.0.0.1 port 33085, nchan nels 4

After some random time interval: the connection is lost:

Read error from remote host 10.0.2.2: Connection reset by peer
debug1: Calling cleanup 0x806d882(0x80afd90)
debug1: session_pty_cleanup: session 0 release /dev/pts/8
debug1: Calling cleanup 0x80729a7(0x0)
debug1: channel_free: channel 0: server-session, nchannels 3
debug1: channel_free: channel 1: X11 inet listener, nchannels 2
debug1: channel_free: channel 2: X11 connection from 127.0.0.1 port 33086, nchannels 1
debug1: Calling cleanup 0x8064fe7(0x80c1318)
debug1: Calling cleanup 0x807e79a(0x0)
debug1: compress outgoing: raw data 36410262, compressed 3980612, factor 0.11
debug1: compress incoming: raw data 18374832, compressed 674656, factor 0.04
monitor debug1: Monitor received SIGCHLD.

Output of incoming ssh:

~> ssh beginner@127.0.0.1 -p 2222
Connection to 127.0.0.1 closed by remote host.
Connection to 127.0.0.1 closed

Best Answer

Is there a NAT router between the two machines? It may be closing the connection do to inactivity and timeouts?

The SSH client can turn on SSH-level KeepAlive to try to avoid this scenario.

For openssh client, we include the following in the client-side config file (either /etc/ssh/ssh_config or ~/.ssh/config):

KeepAlive yes