SSH public key login fails without pattern

ecryptfspublic-keysshubuntu-14.04

(previously posted at stackoverflow by error)

I'm running a bunch of servers with Ubuntu 14.04.1 (sun,hyperion,…) all of which use public keys (OpenSSH_6.6.1, OpenSSL 1.0.1f 6 Jan 2014 on all machines) for rsync without problems. Almost all…

One connection fails without any changes in the configs or keys. Then I'll try to re-add the keys, check for ECDSA, reboot/restart ssh and it works again. Or it doesn't. In this case I just wait a random amount of time (1h up to 3 months) an do the same. This time it fixes the problem – for a while.

The relevant parts of a ssh -vvv diff:

Successful connection

debug1: Host 'hyperion.internal' is known and matches the ECDSA host key.
debug1: Found key in /home/bar/.ssh/known_hosts:20
debug1: ssh_ecdsa_verify: signature correct
debug2: kex_derive_keys
debug2: set_newkeys: mode 1
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug2: set_newkeys: mode 0
debug1: SSH2_MSG_NEWKEYS received
debug1: Roaming not allowed by server
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug2: key: /home/bar/.ssh/id_rsa (0x7f..),
debug2: key: /home/bar/.ssh/id_dsa ((nil)),
debug2: key: /home/bar/.ssh/id_ecdsa ((nil)),
debug2: key: /home/bar/.ssh/id_ed25519 ((nil)),
debug1: Authentications that can continue: publickey,password
debug3: start over, passed a different list publickey,password
debug3: preferred gssapi-keyex,gssapi-with-mic,publickey,keyboard-interactive,password
debug3: authmethod_lookup publickey
debug3: remaining preferred: keyboard-interactive,password
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Offering RSA public key: /home/bar/.ssh/id_rsa
debug3: send_pubkey_test
debug2: we sent a publickey packet, wait for reply
debug1: Server accepts key: pkalg ssh-rsa blen 279
debug2: input_userauth_pk_ok: fp 95:...
debug3: sign_and_send_pubkey: RSA 95:...
debug1: key_parse_private2: missing begin marker
debug1: read PEM private key done: type RSA
debug1: Authentication succeeded (publickey).
Authenticated to hyperion.internal ([172.16.0.10]:22).

Failed connection

debug1: Host 'hyperion.internal' is known and matches the ECDSA host key.
debug1: Found key in /home/bar/.ssh/known_hosts:20
debug1: ssh_ecdsa_verify: signature correct
debug2: kex_derive_keys
debug2: set_newkeys: mode 1
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug2: set_newkeys: mode 0
debug1: SSH2_MSG_NEWKEYS received
debug1: Roaming not allowed by server
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug2: key: /home/bar/.ssh/id_rsa (0x7f..),
debug2: key: /home/bar/.ssh/id_dsa ((nil)),
debug2: key: /home/bar/.ssh/id_ecdsa ((nil)),
debug2: key: /home/bar/.ssh/id_ed25519 ((nil)),
debug1: Authentications that can continue: publickey,password
debug3: start over, passed a different list publickey,password
debug3: preferred gssapi-keyex,gssapi-with-mic,publickey,keyboard-interactive,password
debug3: authmethod_lookup publickey
debug3: remaining preferred: keyboard-interactive,password
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Offering RSA public key: /home/bar/.ssh/id_rsa
debug3: send_pubkey_test
debug2: we sent a publickey packet, wait for reply
debug1: Authentications that can continue: publickey,password
debug1: Trying private key: /home/bar/.ssh/id_dsa
debug3: no such identity: /home/bar/.ssh/id_dsa: No such file or directory
debug1: Trying private key: /home/bar/.ssh/id_ecdsa
debug3: no such identity: /home/bar/.ssh/id_ecdsa: No such file or directory
debug1: Trying private key: /home/bar/.ssh/id_ed25519
debug3: no such identity: /home/bar/.ssh/id_ed25519: No such file or directory
debug2: we did not send a packet, disable method
debug3: authmethod_lookup password
debug3: remaining preferred: ,password
debug3: authmethod_is_enabled password
debug1: Next authentication method: password

Things that I've checked several times:

  • permissions for .ssh/ and id_rsa on all machines
  • that I'm using the right keys
  • that ssh-copy-id -i /home/bar/.ssh/id_rsa europa@hyperion.internal copies the right keys to the right authorized_hosts file

What didn't really help but added to the vodoo/heisenbug effect:

  • rebooting the machines
  • restarting the ssh service
  • fiddling with global ssh options

I've pasted the full logs with some redacted info at pastebin: wall of log

Best Answer

The problem has been resolved, it wasn't ssh-related at all:

hyperion.internal had an encrypted home, so the key-lookup failed when it was not mounted to /home/europe.

In hindsight quite obvious, but it accounts for the heisenbug effect of not failing when observing the logs on the machine (while being logged in, of course...)

Hope this helps at least some else.