SSH Hangs. error: openpty: No such file or directory error: session_pty_req: session 0 alloc failed

ssh ubuntu-14.04

One of our Ubuntu 14.04 production servers stopped accepting SSH connections. When we try to log in, we get the SSH banner text, but then the session just hangs. If we log in through the management console, we can see the following error messages in /var/log/auth.log:

Oct  4 17:37:20 servername sshd[10975]: error: Could not load host key: /etc/ssh/ssh_host_ed25519_key
Oct  4 17:37:21 servername sshd[10975]: Accepted publickey for username from 10.0.0.1 port 57230 ssh2: RSA xx:xx:xx:xx
Oct  4 17:37:21 servername sshd[10975]: pam_unix(sshd:session): session opened for user username by (uid=0)
Oct  4 17:37:25 servername sshd[10975]: error: openpty: No such file or directory
Oct  4 17:37:25 servername sshd[6869]: error: session_pty_req: session 0 alloc failed

Using cat /proc/mounts | grep devpts; ls -hal /dev/{pts,ptmx} I can verify that devpts is mounted and has the correct permissions, and df shows there aren't any disk or inode issues:

devpts /dev/pts devpts rw,nosuid,noexec,relatime,mode=600,ptmxmode=000 0 0

crw-rw-rw- 1 root tty  5, 2 Oct  4 17:01 /dev/ptmx

/dev/pts:
total 0
drwxr-xr-x  2 root root       0 Aug 14 00:52 .
drwxr-xr-x 17 root root    4.3K Oct  4 17:01 ..
crw--w----  1 root tty  136, 18 Oct  4 17:41 18
crw--w----  1 root tty  136, 24 Oct  1 13:57 24
crw--w----  1 root tty  136,  3 Oct  4 17:39 3
crw--w----  1 root tty  136, 30 Oct  4 11:29 30
c---------  1 root root   5,  2 Aug 14 00:52 ptmx

df -h
    Filesystem      Size  Used Avail Use% Mounted on
    udev            252G  4.0K  252G   1% /dev
    tmpfs            51G   53M   51G   1% /run
    /dev/sdi2       220G   13G  197G   6% /
    none            4.0K     0  4.0K   0% /sys/fs/cgroup
    none            5.0M     0  5.0M   0% /run/lock
    none            252G   12K  252G   1% /run/shm
    none            100M     0  100M   0% /run/user
    /dev/sdi1        75M   512   75M   1% /boot/efi
    /dev/md1        3.5T  282G  3.0T   9% /ssd

df -hi
    Filesystem     Inodes IUsed IFree IUse% Mounted on
    udev              63M   526   63M    1% /dev
    tmpfs             63M   725   63M    1% /run
    /dev/sdi2         14M  171K   14M    2% /
    none              63M     2   63M    1% /sys/fs/cgroup
    none              63M     1   63M    1% /run/lock
    none              63M     4   63M    1% /run/shm
    none              63M     4   63M    1% /run/user
    /dev/sdi1           0     0     0     - /boot/efi
    /dev/md1         224M    46  224M    1% /ssd

I also verified that sshd_config matches another server, and I have restarted the ssh service. I believe the devpts filesystem is mounted on startup, but is there any way to resolve the issue without restarting the server?

I see https://access.redhat.com/solutions/67972 has an unverified solution for this issue on Red Hat, but I don't have access to a Red Hat subscription.

Best Answer

I found I could get a non-tty based ssh session to work using:

$ ssh username@servername /bin/bash -i

bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
username@servername:~$ 

I think in this case the ioctl error is expected, because I am starting an interactive session on something that doesn't have a tty. Lots of things have issues in this session (TERM env var isn't even set), but I was able to do some basic troubleshooting and found this:

# Count the entries in the process tree listing that mention badservice
ps -axfo pid,uname,cmd | grep badservice | wc -l
27917

Basically, when we compared this with the good server, we found that one of our services had over 27,900 processes running under its username:

$ salt 'server*' cmd.run 'ps -aux | grep badservice | wc -l'
server.good:
    3
server.bad:
    27918
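If you don't already know which service to grep for, a per-user process count can point at the culprit. This is only a sketch: count_by_user is a helper name made up for this example, and ps may abbreviate long usernames.

```shell
# Hypothetical helper: reads one username per line on stdin and
# prints "count username" pairs, highest count first.
count_by_user() {
    sort | uniq -c | sort -rn
}

# "user=" prints the owning user of every process with no header line.
ps -eo user= | count_by_user | head -n 5
```

A user with tens of thousands of processes at the top of that list would stand out the same way badservice did here.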

This was likely causing some sort of resource exhaustion related to ptys. We stopped the bad service, and I killed any remaining processes for that user with sudo pkill -u badservice (plain kill has no -u option). After that, SSH started working as expected again!
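I never confirmed that ptys specifically were what ran out, but on Linux one way to test that guess is to compare the kernel's pty counters. A rough sketch, where pty_usage is a name invented for this example and the optional directory argument exists only so it can be pointed somewhere else:

```shell
# Hypothetical helper: print allocated vs. maximum pseudo-terminals.
# Reads the Linux sysctl files kernel.pty.nr and kernel.pty.max.
pty_usage() {
    dir=${1:-/proc/sys/kernel/pty}
    nr=$(cat "$dir/nr" 2>/dev/null) || return 1
    max=$(cat "$dir/max" 2>/dev/null) || return 1
    echo "ptys in use: $nr of $max"
}

pty_usage || echo "pty counters not available on this system"
```

If the first number is at or near the second, new sessions can't get a pty. If it isn't, some other per-user or system-wide limit is the more likely suspect.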