SSH hangs when executing command remotely

  • Client : OpenSSH_5.1p1 Debian-5ubuntu1 (Ubuntu 9.04)
  • Server : OpenSSH_5.1p1 Debian-5 (Proxmox 2.6.24-7-pve)

I use SSH to execute commands remotely on the server (via the Nagios check_by_ssh module). From time to time, though, SSH hangs when trying to execute a command. I can still log in to the server over SSH, but I cannot execute a simple 'ls'. The hang seems to affect all clients coming from the same IP address.
Authentication is not the problem, whether it is done with SSH keys or a password.

ssh -l root -p 2222 server.domain.tld 'ls'

Here is the client debug info:

debug1: Entering interactive session.
debug2: callback start
debug2: client_session2_setup: id 0
debug1: Sending environment.
debug3: Ignored env ORBIT_SOCKETDIR
*** skipping approx 40 env var ignored
debug1: Sending command: ls
debug2: channel 0: request exec confirm 1

It hangs there. Then, after a random amount of time, it works again (without my doing anything). Killing all sshd processes on the server seems to work too. It works from PuTTY. I saw that some people had trouble like this because of an ISP reverse DNS problem, but that does not seem to be the case here.

It can work for hours and then not work for half an hour or so.

What could explain this behaviour?

EDIT:
It seems that with the -t or -T option, ssh does not hang, but I can't pass either of these options to the Nagios check_by_ssh module.

Best Answer

I had the same problem, and today finally discovered what was causing the issue (for me at least). This might help you too.

When ssh is setting up a session, the DSCP field in the IP header is set to 0x0. Once the session is established, it is set to 0x10 (16) for an interactive session and to 0x8 (8) for a non-interactive session. The ssh client sets the field with the setsockopt() system call (which I verified in the source).
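To see which value your own packets carry, you can capture them on the client and on the server and compare. The following is only a rough sketch: tos_filter is a hypothetical helper, port 2222 is taken from the question, and I am assuming the values quoted above are the raw second byte of the IP header (masking off the two low ECN bits). The tcpdump command is printed rather than run, since capturing needs root, and "-i any" assumes Linux.

```shell
#!/bin/sh
# tos_filter PORT TOS
# Hypothetical helper: prints a pcap filter expression that matches TCP
# traffic on PORT whose second IP header byte (ip[1]), with the two ECN
# bits masked off, equals TOS.
tos_filter() {
    port=$1   # sshd port (2222 in the question)
    tos=$2    # value to look for, e.g. 0x08 for non-interactive sessions
    echo "tcp port $port and (ip[1] & 0xfc) = $tos"
}

# Print the capture command instead of executing it (it requires root).
echo "tcpdump -n -v -i any '$(tos_filter 2222 0x08)'"
```

If the packets show up on the client but never on the server (or the other way round), something in between is probably dropping them.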

A faulty configuration on a VPN at my employer was dropping packets with a DSCP of 0x8, so all non-interactive ssh traffic was dropped as well. To verify that the DSCP field was causing the drop, I used iptables on the ssh server to force the DSCP field to 0x10 and tested my non-interactive traffic (ssh ls, the same thing you were trying). It worked after that. You might try the same thing and see if that's why your session is hanging.

To set DSCP to 0x10 on all outgoing ssh traffic from your ssh server, run:

$ sudo iptables -t mangle -A OUTPUT -p tcp --sport 22 -j DSCP --set-dscp 0x10

This was on a RHEL 6.5 box.
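Since the sshd in the question listens on port 2222 rather than 22, the rule would need that port in --sport. Here is a minimal sketch of the add / inspect / remove cycle for such a test. dscp_rule is a hypothetical helper of mine, not part of iptables, and it only prints the commands so you can review them before running them as root.

```shell
#!/bin/sh
# dscp_rule ACTION PORT DSCP
# Hypothetical helper: prints (does not run) the iptables command that
# appends or deletes the DSCP rewrite rule for traffic leaving sshd.
dscp_rule() {
    action=$1   # -A to append the rule, -D to delete it again
    port=$2     # source port sshd listens on (2222 in the question)
    dscp=$3     # value handed to --set-dscp (0x10 in the answer)
    echo "iptables -t mangle $action OUTPUT -p tcp --sport $port -j DSCP --set-dscp $dscp"
}

dscp_rule -A 2222 0x10                      # add the rule
echo "iptables -t mangle -L OUTPUT -n -v"   # list rules with packet counters
dscp_rule -D 2222 0x10                      # remove it once the test is done
```

If the packet counter on the rule goes up and the hang disappears, that points at something on the path treating the original marking differently.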