I would suggest the following in addition to the answers already present. Ensure you have some way to restore your firewall after carefully checking its ruleset.
Disclaimer: if this device is an internet facing machine, this will drop all firewall protection from all interfaces, and could lead to your box getting owned.
# iptables --flush
# iptables -P INPUT ACCEPT
# iptables -P FORWARD ACCEPT
# iptables -P OUTPUT ACCEPT
# /etc/init.d/ssh restart
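In line with the warning above, snapshot the ruleset before flushing so it can be put back afterwards; a sketch using iptables-save/iptables-restore (the backup path /root/iptables.backup is arbitrary, and these commands need root):

```shell
# Snapshot the current ruleset before flushing it
iptables-save > /root/iptables.backup

# ...test the ssh connection with the firewall open...

# Put the original ruleset back afterwards
iptables-restore < /root/iptables.backup
```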
Then retry the connection via ssh; if that fails, check /var/log/auth.log.
You can also use
# lsof -i TCP:22
to see if the ssh port is open and which IP address it's listening on.
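If lsof isn't installed, ss from iproute2 (or netstat on older systems) shows the same information; a quick sketch:

```shell
# Show listening TCP sockets: -t TCP, -l listening, -n numeric, -p owning process
ss -tlnp

# Filter for the ssh port (may print nothing if sshd isn't listening)
ss -tln | grep ':22 ' || true
```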
edit:
Re: the update: that doesn't appear to be ssh related (it seems to be in relation to sudo privilege elevation).
Try tail -f /var/log/auth.log while attempting to connect via ssh.
Connection refused means that the connection was explicitly rejected by either a firewall or the daemon itself.
A normal connection would look something like this:
Mar 23 13:32:32 <hostname> sshd[20100]: Accepted password for <user> from xxx.xxx.xxx.xxx port xxxxx ssh2
Mar 23 13:32:32 <hostname> sshd[20102]: (pam_unix) session opened for user <user> by (uid=0)
While an authentication failure will look like this:
Mar 23 13:35:54 <hostname> sshd[20177]: Failed password for <user> from xxx.xxx.xxx.xxx port xxxxx ssh2
If it were blocked by sshd for some reason, that will be alluded to in the auth log; if it were blocked by a firewall (note the firewall may be on the host, the client, or somewhere in between), you'll see nothing.
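To scan the auth log for these patterns after the fact, something like this works (the sample lines below mirror the ones above; in practice feed it from /var/log/auth.log):

```shell
# Sample auth.log content, stand-in for: cat /var/log/auth.log
log='Mar 23 13:32:32 host sshd[20100]: Accepted password for user from 203.0.113.5 port 50000 ssh2
Mar 23 13:35:54 host sshd[20177]: Failed password for user from 203.0.113.5 port 50001 ssh2'

# Count failed password attempts
failed=$(printf '%s\n' "$log" | grep -c 'Failed password')
echo "failed attempts: $failed"
```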
Get back to us if that's the case; from there it'll be tcpdump on the client, the server, and any intermediary routers.
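A sketch of the kind of capture meant here (the interface name eth0 is an assumption, and tcpdump needs root); run it on each hop to see where the packets die:

```shell
# Watch for inbound SSH connection attempts (SYN packets) on port 22
tcpdump -ni eth0 'tcp port 22 and tcp[tcpflags] & tcp-syn != 0'
```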
The error seems to be from an 'invalid opcode' of 0000, which means the kernel tried to execute memory that had been zeroed. Your kernel is tainted with 'D', meaning it has oopsed recently; you should look for those oopses in /var/log/messages and elsewhere, as they may point at the actual root cause.
I would say you are likely right about it happening under high IO, as it happened while the rsync process was on the CPU. That doesn't make it certain to be IO related, but it does raise eyebrows.
Given the age of the machine you could be facing a hardware problem. Have you tried running memtest on the machine? Or running an IO benchmark (like bonnie++) to see if that causes the crashes to happen?
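A sketch of the checks meant here (memtester is a userspace alternative to memtest86+, which normally runs from the boot menu; the size, pass count, scratch directory, and user are assumptions):

```shell
# Lock and test 1024 MB of RAM for 5 passes from userspace
memtester 1024 5

# Hammer the disk to see if IO load reproduces the crash;
# -d is a scratch directory, -u the unprivileged user to run as
bonnie++ -d /tmp -u nobody
```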
Whether you fix kernel logging past boot time or not will not help with kernel panic messages. When your kernel panics, it stops scheduling, so your logging daemon never gets a chance to write the kernel messages down. If you want to grab those, look into kdump to get complete kernel core dumps, and/or the netconsole kernel module to send kernel messages over UDP to a remote syslog server.

As for getting kernel messages into /var/log/dmesg past boot time (outside of serious kernel crashes), add a rule for the kern.* facility to /etc/syslog.conf (or /etc/rsyslog.conf if using rsyslog). For rsyslog, the file must also load the kernel log input module, imklog.

Let me know if you're using syslog-ng; it'd be a bit trickier to cover.
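A minimal sketch of what those config lines look like (assuming stock sysklogd or rsyslog; the leading "-" and the exact path are conventional rather than required):

```
# rsyslog only: load the kernel log input module, near the top of /etc/rsyslog.conf
$ModLoad imklog

# syslogd and rsyslog alike: route kernel messages to /var/log/dmesg;
# the leading "-" makes the writes asynchronous
kern.*    -/var/log/dmesg
```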