Intermittent connection issues inside LXC

bridge, lxc, networking

I am experiencing connection issues inside an LXC container that are driving me mad. They are intermittent: they appear for a while and then suddenly disappear.

Scenario

An LXC container inside a host. Both are running Debian GNU/Linux 8.3.
The container runs Piwik (open-source PHP analytics software, with Apache and MySQL) and an SSH server. The container's Apache is reachable through an nginx proxy on the host.

The lxc config:

lxc.tty = 6
lxc.pts = 1024
lxc.rootfs = /var/lib/lxc/hammond/rootfs
lxc.cgroup.devices.deny = a
# /dev/null and zero
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
# consoles
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 4:0 rwm
lxc.cgroup.devices.allow = c 4:1 rwm
# /dev/{,u}random
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 5:2 rwm
# rtc
lxc.cgroup.devices.allow = c 254:0 rwm

# mounts point
lxc.mount.entry=proc /var/lib/lxc/hammond/rootfs/proc proc nodev,noexec,nosuid 0 0
lxc.mount.entry=devpts /var/lib/lxc/hammond/rootfs/dev/pts devpts defaults 0 0
lxc.mount.entry=sysfs /var/lib/lxc/hammond/rootfs/sys sysfs defaults  0 0

# networking
lxc.utsname = hammond
lxc.network.type = veth
#lxc.network.macvlan.mode = private
lxc.network.flags = up
lxc.network.link = br-hammond
lxc.network.ipv4 = 192.168.100.2/24
lxc.network.ipv4.gateway = 192.168.100.1
lxc.network.hwaddr = 00:1E:10:C1:6B:C9

lxc.start.auto = 1

# http://serverfault.com/questions/658052/systemd-journal-in-debian-jessie-lxc-container-eats-100-cpu
lxc.autodev = 1
lxc.kmsg = 0

Issues:

1. Cannot connect to local database

Suddenly, Piwik reports:

SQLSTATE[HY000] [2003] Can't connect to MySQL server on '127.0.0.1' (111)

The database is running, of course.

  • If I telnet to the database from inside the container (127.0.0.1:3306), I can connect.
  • If I telnet to Apache from inside the container (127.0.0.1:80), Piwik works fine. It connects to the database, renders the page as usual and doesn't report any error.
  • If I telnet to Apache from the host (192.168.100.2:80), Piwik reports the database error.

2. SSH freezes

I am tunneling the SSH connection to the container using ProxyCommand:

ProxyCommand ssh -q host nc -q0 192.168.100.2 22

After the SSH negotiation phase, the connection freezes. If I type, the keys don't show up in the console. Eventually, the connection times out with:

packet_write_wait: Connection to UNKNOWN: Broken pipe

I have sniffed the packets with tcpdump, and the SSH key exchange goes fine. Then the traffic stops after 0.5 seconds.

I think this is a bug in the latest Debian kernel updates. It used to work fine, but I have been experiencing these problems for a few weeks. As I mentioned, they are intermittent: suddenly, everything works again.

Suggestions on how to investigate further are welcome.

Best Answer

I've had a problem with the same symptoms. In my case, another machine on the VLAN I used in the bridge had the same IP address as my LXC host. Sometimes that machine would answer the ARP request faster (despite being a separate physical machine), at which point the LXC guest would save the wrong MAC address in its ARP table and keep sending Ethernet frames to the wrong address until another ARP request "resolved" the problem.
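A conflict like this can sometimes be spotted directly in a capture of ARP traffic. The following is only a minimal sketch, not the procedure I used: it assumes `tcpdump -n` text output without `-e`, in the same format as the captures shown further down, and the function name `find_arp_conflicts` is made up for illustration. It prints every IP address that was claimed by more than one MAC address in ARP replies.

```shell
# find_arp_conflicts (hypothetical helper): read `tcpdump -n` text on stdin
# and print each IP that appears in ARP replies with more than one MAC.
find_arp_conflicts() {
    awk '
        # Matches lines such as:
        #   09:18:56.336147 ARP, Reply 10.70.70.10 is-at 00:16:3e:46:46:0a, length 28
        $2 == "ARP," && $3 == "Reply" && $5 == "is-at" {
            ip = $4
            mac = tolower($6)
            sub(/,$/, "", mac)               # drop the trailing comma
            key = ip SUBSEP mac
            if (!(key in seen)) {            # count each (IP, MAC) pair once
                seen[key] = 1
                count[ip]++
                if (ip in macs)
                    macs[ip] = macs[ip] " " mac
                else
                    macs[ip] = mac
            }
        }
        END {
            for (ip in count)
                if (count[ip] > 1)
                    print ip ": " macs[ip]
        }
    '
}

# Example use (interface name taken from the captures in this answer):
#   tcpdump -n -c 200 -i brv3001 arp | find_arp_conflicts
```

An empty result does not rule out a conflict. In my case the rogue reply never appeared in the tcpdump output at all, which is why I also logged the ARP tables.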

I diagnosed this with a timestamped ping from host to guest:

# ping -n 10.70.70.10 | perl -nle 'BEGIN {$|++} print scalar(localtime), " ", $_' |tee -a ping10707010.log
[...]
Sun Jul 31 09:18:53 2016 64 bytes from 10.70.70.10: icmp_seq=3389 ttl=64 time=0.035 ms
Sun Jul 31 09:18:54 2016 64 bytes from 10.70.70.10: icmp_seq=3390 ttl=64 time=0.035 ms
Sun Jul 31 09:18:55 2016 64 bytes from 10.70.70.10: icmp_seq=3391 ttl=64 time=0.027 ms
Sun Jul 31 09:19:45 2016 64 bytes from 10.70.70.10: icmp_seq=3441 ttl=64 time=0.064 ms
Sun Jul 31 09:19:46 2016 64 bytes from 10.70.70.10: icmp_seq=3442 ttl=64 time=0.038 ms
Sun Jul 31 09:19:47 2016 64 bytes from 10.70.70.10: icmp_seq=3443 ttl=64 time=0.036 ms

as well as a tcpdump on both host and guest:

# tcpdump -n -i brv3001 # on the host
[...]
09:18:55.724751 IP 10.70.0.1 > 10.70.70.10: ICMP echo request, id 26519, seq 3391, length 64
09:18:55.724768 IP 10.70.70.10 > 10.70.0.1: ICMP echo reply, id 26519, seq 3391, length 64
09:18:56.336109 ARP, Request who-has 10.70.70.10 tell 10.70.0.1, length 42
09:18:56.336147 ARP, Reply 10.70.70.10 is-at 00:16:3e:46:46:0a, length 28
[...]
09:19:44.728738 ARP, Request who-has 10.70.70.10 tell 10.70.0.1, length 28
09:19:44.728769 ARP, Reply 10.70.70.10 is-at 00:16:3e:46:46:0a, length 28
# tcpdump -n -i infra0 # on the guest
[...]
09:18:55.724757 IP 10.70.0.1 > 10.70.70.10: ICMP echo request, id 26519, seq 3391, length 64
09:18:55.724767 IP 10.70.70.10 > 10.70.0.1: ICMP echo reply, id 26519, seq 3391, length 64
09:18:56.336123 ARP, Request who-has 10.70.70.10 tell 10.70.0.1, length 42
09:18:56.336144 ARP, Reply 10.70.70.10 is-at 00:16:3e:46:46:0a, length 28
[...]
09:19:44.728745 ARP, Request who-has 10.70.70.10 tell 10.70.0.1, length 28
09:19:44.728766 ARP, Reply 10.70.70.10 is-at 00:16:3e:46:46:0a, length 28

which allowed me to see that ARP requests were being issued and answered around the moments when the network dropped out and when it came back. The ARP exchanges seemed to be in order (using the correct MACs), but I decided to check the facts as seen by the OS anyway, so I logged the ARP tables on host and guest with timestamps:

# while true; do date; arp -n; sleep 1; done > arp.log 2>&1 # on the host
[...]
Sun Jul 31 09:18:55 CEST 2016
Address                  HWtype  HWaddress           Flags Mask            Iface
10.70.70.10              ether   00:16:3e:46:46:0a   C                     brv3001
Sun Jul 31 09:18:56 CEST 2016
Address                  HWtype  HWaddress           Flags Mask            Iface
10.70.70.10              ether   00:16:3e:46:46:0a   C                     brv3001
# while true; do date; arp -n; sleep 1; done > arp.log 2>&1 # on the guest
Sun Jul 31 09:18:55 CEST 2016
Address                  HWtype  HWaddress           Flags Mask            Iface
10.70.0.1                ether   00:1e:68:4a:03:b0   C                     infra0
Sun Jul 31 09:18:56 CEST 2016
Address                  HWtype  HWaddress           Flags Mask            Iface
10.70.0.1                ether   c4:34:6b:22:b6:7c   C                     infra0

which allowed me to understand that the host did not have a faulty MAC for the guest, but the guest had somehow ended up with a faulty MAC for the host. Irritatingly, that was not reflected in the tcpdump output. (NB: there may be a race condition somewhere in libpcap or the IP stack that would merit investigation.)
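Scanning such a log by eye gets tedious when it runs for hours. As a rough sketch, assuming the log was produced by the `date; arp -n` loop above and that entries look like the ones shown (the function name `arp_log_changes` is my own invention), the MAC changes can be extracted like this:

```shell
# arp_log_changes (hypothetical helper): read a log of alternating `date`
# lines and `arp -n` tables on stdin, and report every time the MAC address
# recorded for an IP differs from the previous one seen for that IP.
arp_log_changes() {
    awk '
        # Rows of the ARP table start with an IPv4 address.
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
            if ($2 == "ether") {             # skip incomplete entries
                ip = $1
                mac = tolower($3)
                if ((ip in last) && last[ip] != mac)
                    print stamp ": " ip " changed " last[ip] " -> " mac
                last[ip] = mac
            }
            next
        }
        # Column header printed by `arp -n`.
        $1 == "Address" { next }
        # Any other non-empty line is the timestamp written by `date`.
        NF > 0 { stamp = $0 }
    '
}

# Example use:
#   arp_log_changes < arp.log
```

Run against the guest log above, this reports the switch from 00:1e:68:4a:03:b0 to c4:34:6b:22:b6:7c at 09:18:56.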

After finding the erroneous MAC, I looked up which vendor the erroneous MAC address belonged to, and thus was able to find the offending machine. If that information had been more ambiguous, I'm sure my switch would've had functionality to help me find the right switch port.
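The vendor lookup uses the first three bytes of the MAC address (the OUI). Here is a small sketch of that step. The helper names are hypothetical, and it assumes you have a local copy of the IEEE OUI list in its usual text form, where vendor lines start with the prefix written as `XX-XX-XX`. The file path in the usage comment is also an assumption (on Debian the `ieee-data` package may provide it).

```shell
# mac_oui (hypothetical helper): print the vendor prefix of a MAC address
# in the upper-case, dash-separated form used by the IEEE list.
mac_oui() {
    printf '%s\n' "$1" | tr 'a-f' 'A-F' | tr ':' '-' | cut -c1-8
}

# oui_vendor (hypothetical helper): look the prefix up in a local OUI file.
#   $1 = MAC address, $2 = path to the OUI list
oui_vendor() {
    grep -i "^$(mac_oui "$1")" "$2"
}

# Example use (path is an assumption, adjust to wherever your copy lives):
#   oui_vendor c4:34:6b:22:b6:7c /usr/share/ieee-data/oui.txt
```

The vendor name only narrows things down, of course. With several devices from the same manufacturer on the network you would still need the switch's MAC address table to find the port.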

I suppose that up/downgrading kernels and certain userland tools would change and maybe even remove all or some of the symptoms through changed timings, slightly different behavior, other network services being active etc. For example, a ping from guest to host would reliably "fix" the problem in my case.

Also, do not forget that the IP addresses you can see with ifconfig are not necessarily all of the IP addresses used by the system. ip addr ls is more comprehensive on Linux, and some of the more advanced iptables configurations could play a role too. If you are unlucky, the host responding to the ARP requests may even have a broken IP stack. You may even get ARP replies from other customers of your ISP if your network isn't properly isolated.
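To check whether a suspect address is configured anywhere on a machine, including secondary addresses that ifconfig may not show, the one-line output of `ip -o addr show` is easy to filter. This is just a sketch. The function name `ifaces_with_ip` is made up, and it assumes the usual `ip -o` layout with the interface name in the second column and `address/prefix` in the fourth.

```shell
# ifaces_with_ip (hypothetical helper): read `ip -o addr show` output on
# stdin and print the interface name for every entry whose address equals $1.
ifaces_with_ip() {
    awk -v want="$1" '
        $3 == "inet" || $3 == "inet6" {
            addr = $4
            sub(/\/.*$/, "", addr)           # strip the /prefix length
            if (addr == want)
                print $2
        }
    '
}

# Example use, run on each machine you suspect:
#   ip -o addr show | ifaces_with_ip 10.70.0.1
```

No output means the address is not configured on that machine, at least not in a way the kernel reports through ip.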

I realize that this might not be the exact solution to your problem, but I thought I'd leave some debugging pointers for the next person who searches for this issue and finds it on Server Fault.