Docker Network Timeouts When Using Bridge

Tags: docker, docker-compose, linux, ubuntu

I'm running a dedicated server with Ubuntu 20.04.3 LTS (kernel 5.4.0-96-generic) and Docker 20.10.7, build 20.10.7-0ubuntu5~20.04.2. The system is a fresh install.

I have a Dockerfile for one of my services, which pulls in some libraries with apt and go get. In every build, one of the intermediate containers fails to reach the internet, with either DNS or TCP timeout errors. Which container fails varies from build to build.

The problem is not specific to one service: I also tried building a completely different service that runs on Node.js, and npm install failed with the same errors.

Today my Nginx container also became unreachable: all connections to it resulted in timeout errors.

Connections between containers using docker networks also don't work correctly.

Running sudo systemctl restart docker temporarily fixes the problem, but it reappears one or two builds down the line. When I build with the host network instead of the default bridge network, the problem is gone, which is why I suspected a faulty bridge config.

I've tried reinstalling Docker, resetting the iptables and bridge configs, setting different DNS servers, to no avail. The docker log files show no errors.

What could be the cause of this issue?

Update:

I've disabled UFW, without success.
This is the dmesg output from a build that timed out; maybe it helps identify the cause:

[758001.967161] docker0: port 1(vethd0c7887) entered blocking state
[758001.967165] docker0: port 1(vethd0c7887) entered disabled state
[758001.967281] device vethd0c7887 entered promiscuous mode
[758002.000567] IPv6: ADDRCONF(NETDEV_CHANGE): veth7e3840a: link becomes ready
[758002.000621] IPv6: ADDRCONF(NETDEV_CHANGE): vethd0c7887: link becomes ready
[758002.000644] docker0: port 1(vethd0c7887) entered blocking state
[758002.000646] docker0: port 1(vethd0c7887) entered forwarding state
[758002.268554] docker0: port 1(vethd0c7887) entered disabled state
[758002.269581] eth0: renamed from veth7e3840a
[758002.293056] docker0: port 1(vethd0c7887) entered blocking state
[758002.293063] docker0: port 1(vethd0c7887) entered forwarding state
[758041.497891] docker0: port 1(vethd0c7887) entered disabled state
[758041.497997] veth7e3840a: renamed from eth0
[758041.547558] docker0: port 1(vethd0c7887) entered disabled state
[758041.551998] device vethd0c7887 left promiscuous mode
[758041.552008] docker0: port 1(vethd0c7887) entered disabled state

Best Answer

If dmesg shows lines like this:

[15300.615904] neighbour: arp_cache: neighbor table overflow!

then the kernel's neighbour (ARP) table is full. Try raising its garbage-collection thresholds:

sudo sysctl -w net.ipv4.neigh.default.gc_thresh3=30000
sudo sysctl -w net.ipv4.neigh.default.gc_thresh2=20000
sudo sysctl -w net.ipv4.neigh.default.gc_thresh1=10000
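
To check whether the overflow message applies to you, and to keep the new thresholds across reboots (values set with sysctl -w do not survive one), something along these lines may help. This is only a sketch: the function names and the drop-in file name 90-neigh-gc.conf are my own placeholders, not anything Docker or Ubuntu ships, and the values simply repeat the ones above.

```shell
#!/bin/sh
# Sketch only: helper names and the drop-in file name are hypothetical.

# Count kernel log lines that report a neighbour (ARP) table overflow.
# Reads log text on stdin so it works with live dmesg output or a saved file.
count_neigh_overflow() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps scripts using "set -e" running.
    grep -c 'neighbor table overflow' || true
}

# Print a sysctl drop-in with the three thresholds from the commands above.
neigh_sysctl_conf() {
    cat <<'EOF'
net.ipv4.neigh.default.gc_thresh1 = 10000
net.ipv4.neigh.default.gc_thresh2 = 20000
net.ipv4.neigh.default.gc_thresh3 = 30000
EOF
}

# Example usage (run manually, needs root):
#   sudo dmesg | count_neigh_overflow
#   neigh_sysctl_conf | sudo tee /etc/sysctl.d/90-neigh-gc.conf
#   sudo sysctl --system
```

A non-zero count from the first command suggests the thresholds are worth raising. If it prints 0, the timeouts probably have a different cause and these settings are unlikely to help.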