If I use a manual setup on the command line (following the kernel instructions), I can properly set up my network connection:
# modprobe bonding mode=4 miimon=100
# ifconfig bond0 up
# ip link set eno1 master bond0
# ip link set eno2 master bond0
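After these commands, the bonding driver exposes the bond's negotiated state through a status file in /proc; checking it is a quick way to confirm the manual setup actually worked (this verification step is my addition, not part of the kernel instructions):

```shell
# Inspect bond0's state as reported by the bonding driver.
# Assumes the modprobe/ip commands above have been run; on a machine
# without an active bond the status file simply won't exist.
if [ -r /proc/net/bonding/bond0 ]; then
    # Mode, link state, and per-slave details for the 802.3ad bond
    grep -E 'Bonding Mode|MII Status|Slave Interface' /proc/net/bonding/bond0
else
    echo "bond0 not present"
fi
```

With the bond up, the "Bonding Mode" line should read "IEEE 802.3ad Dynamic link aggregation" and each slave should show "MII Status: up".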
For the record, the switch used is a Cisco Nexus 2248, and I do not specify an IP address because there's an additional 802.1q layer (whose presence or absence in the configuration file has no impact on the problem).
The problem is that I'm unable to create a correct /etc/network/interfaces
file to have this done automatically at boot time. There is a lot of confusion online between the different versions of the ifenslave package, notably its documentation, and on how to avoid race conditions when using ifup. Whatever worked with the previous versions of Ubuntu does not anymore. And I wouldn't be surprised if systemd were making things even more messy. Basically, whatever I try, my scripts get stuck at boot time and I have to wait either one or five minutes before the boot process completes.
This is the best that I could achieve:
auto lo
iface lo inet loopback

allow-bond0 eno1
iface eno1 inet manual
    bond-master bond0

allow-bond0 eno2
iface eno2 inet manual
    bond-master bond0

auto bond0
iface bond0 inet manual
    bond-mode 4
    bond-slaves eno1 eno2
    bond-miimon 100
At boot time, bringing up bond0 stalls for one minute (bond0 waits for at least one of its slaves to be brought up, which never happens, so it times out), but once the system has booted, running ifup eno1 works and bond0 starts working properly.
If I specify auto eno1 instead, the boot process stalls for five minutes, bond0 is never brought up properly, and trying to run ifdown eno1 gets stuck waiting for some lock in /run/network/wherever (can't remember the exact file, and I have rebooted this machine often enough already), which seems to confirm that yes, I ran into a race condition and ifup is stuck forever on eno1.
Does anyone have a working solution on the latest Ubuntu?
Best Answer
I have a working setup running on 16.04 (Linux 4.4.0-22) that is very similar.
Apart from the LACP rate and 1G (eno1+) vs 10G SFP+ (eno49+), the biggest difference seems to be the use of auto bond0. Some of these options may be redundant.
I'm not seeing any stalls during boot. Doing a systemctl restart networking yields a short wait of a few seconds, but nothing more.
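The exact configuration from this answer wasn't preserved here; as a rough sketch of what an /etc/network/interfaces along the lines it describes could look like (using the questioner's interface names rather than eno49+, with auto on every stanza; the lacp-rate value and the exact option set are assumptions on my part):

```
auto eno1
iface eno1 inet manual
    bond-master bond0

auto eno2
iface eno2 inet manual
    bond-master bond0

auto bond0
iface bond0 inet manual
    bond-mode 802.3ad
    bond-miimon 100
    bond-lacp-rate fast
    bond-slaves eno1 eno2
```

Here bond-mode 802.3ad is the ifenslave spelling of mode 4, and the key difference from the question's version is auto bond0 together with auto on both slaves, rather than allow-bond0 stanzas.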