Linux bonding: 802.3ad (LACP) vs. balance-alb mode

bonding, lacp, linux, networking

Here's the situation. I would like to connect my Linux servers to a single network using dual links for fault tolerance and load balancing. The servers have two or more 1-gig NICs, and I plan to connect each NIC to a different switch in a single stack (i.e. a single virtual switch). All switches are Juniper EX4200 or EX4500.

I know I can use any of the Linux bonding modes, and I wonder which one is best. Historically I used active-backup mode because some servers were connected to non-stacked switches, but now we have a new and consistent network, and I would like to use a bonding mode that offers load balancing in addition to fault tolerance.

I thought the best mode to use was 802.3ad (LACP) because that's the standard used on all network equipment. As it turns out, though, the moment I configure a set of ports as an LACP channel on the switch side, the connection breaks until I also configure the server side properly. This makes our system administration tasks much harder: before installing a new server we must remove the LACP configuration on the switch (because things like PXE boot and network installation do not work on LACP ports), and after the installation we must change the switch settings again, but only once the server has been configured to use LACP, or the connection will die.

Other bonding modes, such as balance-alb, do not require any special configuration on the switch side while, on paper, providing the same advantages.

Is there any reason to choose 802.3ad instead of balance-alb?

Best Answer

I'm not terribly familiar with Juniper switches, but you shouldn't have to configure LACP on them; that is the point of LACP. If this isn't the case, something is wrong with your switch configuration.

LACP only specifies a protocol for dynamically aggregating ports. It does not specify a port scheduling policy (which port traffic is sent and received on). This policy is set separately. I don't remember the process in Linux, but I know Linux supports a couple of different policies, probably similar to balance-alb.
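To make the idea of a scheduling policy concrete: in the Linux bonding driver, to my knowledge, the transmit policy for 802.3ad is chosen with the `xmit_hash_policy` option (`layer2`, `layer2+3`, `layer3+4`). The sketch below is a deliberately simplified layer2-style hash, not the kernel's exact formula. The function names and the MAC addresses are made up for illustration.

```python
def parse_mac(mac: str) -> bytes:
    """Turn 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def pick_port(src_mac: str, dst_mac: str, port_count: int) -> int:
    """Simplified layer2-style policy (an assumption, not the kernel code):
    XOR the last byte of both MAC addresses, then take the result modulo
    the number of ports in the bond."""
    src, dst = parse_mac(src_mac), parse_mac(dst_mac)
    return (src[5] ^ dst[5]) % port_count


# Hypothetical addresses: one server talking to two different peers.
server = "00:11:22:33:44:01"
print(pick_port(server, "00:11:22:33:44:02", 2))
print(pick_port(server, "00:11:22:33:44:03", 2))
```

The point is that the choice is a pure function of the frame's addresses: the same pair of hosts always lands on the same port, while different peers can be spread across ports.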

balance-alb has specific disadvantages, mainly that it semi-intelligently selects an outgoing port for each new connection, and the connection is then stuck on that one port for its lifetime. (The assignment is actually done by MAC, not by port: if a port fails, the MAC gets assigned to another port, which allows the connection to continue.)

This doesn't exactly "aggregate" the ports, however, as a connection cannot use more than one port. So if you've got two 1GbE ports, a single connection is still limited to 1 Gb/s. LACP usually resolves this, though it depends on your scheduling policy and the number of active ports supported at each end.
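The per-connection limit described above can be worked through with a little arithmetic. This is a toy model, assuming each connection is pinned to one port, every connection wants as much bandwidth as it can get, and connections sharing a port split it evenly. `LINK_GBPS` and `throughput` are names invented for this sketch.

```python
from __future__ import annotations

LINK_GBPS = 1.0  # each port is assumed to be a 1GbE link


def throughput(flow_ports: list[int], port_count: int) -> tuple[list[float], float]:
    """flow_ports[i] is the port that connection i is pinned to.
    Returns (per-connection Gb/s, total Gb/s across the bond)."""
    flows_per_port = [flow_ports.count(p) for p in range(port_count)]
    per_flow = [LINK_GBPS / flows_per_port[p] for p in flow_ports]
    return per_flow, sum(per_flow)


print(throughput([0], 2))      # one connection on a two-port bond
print(throughput([0, 1], 2))   # two connections, one per port
print(throughput([0, 0], 2))   # two connections that landed on the same port
```

In this model a single connection never exceeds 1 Gb/s no matter how many ports the bond has. The total only reaches 2 Gb/s when there are at least two connections and they end up on different ports.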