Linux – How to keep a bridge enabled on a bonded interface

Tags: bonding, bridge, centos6, linux, networking

I'm setting up a pair of CentOS 6.3 servers that will run a couple of KVM VMs, and I have come across a problem setting up a bridge on a bond.

I am using mode 4 (802.3ad) bonding between R320 servers and a pair of stacked Dell PowerConnect 5524 switches. There are 2 links (1 to each switch) that form a Link Aggregation Group (802.3ad / LACP bonding). On top of the bond I have VLAN tagging.
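
For reference, the layering described here (VLAN on the bond, bridge on the VLAN) would typically be expressed on CentOS 6 with two more ifcfg files. The following is only a sketch: the names bond0.8 and br1 are taken from the log line and brctl command elsewhere in this question, and the remaining values are assumptions, not the actual files from these servers.

```shell
# /etc/sysconfig/network-scripts/ifcfg-bond0.8
# (assumed contents: VLAN 8 on top of bond0, enslaved to bridge br1)
DEVICE=bond0.8
VLAN=yes
ONBOOT=yes
BOOTPROTO=none
BRIDGE=br1

# /etc/sysconfig/network-scripts/ifcfg-br1
# (assumed contents: the bridge the KVM guests attach to)
DEVICE=br1
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=none
```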

I've verified that the problem also occurs with several other bonding modes, so it isn't just a mode 4 issue.

I am testing what happens when 1 link is dropped (i.e. a switch dies, a cable breaks, etc.).

If I don't have a bridge (for KVM), everything works fine and failover happens as expected.

If I have the bridge enabled, it works fine until failover (unplugging a cable). When failover happens, /var/log/messages shows the slave link going down, followed within a second by:

kernel: br1: port 1(bond0.8) entering disabled state
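
To see how closely this message follows the slave going down, it can help to pull only the bridge port transitions out of the log. A minimal sketch, assuming the messages all have the same shape as the line above; bridge_port_events is a hypothetical helper name, not an existing tool.

```shell
# Print bridge port state changes from a syslog file.
# Usage: bridge_port_events [logfile]
#   logfile defaults to /var/log/messages; "-" reads standard input.
bridge_port_events() {
    grep -E 'br[0-9]+: port [0-9]+\([^)]+\) entering [a-z]+ state' \
        "${1:-/var/log/messages}"
}
```

Running `bridge_port_events /var/log/messages | tail` right after pulling a cable should show the "entering disabled state" line next to any later transitions when the cable is plugged back in.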

However, /proc/net/bonding/bond0 shows that the bond is still up as expected (just with 1 slave instead of 2). If I plug the cable back in, it recovers and brings the bridge back to an enabled state.
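
Since the bond reports itself as up while the bridge disagrees, one way to narrow this down is to compare what the kernel reports for each layer of the stack. The sketch below reads the standard carrier and operstate attributes from sysfs; link_state and the SYSFS_ROOT override are hypothetical names introduced here for illustration, and the device names are the ones used in this question.

```shell
# Root of sysfs; overridable so the function can be tried against a copy.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print carrier and operstate for each interface named on the command line.
# A "?" means the attribute could not be read (for example, the device
# does not exist or is administratively down).
link_state() {
    for dev in "$@"; do
        dir="$SYSFS_ROOT/class/net/$dev"
        carrier=$(cat "$dir/carrier" 2>/dev/null || echo '?')
        operstate=$(cat "$dir/operstate" 2>/dev/null || echo '?')
        echo "$dev carrier=$carrier operstate=$operstate"
    done
}
```

Calling `link_state bond0 bond0.8 br1` before and after unplugging a cable shows which layer, if any, reports a lost carrier even though the bond still has one working slave.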

I have actually tested this while a ping is running: if the timing is right, a packet will leave the system after the link is lost but before the disabled message appears.

I assumed this disabled state was caused by STP, but I have disabled STP in the bridge configuration and the issue still occurs.

brctl showstp br1 

still shows the port as disabled when the bond is running with one slave missing.
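
The same information is available from sysfs, which makes it easy to confirm that STP really is off while the port is disabled. A rough sketch: the stp_state and per-port state attributes are standard bridge sysfs files, but bridge_port_report, port_state_name and SYSFS_ROOT are hypothetical names used only for this example. As far as I know, "disabled" is also the state the bridge assigns to a port whose underlying device has lost carrier, so it can appear with STP switched off.

```shell
# Root of sysfs; overridable so the functions can be tried against a copy.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Translate the numeric bridge port state from sysfs into its name.
port_state_name() {
    case "$1" in
        0) echo disabled ;;
        1) echo listening ;;
        2) echo learning ;;
        3) echo forwarding ;;
        4) echo blocking ;;
        *) echo unknown ;;
    esac
}

# Usage: bridge_port_report BRIDGE PORT
# stp_state is 0 when STP is disabled on the bridge.
bridge_port_report() {
    br="$SYSFS_ROOT/class/net/$1"
    stp=$(cat "$br/bridge/stp_state")
    state=$(cat "$br/brif/$2/state")
    echo "$1 stp_state=$stp $2=$(port_state_name "$state")"
}
```

For the setup in this question the call would be `bridge_port_report br1 bond0.8`.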

I also switched between the NICs in the server (I have 2x Broadcom and 4x Intel). The problem occurs regardless of which ones I use.

Does anyone know of a way to force the bridge to stay enabled, or why it's detecting the bond as disabled when it isn't?

Best Answer

I've run into exactly the same issue with Fedora 16 on 2 x Dell R410s and a stacked pair of PowerConnect 6448s.

The setup is a bridged interface on top of an 802.3ad bond.

I'm experiencing exactly the same symptoms.

Here are the config files:

cat /etc/modprobe.d/bonding.conf

alias netdev-bond0 bonding
alias netdev-bond1 bonding
alias netdev-bond2 bonding

cat /proc/net/bonding/bond0

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
    Aggregator ID: 23
    Number of ports: 2
    Actor Key: 17
    Partner Key: 629
    Partner Mac Address: 00:21:9b:b2:08:40

Slave Interface: em1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1e:c9:fd:f1:5e
Aggregator ID: 23
Slave queue ID: 0

Slave Interface: em2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1e:c9:fd:f1:60
Aggregator ID: 23
Slave queue ID: 0
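
When repeating the failover test, it is handy to reduce this output to one line per slave. A small sketch using awk; bond_slave_status is a hypothetical helper name, and it only assumes the "Slave Interface:" and "MII Status:" lines shown above.

```shell
# Print "<slave> <MII status>" for every slave in a bonding status file.
# Usage: bond_slave_status [file]
#   file defaults to /proc/net/bonding/bond0; "-" reads standard input.
bond_slave_status() {
    awk -F': ' '
        # Remember the slave name; its MII Status line follows.
        /^Slave Interface:/ { slave = $2; next }
        # Skip the MII Status line of the bond itself (no slave seen yet).
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

With one cable unplugged, the affected slave should be listed as down while the other stays up, which can then be compared with the bridge port state at the same moment.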

cat /etc/sysconfig/network-scripts/ifcfg-br0

DEVICE=br0
ONBOOT=yes
TYPE=Bridge
BOOTPROTO=none
IPADDR=10.100.100.101
NETMASK=255.255.255.0
IPV6INIT=no
IPV6_AUTOCONF=no
DHCPV6=no
IPV6ADDR=fe80::21e:c9ff:fefd:f15e/64

cat /etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0
USERCTL=no
BOOTPROTO=none
ONBOOT=yes
BONDING_OPTS="miimon=100 mode=4 lacp_rate=1 xmit_hash_policy=1"
BRIDGE=br0
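
To confirm that the driver actually picked up these BONDING_OPTS, the values can be read back from the bond's sysfs directory. A minimal sketch; the four attribute names are the standard bonding sysfs files matching the options above, while bond_options and SYSFS_ROOT are hypothetical names introduced for this example.

```shell
# Root of sysfs; overridable so the function can be tried against a copy.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the live value of each option set in BONDING_OPTS.
# Usage: bond_options [BOND]   (defaults to bond0)
bond_options() {
    dir="$SYSFS_ROOT/class/net/${1:-bond0}/bonding"
    for opt in mode miimon lacp_rate xmit_hash_policy; do
        echo "$opt=$(cat "$dir/$opt")"
    done
}
```

The values should agree with the /proc/net/bonding/bond0 output above, which reports 802.3ad mode, a 100 ms MII polling interval, a fast LACP rate and the layer3+4 hash policy.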

cat /etc/sysconfig/network-scripts/ifcfg-em1

DEVICE=em1
HWADDR=00:1E:C9:FD:F1:5E
ONBOOT=yes
MASTER=bond0
SLAVE=yes
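
Only the em1 file is shown here. The second slave presumably mirrors it; the sketch below is an assumption rather than the actual file, with the HWADDR taken from the "Permanent HW addr" that /proc/net/bonding/bond0 reports for em2.

```shell
# /etc/sysconfig/network-scripts/ifcfg-em2
# (assumed contents, mirroring ifcfg-em1)
DEVICE=em2
HWADDR=00:1E:C9:FD:F1:60
ONBOOT=yes
MASTER=bond0
SLAVE=yes
```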
