CentOS – In a LACP bond, do all partner interfaces need to have the same “oper key”?

bonding centos lacp

We use LACP (mode 4) bonds extensively in our environment, and I occasionally run into problems with new deployments where cables get crossed or switch ports are misconfigured, causing bad LACP port states.

One thing that I've been using to troubleshoot is the value of the partner oper key. These generally tend to match, and when they don't, it makes me suspect a possible crossed cable problem. I've been trying to research it, but have had a hard time finding a definitive answer. So, is it reasonable to expect the partner oper keys across a LACP channel group to always be the same, or are there cases where they might differ in a correctly configured group?

For example:

# grep -A6 "partner lacp pdu" /proc/net/bonding/bond0
details partner lacp pdu:
    system priority: 32768
    system mac address: 70:e4:23:92:42:b7
    oper key: 205
    port priority: 32768
    port number: 92
    port state: 61
--
details partner lacp pdu:
    system priority: 32768
    system mac address: 70:e4:23:92:42:b7
    oper key: 206
    port priority: 32768
    port number: 94
    port state: 13

In this example, I know the state of the 2nd partner is bad – I'm just trying to come up with a good way of determining "why" it's bad.
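
The "port state" values in the output above are bitmasks, so decoding them is one way to see what differs between the good and the bad partner port. The following is a minimal sketch, assuming the flag layout of the IEEE 802.1AX actor/partner state octet (bit 0 is LACP activity, bit 7 is expired); the names `LACP_STATE_BITS` and `decode_port_state` are made up for this illustration.

```python
# Flags of the LACP port state octet, lowest bit first
# (assumed layout: IEEE 802.1AX actor/partner state field).
LACP_STATE_BITS = [
    (0x01, "activity"),
    (0x02, "timeout"),
    (0x04, "aggregation"),
    (0x08, "synchronization"),
    (0x10, "collecting"),
    (0x20, "distributing"),
    (0x40, "defaulted"),
    (0x80, "expired"),
]


def decode_port_state(state):
    """Return the names of the flags that are set in a port state value."""
    return [name for mask, name in LACP_STATE_BITS if state & mask]


# The two partner port states from the example above.
for value in (61, 13):
    print(value, decode_port_state(value))
```

Under this decoding, 61 has the collecting and distributing flags set and 13 does not, which shows what the second partner port is missing but not, by itself, why.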

Best Answer

I just logged into 400 servers, all using LACP mode 4. Each has two interfaces, 25G up/down each, for 50G total. The switch side is 2x Cisco 9600, with both ports combined in an LACP port channel. One cable goes into a different switch so that we have power, switch, cable, rack and interface redundancy.

The oper key is the same across the board.

I am including a working bond below.

Several things come to mind regarding this part of your question:

"One thing that I've been using to troubleshoot is the value of the partner oper key. These generally tend to match, and when they don't, it makes me suspect a possible crossed cable problem."

This could be addressed with a standard cabling practice. All of our cables that run down the left side of the rack plug into the left side of the switch (or, in this case, one rack over), and all the cables that run down the right side go to the right side of the switch. So server 1 has its cables in port 1 and port 48, and server 5 would be port 43 and port 5. This gives you a standard model to count from: easy to track, easy to communicate.

Another thought: we use MAC addresses to track down LACP members. I can use radssh + racadm (out-of-band access) or radssh (over ssh) to log into all my servers in bulk and pull the list of MAC addresses (not bond0's, we want the actual members). Hand that completed list of MAC addresses to the network team and have them compare their list of members to it.
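
Once the text of /proc/net/bonding/bond0 has been collected from each server, the member MAC addresses can be pulled out of it with a few lines of Python. This is a minimal sketch, assuming the "Slave Interface" and "Permanent HW addr" lines look like those in the bonding driver output further down; the function name `permanent_macs`, the interface names and the MAC addresses in `SAMPLE` are made up for illustration.

```python
# Hypothetical excerpt of /proc/net/bonding/bond0; interface names and MAC
# addresses are invented for this example.
SAMPLE = """\
Slave Interface: eth0
MII Status: up
Permanent HW addr: 00:00:5E:00:53:01

Slave Interface: eth1
MII Status: up
Permanent HW addr: 00:00:5e:00:53:02
"""


def permanent_macs(bonding_text):
    """Map each slave interface name to its permanent hardware address."""
    macs = {}
    current = None
    for line in bonding_text.splitlines():
        line = line.strip()
        if line.startswith("Slave Interface:"):
            # Remember which slave the following lines belong to.
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Permanent HW addr:") and current:
            # Normalize to lowercase so the lists are easy to compare.
            macs[current] = line.split(":", 1)[1].strip().lower()
    return macs


for iface, mac in permanent_macs(SAMPLE).items():
    print(iface, mac)
```

On a real server the text would come from reading /proc/net/bonding/bond0 instead of `SAMPLE`; lowercasing the addresses is only a convenience for comparing against whatever format the network team uses.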

sudo cat /etc/sysconfig/network-scripts/ifcfg-bond0 
DEVICE=bond0
NAME=bond0
#NM_CONTROLLED=no
IPADDR=$SERVER_IP
PREFIX=22
GATEWAY=$GATEWAY_IP
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="miimon=100 mode=4 lacp_rate=1 xmit_hash_policy=layer3+4"

sudo cat /etc/sysconfig/network-scripts/ifcfg-enp10s0f0
DEVICE=enp10s0f0
TYPE="Ethernet"
BOOTPROTO="none"
ONBOOT="yes"
MASTER=bond0
SLAVE=yes
##HWADDR=<MAC>:2C:6C
#DEFROUTE="yes"
#PEERDNS="yes"
#PEERROUTES="yes"
IPV4_FAILURE_FATAL="no"
IPV6_FAILURE_FATAL="no"
NAME="enp10s0f0"

sudo cat /proc/net/bonding/bond0

  Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

  Bonding Mode: IEEE 802.3ad Dynamic link aggregation
  Transmit Hash Policy: layer3+4 (1)
  MII Status: up
  MII Polling Interval (ms): 100
  Up Delay (ms): 0
  Down Delay (ms): 0

  802.3ad info
  LACP rate: fast
  Min links: 0
  Aggregator selection policy (ad_select): stable
  Active Aggregator Info:
    Aggregator ID: 2
    Number of ports: 2
    Actor Key: 1
    Partner Key: 32875
    Partner Mac Address: <MAC>:be:03

  Slave Interface: enp10s0f0
  MII Status: up
  Speed: 25000 Mbps
  Duplex: full
  Link Failure Count: 7
  Permanent HW addr: <MAC>:ea:7c
  Slave queue ID: 0
  Aggregator ID: 2
  Actor Churn State: none
  Partner Churn State: none
  Actor Churned Count: 6
  Partner Churned Count: 6
  details actor lacp pdu:
      system priority: 65535
      port key: 1
      port priority: 255
      port number: 1
      port state: 63
  details partner lacp pdu:
      system priority: 32667
      oper key: 32875
      port priority: 32768
      port number: 263
      port state: 60

  Slave Interface: p8p2
  MII Status: up
  Speed: 25000 Mbps
  Duplex: full
  Link Failure Count: 7
  Permanent HW addr: <MAC>:ea:7d
  Slave queue ID: 0
  Aggregator ID: 2
  Actor Churn State: none
  Partner Churn State: none
  Actor Churned Count: 5
  Partner Churned Count: 5
  details actor lacp pdu:
      system priority: 65535
      port key: 1
      port priority: 255
      port number: 2
      port state: 63
  details partner lacp pdu:
      system priority: 32667
      oper key: 32875
      port priority: 32768
      port number: 16647
      port state: 60
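
The comparison of partner oper keys can also be done mechanically rather than by eye. The sketch below groups the slaves of a bond by the oper key in their partner LACPDU details, assuming the /proc/net/bonding layout shown above; `SAMPLE` is a trimmed excerpt of that listing, and the function names `partner_oper_keys` and `group_by_key` are invented for this example.

```python
from collections import defaultdict

# Trimmed excerpt of the /proc/net/bonding/bond0 listing above.
SAMPLE = """\
Slave Interface: enp10s0f0
details actor lacp pdu:
    port key: 1
    port state: 63
details partner lacp pdu:
    oper key: 32875
    port number: 263
    port state: 60

Slave Interface: p8p2
details actor lacp pdu:
    port key: 1
    port state: 63
details partner lacp pdu:
    oper key: 32875
    port number: 16647
    port state: 60
"""


def partner_oper_keys(bonding_text):
    """Map each slave interface to the oper key reported by its partner."""
    keys = {}
    current = None
    in_partner = False
    for line in bonding_text.splitlines():
        line = line.strip()
        if line.startswith("Slave Interface:"):
            current = line.split(":", 1)[1].strip()
            in_partner = False
        elif line.startswith("details partner lacp pdu"):
            in_partner = True
        elif line.startswith("details actor lacp pdu"):
            in_partner = False
        elif in_partner and current and line.startswith("oper key:"):
            keys[current] = int(line.split(":", 1)[1])
    return keys


def group_by_key(keys):
    """Group interface names by partner oper key."""
    groups = defaultdict(list)
    for iface, key in keys.items():
        groups[key].append(iface)
    return dict(groups)


groups = group_by_key(partner_oper_keys(SAMPLE))
print(groups)
if len(groups) > 1:
    print("partner oper keys differ: check cabling and the switch port-channel config")
```

For the bond above, both slaves land in a single group. More than one group means the switch side is advertising different keys on the member links, which is worth checking against the cabling and port-channel configuration before drawing conclusions.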