I recently ran into a problem in an HPE Distributed Trunking switch-to-switch square topology.
Image 1 shows the essential part of the architecture, which consists of 4 switches working in pairs: A1 and A2 form the first pair, B1 and B2 the second. Both pairs connect switches, servers and clients (not fully drawn for ease of reading).
During normal operation, A1, A2, B1 and B2 can all ping both 192.168.5.47 and 192.168.6.95, which are used here as "test" hosts.
Occasionally one of the two hosts becomes unreachable ONLY FROM A2, as shown in Image 2.
Normal operation is restored by issuing clear mac-address Trk1 on switch A2.
This is the relevant output from A2 during normal operation:
A2# ping 192.168.5.47
192.168.5.47 is alive, time = 3 ms
A2# ping 192.168.6.95
192.168.6.95 is alive, time = 3 ms
A2# show arp
IP ARP table
192.168.5.47 aabbcc-111111 dynamic Trk10
192.168.6.95 aabbcc-222222 dynamic Trk10
A2# show mac-address vlan 288
aabbcc-111111 Trk10
aabbcc-222222 Trk10
After the issue occurs, this is the relevant output from A2:
A2# ping 192.168.5.47
**Request timed out.**
A2# ping 192.168.6.95
192.168.6.95 is alive, time = 3 ms
A2# show arp
192.168.6.95 aabbcc-222222 dynamic Trk10
A2# show mac-address vlan 288
aabbcc-111111 Trk1
aabbcc-222222 Trk10
I've noticed that the affected host (192.168.5.47) is learned by A2 on Trk1, whereas it should be on Trk10. Its ARP entry is also missing on A2.
The only way to restore normal operation is to issue clear mac-address Trk1 on switch A2.
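To make the symptom easier to spot, here is a minimal Python sketch (not part of the setup described here) that parses pasted "show mac-address vlan 288" output and flags any test-host MAC learned on a port other than the expected one. The EXPECTED_PORT table and the function names are assumptions made for illustration. The two-column line format is taken from the output above; real switch output may contain extra header lines, which the parser simply skips.

```python
import re

# Expected port per test-host MAC, taken from the "normal operation" output.
# This mapping is an assumption for illustration; adjust it to your hosts.
EXPECTED_PORT = {
    "aabbcc-111111": "Trk10",
    "aabbcc-222222": "Trk10",
}

# Matches lines such as "aabbcc-111111 Trk10" (MAC, then port).
MAC_LINE = re.compile(r"^\s*([0-9a-f]{6}-[0-9a-f]{6})\s+(\S+)\s*$", re.IGNORECASE)


def parse_mac_table(text):
    """Return {mac: port} from pasted 'show mac-address vlan N' output."""
    table = {}
    for line in text.splitlines():
        match = MAC_LINE.match(line)
        if match:  # prompts and header lines do not match and are skipped
            table[match.group(1).lower()] = match.group(2)
    return table


def misplaced_macs(text, expected=EXPECTED_PORT):
    """Return {mac: (expected_port, learned_port)} for every mismatch."""
    learned = parse_mac_table(text)
    return {
        mac: (port, learned[mac])
        for mac, port in expected.items()
        if mac in learned and learned[mac] != port
    }


# Output captured on A2 after the issue occurs (copied from above).
FAILED_OUTPUT = """\
A2# show mac-address vlan 288
aabbcc-111111 Trk1
aabbcc-222222 Trk10
"""

if __name__ == "__main__":
    for mac, (want, got) in misplaced_macs(FAILED_OUTPUT).items():
        print(f"{mac}: expected {want}, learned on {got}")
```

Run against the failed-state output, this prints one line for aabbcc-111111 (expected Trk10, learned on Trk1). It only reports the mismatch; it does not log in to the switch or clear anything.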
Can anyone suggest one or more possible reasons for this behavior?
The relevant configurations of A1, A2, B1 and B2 follow.
A1:
A1# sh run
Running configuration:
; J9850A Configuration Editor; Created on release #KB.16.03.0004
; Ver #10:08.7f.ff.bb.ff.7c.59.fc.7b.ff.ff.fc.ff.ff.3f.ef:52
hostname "A1"
module A type j9987a
module B type j9990a
module D type j9989a
module F type j9993a
no fault-finder broadcast-storm
no fault-finder bad-driver
no fault-finder bad-transceiver
no fault-finder bad-cable
no fault-finder too-long-cable
no fault-finder over-bandwidth
no fault-finder loss-of-link
no fault-finder duplex-mismatch-hdx
no fault-finder duplex-mismatch-fdx
no fault-finder link-flap
trunk B24,F8 trk1 lacp
trunk D1 trk2 dt-trunk
trunk A21-A22 trk3 dt-trunk
trunk B1-B2 trk4 dt-trunk
trunk D3 trk5 dt-lacp
trunk D22 trk10 dt-lacp
trunk B19,D11 trk11 lacp
trunk D23-D24 trk21 lacp
trunk B20,D12 trk144 lacp
mac-age-time 60
timesync sntp
sntp unicast
no telnet-server
telnet-server listen data
time daylight-time-rule western-europe
time timezone 60
web-management listen data
ip arp-age 1
ip ssh listen data
ip route 0.0.0.0 0.0.0.0 [...]
ip routing
switch-interconnect trk1
[...]
oobm
disable
interface disable
no ip address
exit
router vrrp
virtual-ip-ping
ipv4 enable
nonstop
exit
vlan 1
name [...]
no untagged [...],Trk10,Trk144
untagged Trk1, [...]
no ip address
jumbo
disable layer3
exit
[...]
vlan 288
name "[…]"
untagged […],Trk10
tagged […],Trk1
ip address 192.168.4.245 255.255.252.0
ip helper-address 192.168.0.9
jumbo
vrrp vrid 2
virtual-ip-address 192.168.4.244
priority 101
enable
exit
exit
[...]
vlan 4094
name "IT-ICS-Keepalive"
untagged Trk144
ip address 172.31.255.1 255.255.255.252
exit
spanning-tree
spanning-tree Trk1 priority 4
[...]
spanning-tree Trk10 priority 4 bpdu-filter
[...]
spanning-tree Trk144 priority 4
spanning-tree pathcost mstp 8021d
[...]
spanning-tree priority 0 force-version rstp-operation
distributed-trunking peer-keepalive vlan 4094
distributed-trunking peer-keepalive destination 172.31.255.2
distributed-trunking role-priority 1
[...]
A2:
A2# sh run
Running configuration:
; J9850A Configuration Editor; Created on release #KB.16.03.0004
; Ver #10:08.7f.ff.bb.ff.7c.59.fc.7b.ff.ff.fc.ff.ff.3f.ef:52
hostname "A2"
module A type j9987a
module B type j9990a
module D type j9989a
module F type j9993a
no fault-finder broadcast-storm
no fault-finder bad-driver
no fault-finder bad-transceiver
no fault-finder bad-cable
no fault-finder too-long-cable
no fault-finder over-bandwidth
no fault-finder loss-of-link
no fault-finder duplex-mismatch-hdx
no fault-finder duplex-mismatch-fdx
no fault-finder link-flap
trunk B24,F8 trk1 lacp
trunk D1 trk2 dt-trunk
trunk A21-A22 trk3 dt-trunk
trunk B1-B2 trk4 dt-trunk
trunk D3 trk5 dt-lacp
trunk D22 trk10 dt-lacp
trunk B19,D11 trk12 lacp
trunk D23-D24 trk22 lacp
trunk B20,D12 trk144 lacp
mac-age-time 60
timesync sntp
sntp unicast
no telnet-server
telnet-server listen data
time daylight-time-rule western-europe
time timezone 60
web-management listen data
ip arp-age 1
ip ssh listen data
ip route 0.0.0.0 0.0.0.0 [...]
ip routing
switch-interconnect trk1
[...]
oobm
disable
interface disable
no ip address
exit
router vrrp
virtual-ip-ping
ipv4 enable
nonstop
exit
vlan 1
name [...]
no untagged [...],Trk10,Trk144
untagged Trk1, [...]
no ip address
jumbo
disable layer3
exit
[...]
vlan 288
name "[…]"
untagged B14-B18,D8,Trk10
tagged B21,Trk1,Trk12,Trk22
ip address 192.168.4.246 255.255.252.0
ip helper-address 192.168.0.9
jumbo
vrrp vrid 2
virtual-ip-address 192.168.4.244
priority 99
enable
exit
exit
[…]
vlan 4094
name "IT-ICS-Keepalive"
untagged Trk144
ip address 172.31.255.2 255.255.255.252
exit
spanning-tree
spanning-tree Trk1 priority 4
[...]
spanning-tree Trk10 priority 4 bpdu-filter
[...]
spanning-tree Trk144 priority 4
spanning-tree pathcost mstp 8021d
[...]
spanning-tree priority 1 force-version rstp-operation
distributed-trunking peer-keepalive vlan 4094
distributed-trunking peer-keepalive destination 172.31.255.1
distributed-trunking role-priority 2
[...]
B1:
B1# sh run
Running configuration:
; J8697A Configuration Editor; Created on release #K.16.02.0019
; Ver #10:08.01.81.30.02.34.59.2c.6b.ff.f7.fc.7f.ff.3f.ef:24
hostname "B1"
module 1 type j9548a
module 6 type j9537a
trunk A21-A22 trk1 lacp
trunk F23 trk10 dt-lacp
trunk A19-A20 trk144 lacp
[…]
mac-age-time 60
max-vlans 2048
timesync sntp
sntp unicast
[…]
time daylight-time-rule western-europe
time timezone 60
ip arp-age 1
ip default-gateway […]
switch-interconnect trk1
[…]
vlan 1
name "DEFAULT_VLAN"
no untagged […], Trk144
untagged […],Trk1,Trk10
ip address 192.168.4.242 255.255.252.0
jumbo
exit
[…]
vlan 4094
name "IT-ICS-Keepalive"
untagged Trk144
ip address 172.31.255.1 255.255.255.252
exit
spanning-tree
[…]
spanning-tree Trk1 priority 4
spanning-tree Trk10 priority 4 bpdu-filter
spanning-tree Trk144 priority 4
no spanning-tree bpdu-throttle
spanning-tree priority 0 force-version rstp-operation
[…]
distributed-trunking peer-keepalive vlan 4094
distributed-trunking peer-keepalive destination 172.31.255.2
distributed-trunking role-priority 1
[…]
B2:
B2# sh run
Running configuration:
; J8697A Configuration Editor; Created on release #K.16.02.0019
; Ver #10:08.01.81.30.02.34.59.2c.6b.ff.f7.fc.7f.ff.3f.ef:24
hostname "B2"
module 1 type j9548a
module 6 type j9537a
trunk A21-A22 trk1 lacp
trunk F23 trk10 dt-lacp
trunk A19-A20 trk144 lacp
[…]
mac-age-time 60
max-vlans 2048
timesync sntp
sntp unicast
[…]
time daylight-time-rule western-europe
time timezone 60
ip arp-age 1
ip default-gateway […]
switch-interconnect trk1
[…]
vlan 1
name "DEFAULT_VLAN"
no untagged […], Trk144
untagged […],Trk1,Trk10
ip address 192.168.4.243 255.255.252.0
jumbo
exit
[…]
vlan 4094
name "IT-ICS-Keepalive"
untagged Trk144
ip address 172.31.255.2 255.255.255.252
exit
spanning-tree
spanning-tree Trk1 priority 4
spanning-tree Trk10 priority 4 bpdu-filter
spanning-tree Trk144 priority 4
no spanning-tree bpdu-throttle
spanning-tree priority 1 force-version rstp-operation
[…]
distributed-trunking peer-keepalive vlan 4094
distributed-trunking peer-keepalive destination 172.31.255.1
distributed-trunking role-priority 2
[…]
Best Answer
I've also encountered this problem twice already. Unfortunately, HPE support was not very helpful. My MAC-learning problems were always related to distributed trunking, and my workaround was to move away from DT.
I recommend using VSF if possible (it requires v3 modules and a zl2 switch, among other things; see page 3 of https://higherlogicdownload.s3.amazonaws.com/HPE/MigratedAttachments/E8DDA7C0-AFED-4DF4-B5C7-FD71B705C690-2-AOS-Switch_VSF_Configuration_Guide.pdf).
I would treat your old 5406 (J8697A) switches as separate units (remove DT), since they do not support VSF. STP would have to do the work there.
Even if this solution is not really practical, I can tell you that your configuration is not wrong.
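For anyone who wants to double-check peer symmetry on their own pair, here is a rough Python sketch that compares the trunk lines of two running configs. It assumes that every DT trunk (dt-lacp or dt-trunk) should appear under the same trunk name and type on both peers, while ordinary local trunks may differ. That rule, the function names and the sample strings are assumptions for illustration, not an HPE tool. The trunk lines themselves are copied from the A1 and A2 configs in the question.

```python
import re

# Matches config lines such as "trunk D22 trk10 dt-lacp".
TRUNK_LINE = re.compile(r"^\s*trunk\s+(\S+)\s+(\S+)\s+(\S+)\s*$", re.IGNORECASE)


def trunk_map(config_text):
    """Return {trunk_name: (ports, type)} from 'trunk <ports> <name> <type>' lines."""
    trunks = {}
    for line in config_text.splitlines():
        match = TRUNK_LINE.match(line)
        if match:
            ports, name, kind = match.groups()
            trunks[name.lower()] = (ports, kind.lower())
    return trunks


def dt_mismatches(config_a, config_b):
    """List (name, type_on_a, type_on_b) for DT trunks that differ between peers."""
    a, b = trunk_map(config_a), trunk_map(config_b)
    problems = []
    for name in sorted(set(a) | set(b)):
        kind_a = a[name][1] if name in a else None
        kind_b = b[name][1] if name in b else None
        is_dt = any(k and k.startswith("dt-") for k in (kind_a, kind_b))
        if is_dt and kind_a != kind_b:  # local (non-DT) trunks may differ
            problems.append((name, kind_a, kind_b))
    return problems


# Trunk lines copied from the A1 and A2 configurations in the question.
A1_TRUNKS = """\
trunk B24,F8 trk1 lacp
trunk D1 trk2 dt-trunk
trunk A21-A22 trk3 dt-trunk
trunk B1-B2 trk4 dt-trunk
trunk D3 trk5 dt-lacp
trunk D22 trk10 dt-lacp
trunk B19,D11 trk11 lacp
trunk D23-D24 trk21 lacp
trunk B20,D12 trk144 lacp
"""
A2_TRUNKS = A1_TRUNKS.replace("trk11 ", "trk12 ").replace("trk21 ", "trk22 ")

if __name__ == "__main__":
    print(dt_mismatches(A1_TRUNKS, A2_TRUNKS))
```

For the A1/A2 trunk lines posted above this reports no DT mismatches, which fits the view that the configuration itself is not the culprit. It checks trunk definitions only, not VLAN membership or spanning-tree settings.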