SRX Chassis Cluster – Connecting to Redundant Upstream WAN Links


I've been tasked with migrating a pair of Juniper SSG 140s in an active/passive cluster to an SRX 340 cluster. I'm very familiar with ScreenOS, but am still learning Junos.

Currently, we're connected to our upstream provider through a single uplink via a switch, but will be switching to a dual-uplink setup with the SRX:

Current:
                |
 Provider R1 ---|---- switch ---(vlan)- ssg1
                |        ^                 XX
                         ^------(vlan)- ssg2

Future:
                |
 Provider R1 ---|---- srx1 (ge-0/0/7)
      XX        |      XX
 Provider R2 ---|---- srx2 (ge-5/0/7)
                |

The two SRX devices are in a chassis cluster, and I want to connect one to each uplink. However, I'm wary of accidentally creating a switching loop and taking our connection offline.

My provider has told me that they're running HSRP, and gave me three IPs: R1, R2, and the HSRP IP. From what I've read, it seems like only one uplink should be active at a time.

I want to try something like this:

set interfaces ge-0/0/7 gigether-options redundant-parent reth2
set interfaces ge-5/0/7 gigether-options redundant-parent reth2
set interfaces reth2 unit 0 family inet address 1.2.3.4/26

  • If I put two interfaces (one on each SRX) in a reth, is that sufficient to ensure that there won't be any loops?
  • Is this the correct/optimal way to ensure redundancy with the two upstream links?
  • Is there any benefit to forcing the active SRX chassis to match the active upstream router? If so, how could this be achieved? (interface-monitor doesn't seem like it would work since both interfaces would be up under normal operation)

EDIT: The provider sent me some more information on the configuration from their end. It seems like their routers are dst0009 and dst0010:

dst0009#sh run int vlan586
Building configuration...

Current configuration : 365 bytes
!
interface Vlan586
 description <snip>
 ip address x.y.61.2 255.255.255.240
 no ip unreachables
 no ip proxy-arp
 ip flow ingress
 no ip mroute-cache
 load-interval 30
 ntp disable
 arp timeout 180
 standby 1 ip x.y.61.1
 standby 1 preempt
 standby 1 mac-address 0000.0c00.0586
 standby 1 track Vlan903 20
 standby 1 track Vlan924 20
end

dst0009#sh standby Vlan586
Vlan586 - Group 1
  State is Active
    1 state change, last state change 50w4d
  Virtual IP address is x.y.61.1
  Active virtual MAC address is 0000.0c00.0586
    Local virtual MAC address is 0000.0c00.0586 (cfgd)
  Hello time 3 sec, hold time 10 sec
    Next hello sent in 0.192 secs
  Preemption enabled
  Active router is local
  Standby router is x.y.61.3, priority 60 (expires in 8.848 sec)
  Priority 100 (default 100)
    Track interface Vlan903 state Up decrement 20
    Track interface Vlan924 state Up decrement 20
  Group name is "hsrp-Vl586-1" (default)

dst0009#sho vlan id 586

VLAN Name                             Status    Ports
---- -------------------------------- --------- -------------------------------
586  <snip>                           active    Po1, Fa9/89

=========================================================
dst0010#sh run int vlan586
Building configuration...

Current configuration : 365 bytes
!
interface Vlan586
 description <snip>
 ip address x.y.61.3 255.255.255.240
 no ip unreachables
 no ip proxy-arp
 ip flow ingress
 no ip mroute-cache
 load-interval 30
 ntp disable
 arp timeout 180
 standby 1 ip x.y.61.1
 standby 1 preempt
 standby 1 mac-address 0000.0c00.0586
 standby 1 track Vlan903 20
 standby 1 track Vlan924 20
end

dst0010#sh standby Vlan586
Vlan586 - Group 1
  State is Standby
    1 state change, last state change 50w4d
  Virtual IP address is x.y.61.1
  Active virtual MAC address is 0000.0c00.0586
    Local virtual MAC address is 0000.0c00.0586 (cfgd)
  Hello time 3 sec, hold time 10 sec
    Next hello sent in 0.544 secs
  Preemption enabled
  Active router is x.y.61.2, priority 100 (expires in 9.888 sec)
  Standby router is local
  Priority 60 (default 100)
    Track interface Vlan903 state Down decrement 20
    Track interface Vlan924 state Down decrement 20
  Group name is "hsrp-Vl586-1" (default)

Best Answer

If I put two interfaces (one on each SRX) in a reth, is that sufficient to ensure that there won't be any loops?

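Yes*. A reth is essentially a single logical L3 interface, so it will not loop traffic between node0 and node1 of the SRX chassis cluster. Only the child link on the node that is primary for the reth's redundancy group forwards traffic; the other child stays idle until a failover.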
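
For illustration, the reth definition from the question could be fleshed out roughly as follows. This is a minimal sketch, not a tested configuration: the redundancy-group number, the node priorities and the reth-count value are my assumptions, and 1.2.3.4/26 is the placeholder address from the question. Lines starting with # are annotations, not commands.

```
# Assumption: reth0..reth2 are in use, so at least 3 reth interfaces are needed
set chassis cluster reth-count 3
# Assumption: redundancy group 1, with node0 preferred as primary
set chassis cluster redundancy-group 1 node 0 priority 200
set chassis cluster redundancy-group 1 node 1 priority 100
# One child link per node, both bound to the same reth
set interfaces ge-0/0/7 gigether-options redundant-parent reth2
set interfaces ge-5/0/7 gigether-options redundant-parent reth2
# Tie the reth to the redundancy group; only the primary node's child forwards
set interfaces reth2 redundant-ether-options redundancy-group 1
set interfaces reth2 unit 0 family inet address 1.2.3.4/26
```

The reth would also need to be placed in a security zone before it passes any traffic.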

Is this the correct/optimal way to ensure redundancy with the two upstream links?

No. Ask your provider for two independent L3 links.

*The problem with the topology as drawn is that your provider appears to be relying on you to have a switch to plug their two links into, so that their HSRP hellos can pass between their two edge devices.

If you install an L3 device between them (e.g. the SRX cluster), I suspect that HSRP fail-over will never work: neither router will hear the other's hellos, so both will probably go active.

If your provider can't or won't provide this, I'd try another provider. If you have L8 issues preventing that, then your best bet would be a separate L2 switch for each provider link, a dedicated VLAN for the provider network trunked between the two switches, a link from node0 into Switch A and a link from node1 into Switch B, e.g.:

Provider R1 -----|------Switch A-------SRX:node0-----
                 |         |      reth0   ||    reth1
Provider R2 -----|------Switch B-------SRX:node1-----

I would also implement LACP on the uplinks between the SRX nodes and Switch A / Switch B, so that the SRX knows the status of the link between itself and each switch at all times. Don't worry that the two switches are independent: just enable a one-member LACP bundle from each switch to the SRX, and then use interface-monitor reth0 under the redundancy-group stanza.
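
On the SRX side, the LACP and monitoring pieces might look roughly like this. Treat it as a sketch under stated assumptions rather than a verified configuration: the port names are borrowed from the question, redundancy group 1 is assumed, and the monitor statements name the member ports rather than reth0 itself, which is my guess at how the monitoring would be written. The matching one-member LACP bundle on each switch is not shown. Lines starting with # are annotations, not commands.

```
# Assumption: reth0 faces the provider VLAN and belongs to redundancy group 1
set interfaces ge-0/0/7 gigether-options redundant-parent reth0
set interfaces ge-5/0/7 gigether-options redundant-parent reth0
set interfaces reth0 redundant-ether-options redundancy-group 1
# Run LACP on the reth; each node negotiates with its own switch
set interfaces reth0 redundant-ether-options lacp active
set interfaces reth0 redundant-ether-options lacp periodic fast
# Assumption: monitor the member ports; weight 255 means one loss fails the group over
set chassis cluster redundancy-group 1 interface-monitor ge-0/0/7 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-5/0/7 weight 255
```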

An example of this here:

http://blog.eighthlayer.io/failure-is-optional/

(Full Disclosure: my blog)

Is there any benefit to forcing the active SRX chassis to match the active upstream router? If so, how could this be achieved? (interface-monitor doesn't seem like it would work since both interfaces would be up under normal operation)

Because of the L2/HSRP heartbeat issue described above, this will not work. You are also correct that interface-monitor only tracks L1 (link) state.

If HSRP fails over upstream for any reason other than link/node failure, the SRX will continue to forward out node0 and all your traffic will be dropped.

Continuing on from above: if I were ordering redundant links from a provider, I would want them to be redundant at L1, L2 and L3, which the links described above are not.

What you really want are non-reth interfaces facing two independent L3 circuits, one attached to each node - then you can use BGP to handle provider fail-over, and leave the chassis-cluster fail-over to handle device-level failures, independent of the upstream provider.
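
As a rough sketch of that design, assuming one routed /30 per circuit and an eBGP session on each: every address, AS number and the zone name below is a placeholder I've made up for illustration (documentation ranges and a private AS), not something from the question. Lines starting with # are annotations, not commands.

```
# Assumption: one routed /30 per circuit, using documentation address ranges
set interfaces ge-0/0/7 unit 0 family inet address 192.0.2.2/30
set interfaces ge-5/0/7 unit 0 family inet address 198.51.100.2/30
# Assumption: private AS 64512 locally, provider AS 64496
set routing-options autonomous-system 64512
set protocols bgp group upstream type external
set protocols bgp group upstream peer-as 64496
set protocols bgp group upstream neighbor 192.0.2.1
set protocols bgp group upstream neighbor 198.51.100.1
# Assumption: both ports sit in a zone named untrust; allow BGP to reach the SRX
set security zones security-zone untrust interfaces ge-0/0/7.0 host-inbound-traffic protocols bgp
set security zones security-zone untrust interfaces ge-5/0/7.0 host-inbound-traffic protocols bgp
```

Here neither port is a reth child, so losing one circuit only withdraws that neighbor's routes, while a node failure is still handled by the chassis cluster.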