Juniper JUNOS – Fix Security Policy Blocking Permitted Traffic

Tags: firewall, juniper-junos, juniper-srx, routing

I am running a Juniper SRX device.

Within a routing-instance, I removed two /25 static routes and replaced them with a single /24 static route. The /24 starts at the same IP address; it is just a larger subnet. Note that 172.18.10.10 is a blackhole; 10.10.201.128/25 is NOT being used yet.

Config for routing-instance A:

[omitted]
routing-options {
    static {
        inactive: route 10.212.17.0/24 next-hop 10.212.210.17;
        route 10.10.200.0/25 next-hop 172.18.10.2;
        route 10.10.201.0/25 next-hop 172.18.10.6;
        route 10.10.201.128/25 next-hop 172.18.10.10;
    }
[omitted]

New config for routing-instance A:

[omitted]
routing-options {
    static {
        inactive: route 10.212.17.0/24 next-hop 10.212.210.17;
        route 10.10.200.0/25 next-hop 172.18.10.2;
        route 10.10.201.0/24 next-hop 172.18.10.6;
    }
[omitted]

After making this change, I can no longer route from some places (but not all) to 10.10.201.0/25 hosts. At first I thought this might be a summarization issue, but then I realized that two other networks, both within the same routing instance (call it routing instance B) and on the same router (not reached through eBGP connections from other sites), behave differently from each other. Specifically, 10.212.200.0/24 cannot reach 10.10.201.0/25, but 10.212.203.0/24 can. Again, both 10.212.200.0/24 and 10.212.203.0/24 come from the same routing instance on the same physical router.

So…I checked the security flow sessions to look at the traffic, and found that traffic from 10.10.201.0/25 was being routed to the appropriate hosts; however, some hosts were unable to respond with an echo reply:

Session ID: 6726, Policy name: HQ-VPN-DMZ_To_Trust/30, Timeout: 2, Valid
  In: 10.10.201.3/152 --> 10.212.200.1/63851;icmp, If: ge-0/0/7.0, Pkts: 1, Bytes: 84
  Out: 10.212.200.1/63851 --> 10.10.201.3/152;icmp, If: ge-0/0/4.0, Pkts: 0, Bytes: 0

So…I checked the policy hit-count, and saw that the hit-count was increasing. The policy is listed below:

policy HQ-VPN-DMZ_To_Trust {
    match {
        source-address [ *omitted* 10.212.210.18/32 10.10.201.0/25 ];
        destination-address any;
        application any;
    }
    then {
        permit;
    }
}

Since the source IP is 10.10.201.3, the policy hit-counter should NOT have been increasing.

My question is: How is it possible that the policy was blocking that return echo response from 10.212.200.1 -> 10.10.201.3, when it is clearly permitted?

Also:

I should note that when I added the specific static route…

[omitted]
routing-options {
    static {
        inactive: route 10.212.17.0/24 next-hop 10.212.210.17;
        route 10.10.200.0/25 next-hop 172.18.10.2;
        route 10.10.201.0/24 next-hop 172.18.10.6;
        route 10.10.201.0/25 next-hop 172.18.10.6;
    }
[omitted]

…back into the routing-instance config, everything started working again.

Since we are on this topic, how is it possible that the 10.10.201.0/25 static route works, but the 10.10.201.0/24 static route does not? 10.10.201.0/25 is contained within 10.10.201.0/24!!!

The show route output differences are shown below (one with the /25 in, and one with the /24 in):

With /24:

x@x> show route 10.10.201.0        

inet.0: 919 destinations, 1825 routes (919 active, 0 holddown, 1 hidden)
+ = Active Route, - = Last Active, * = Both

0.0.0.0/0          *[BGP/80] 18w6d 22:42:17, localpref 100
                      AS path: YYYY AAAA I
                    > to 104.x.x.x via ge-0/0/0.0
                    [BGP/80] 7w4d 10:38:07, localpref 50, from 10.99.1.23
                      AS path: YYYY BBBB I
                    > to 10.99.1.2 via ge-0/0/15.0
                    [OSPF/150] 14w5d 02:01:25, metric 1, tag 0
                    > to 10.99.1.2 via ge-0/0/15.0

DMZ.inet.0: 101 destinations, 101 routes (101 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

10.10.201.0/24     *[Static/5] 00:49:33
                    > to 172.18.10.6 via ge-0/0/7.0

tunnel.inet.0: 90 destinations, 113 routes (73 active, 0 holddown, 25 hidden)
+ = Active Route, - = Last Active, * = Both

0.0.0.0/0          *[BGP/80] 18w6d 22:42:17, localpref 100
                      AS path: YYYY AAAA I
                    > to 104.x.x.x via ge-0/0/0.0

lan-vr.inet.0: 58 destinations, 83 routes (58 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

0.0.0.0/0          *[BGP/80] 18w6d 22:42:17, localpref 100
                      AS path: YYYY AAAA I
                    > to 104.x.x.x via ge-0/0/0.0
                    [OSPF/150] 7w4d 10:38:06, metric 0, tag 0
                    > to 10.212.6.20 via ge-0/0/1.0

With /25:

x@x# run show route 10.10.201.0 

inet.0: 921 destinations, 1829 routes (921 active, 0 holddown, 1 hidden)
+ = Active Route, - = Last Active, * = Both

10.10.201.0/25     *[Static/5] 01:04:04
                    > to 172.18.10.6 via ge-0/0/7.0
                    [OSPF/150] 01:04:02, metric 1, tag 0
                    > to 10.99.1.2 via ge-0/0/15.0

DMZ.inet.0: 102 destinations, 102 routes (102 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

10.10.201.0/25     *[Static/5] 01:04:06
                    > to 172.18.10.6 via ge-0/0/7.0

tunnel.inet.0: 91 destinations, 114 routes (74 active, 0 holddown, 25 hidden)
+ = Active Route, - = Last Active, * = Both

10.10.201.0/25     *[Static/5] 01:04:04
                    > to 172.18.10.6 via ge-0/0/7.0

lan-vr.inet.0: 59 destinations, 84 routes (59 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

10.10.201.0/25     *[Static/5] 01:04:04
                    > to 172.18.10.6 via ge-0/0/7.0

Best Answer

The two routes you are summarising have different next-hop addresses.

e.g.:

route 10.10.201.0/25 next-hop 172.18.10.6;
route 10.10.201.128/25 next-hop 172.18.10.10;

After your change, the entire /24 points to 172.18.10.6.
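To make the next-hop change concrete, here is a minimal Python sketch of longest-prefix matching over the static routes from the question. It is only a model of the lookup, not how Junos implements it; the `lookup` helper and the test host 10.10.201.200 are made up for illustration.

```python
import ipaddress

def lookup(table, dest):
    """Return (prefix, next_hop) for the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, nh) for net, nh in table.items() if addr in net]
    if not matches:
        return None
    # The most specific route (largest prefix length) wins.
    return max(matches, key=lambda item: item[0].prefixlen)

# Active static routes from the question, before and after the change.
before = {
    ipaddress.ip_network("10.10.200.0/25"): "172.18.10.2",
    ipaddress.ip_network("10.10.201.0/25"): "172.18.10.6",
    ipaddress.ip_network("10.10.201.128/25"): "172.18.10.10",
}
after = {
    ipaddress.ip_network("10.10.200.0/25"): "172.18.10.2",
    ipaddress.ip_network("10.10.201.0/24"): "172.18.10.6",
}

# 10.10.201.200 is a hypothetical host in the upper /25.
for host in ("10.10.201.3", "10.10.201.200"):
    print(host, "before:", lookup(before, host)[1], "after:", lookup(after, host)[1])
```

Hosts in the lower /25 keep the same next-hop, while anything in 10.10.201.128/25 moves from the blackhole next-hop 172.18.10.10 to 172.18.10.6.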

The security policy is not blocking the traffic - the fact that you see an entry in the flow table shows that it is being permitted.

What you are seeing in the session table is that the return traffic is not coming back for whatever reason.
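If you have many sessions to check, a small script can flag the ones where one direction has seen no packets. This is a rough Python sketch that assumes the session output looks exactly like the lines pasted in the question; the `silent_wings` helper and the regular expression are my own and may need adjusting for other Junos versions or output formats.

```python
import re

# Matches one "In:" or "Out:" wing line in the format pasted in the question.
WING = re.compile(
    r"^\s*(In|Out):\s+(\S+?)/(\d+)\s+-->\s+(\S+?)/(\d+);(\w+),"
    r"\s+If:\s+(\S+),\s+Pkts:\s+(\d+),\s+Bytes:\s+(\d+)"
)

def silent_wings(session_text):
    """Return (direction, source, destination, interface) for wings with 0 packets."""
    wings = []
    for line in session_text.splitlines():
        m = WING.match(line)
        if m and int(m.group(8)) == 0:
            wings.append((m.group(1), m.group(2), m.group(4), m.group(7)))
    return wings

sample = """Session ID: 6726, Policy name: HQ-VPN-DMZ_To_Trust/30, Timeout: 2, Valid
  In: 10.10.201.3/152 --> 10.212.200.1/63851;icmp, If: ge-0/0/7.0, Pkts: 1, Bytes: 84
  Out: 10.212.200.1/63851 --> 10.10.201.3/152;icmp, If: ge-0/0/4.0, Pkts: 0, Bytes: 0
"""

print(silent_wings(sample))
```

For the session in the question this reports only the `Out` wing on ge-0/0/4.0: the echo request was permitted and forwarded, but no reply was counted on the return side.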

Without knowing your topology it's a bit hard to know why, but I would suggest that 10.212.200.1 has a route back to 10.10.201.3 via the router with 172.18.10.10 as an interface. After you make your change, it is possible that you create asymmetric routing (the return traffic comes back on a different interface), and the SRX just drops it.

Actually, looking at your routing output, I think the problem is that you are leaking routes between routing-instances (most likely via a policy that specifically matches on the /25), and when you put the /24 in, the leaked /25 entry disappears from all your other routing-instances. This may explain the other issues you have when you apply this route.
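Here is a rough Python model of that theory. It assumes a leak rule that matches 10.10.201.0/25 exactly, which is a guess since the actual policy has not been posted; `leak_exact`, `resolve`, and the table names are invented for illustration. The default route label is copied from the `show route` output above.

```python
import ipaddress

# Hypothetical leak rule: the real policy is not shown in the question.
LEAK_PREFIX = ipaddress.ip_network("10.10.201.0/25")

def leak_exact(route):
    """Mimic an exact-match filter: network and prefix length must both match."""
    return route == LEAK_PREFIX

def resolve(dest, dmz_routes, local_routes):
    """Longest-prefix match over local routes plus routes leaked from DMZ."""
    addr = ipaddress.ip_address(dest)
    table = dict(local_routes)
    table.update({net: nh for net, nh in dmz_routes.items() if leak_exact(net)})
    candidates = [net for net in table if addr in net]
    best = max(candidates, key=lambda net: net.prefixlen)
    return best, table[best]

# lan-vr only has its default route unless something is leaked in.
lan_vr_local = {ipaddress.ip_network("0.0.0.0/0"): "104.x.x.x via ge-0/0/0.0"}

dmz_24_only = {ipaddress.ip_network("10.10.201.0/24"): "172.18.10.6"}
dmz_with_25 = {
    ipaddress.ip_network("10.10.201.0/24"): "172.18.10.6",
    ipaddress.ip_network("10.10.201.0/25"): "172.18.10.6",
}

print("with /24 only:", resolve("10.10.201.3", dmz_24_only, lan_vr_local))
print("with /25 back:", resolve("10.10.201.3", dmz_with_25, lan_vr_local))
```

With only the /24 in the DMZ table, nothing matches the leak rule and lan-vr falls back to its default route, which is consistent with the "With /24" output where only DMZ.inet.0 has a specific entry. Once the /25 is present again, it is leaked and selected. If the real policy does turn out to match exactly on the /25, widening the match to cover the /24 would be the thing to try.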

Can you please attach the output of show configuration routing-options and show configuration policy-options? This should confirm the presence of route-leaking.