Juniper SRX Issue – No Traffic After 17 Minutes

This is really a strange problem.

I am trying to install a Juniper SRX 220H as the gateway to replace my old Cisco router in my testing network environment. The simplified topology is listed below:

    ISP ----- ONT ----- SRX ----- Other devices (Routers, switches, client computers...)

The link from the ISP to the SRX is an 802.1Q trunk carrying two VLANs: VLAN 35 for Internet access (IP address assigned by DHCP, with a 20-minute lease) and VLAN 34 for IPTV (not used here).

The SRX obtains an IP address from the ISP at first. After 12 to 17 minutes (after the first DHCP lease renewal and before the second), the SRX loses Internet access and cannot even ping the gateway. There is nothing unusual, not even a notice, in the system log or system status. "show interface" reports that everything is fine, but no traffic at all passes through ge-0/0/0. If I unplug the cable or reboot the SRX, it works for another 12 to 17 minutes and then all traffic stops again.
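For reference, here is a rough sketch of where the default DHCP timers fall for a 20-minute lease. It assumes the ISP's server does not override the renewal (T1) and rebinding (T2) times, in which case RFC 2131 defaults them to 50% and 87.5% of the lease. The function name `dhcp_timers` is only for illustration.

```python
def dhcp_timers(lease_seconds):
    """Return (T1, T2) in seconds using the RFC 2131 default ratios.

    Assumption: the server does not send explicit T1/T2 values
    (options 58 and 59), so the client falls back to the defaults.
    """
    t1 = lease_seconds * 0.5     # client starts unicast renewal (RENEWING)
    t2 = lease_seconds * 0.875   # client falls back to broadcast (REBINDING)
    return t1, t2


lease = 20 * 60  # 20-minute lease handed out on VLAN 35
t1, t2 = dhcp_timers(lease)
print(f"T1 = {t1 / 60:.1f} min, T2 = {t2 / 60:.1f} min")
# T1 = 10.0 min, T2 = 17.5 min
```

Under that assumption, the 12 to 17 minute failure window falls between the default renewal and rebinding timers.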

Before I installed the SRX, the old Cisco router with the equivalent configuration worked without any problems.

Any clues?

Partial configuration of SRX is listed below:

interfaces {
    ge-0/0/0 {
        unit 0 {
            family ethernet-switching {
                port-mode trunk;
                vlan {
                    members vlan-internet;
                }
            }
        }
    }
    ge-0/0/1 {
        unit 0 {
            family ethernet-switching {
                vlan {
                    members vlan-trust;
                }
            }
        }
    }
    vlan {
        mac xx:xx:xx:xx:xx:xx;
        unit 0 {
            family inet {
                address 192.168.99.254/24;
            }
        }
        unit 35 {
            family inet {
                dhcp;
            }
        }
    }
}

vlans {
    vlan-internet {
        vlan-id 35;
        l3-interface vlan.35;
    }
    vlan-trust {
        vlan-id 3;
        l3-interface vlan.0;
    }
}

Corresponding Cisco configuration:

interface GigabitEthernet0/0
 description WAN
 mac-address xxxx.xxxx.xxxx
 no ip address
 duplex auto
 speed auto
 media-type rj45
 no negotiation auto
!
interface GigabitEthernet0/0.35
 description FibreOP-Internet
 encapsulation dot1Q 35
 ip address dhcp
 ip nat outside
 ip virtual-reassembly in
!

EDIT:

I replaced the cable between the ISP and ge-0/0/0 on the SRX. No improvement.

EDIT2:

I configured a spare Cisco switch to simulate my ISP environment, with a VLAN trunk and the same DHCP lease time. Then I connected ge-0/0/0 of the SRX to that switch, keeping the SRX configuration unchanged. In this experiment the SRX behaved normally, which really confuses me.

EDIT3:

Output requested by @ryanklein

root@Firewall> show dhcp client statistics                   
warning: dhcp-service subsystem not running - not needed by configuration.

root@Firewall> show dhcp client binding 
warning: dhcp-service subsystem not running - not needed by configuration.

EDIT4:

Output requested by @ryanklein

root@Firewall> show system services dhcp statistics 
Packets dropped:
    Total                      0

Messages received:
    BOOTREQUEST                0
    DHCPDECLINE                0
    DHCPDISCOVER               0
    DHCPINFORM                 0
    DHCPRELEASE                0
    DHCPREQUEST                0

Messages sent:
    BOOTREPLY                  0
    DHCPOFFER                  0
    DHCPACK                    0
    DHCPNAK                    0

root@Firewall> show system services dhcp client 

 Logical Interface name         vlan.35
        Hardware address        xx:xx:xx:xx:xx:xx
        Client status           bound
        Address obtained        142.xxx.xxx.xxx
        Update server           disabled
        Lease obtained at       2015-01-18 03:35:47 NST
        Lease expires at        2015-01-18 03:55:47 NST

DHCP options:
    Code: 1, Type: ip-address, Value: 255.255.252.0
    Name: server-identifier, Value: 142.yyy.yyy.yyy
    Name: router, Value: [ 142.xxx.xxx.1 ]
    Name: name-server, Value: [ 47.55.55.55, 142.166.166.166 ]

root@Firewall> 

EDIT5:

I captured the traffic between the SRX and the ISP and found something that may be useful.

The topology diagram has been updated (added the missing ONT device between the SRX and the ISP).

  • When the Internet connection is down, layer 2 communication between the ONT and the SRX is still active. It seems that the ONT keeps sending ARP requests for the IP address my ISP assigned to the SRX, and I can still see those ARP requests and responses after the connection is lost. I think this indicates that the ONT is not the root of the problem.

  • When the Internet connection is down, DHCP request packets get no response, just like other traffic. I tried to renew the IP address on the SRX after the connection was lost, but it failed. The capture shows no response from the remote side.

  • DHCP renewal succeeds while the Internet connection is working. When I issue "request system service dhcp client renew vlan.35", I can see the DHCP request and the corresponding DHCP ACK.

  • (INCORRECT. See next item) When the Internet connection is down, releasing the current DHCP lease and requesting a new one restores connectivity. I released the current DHCP lease and renewed it; the SRX then got a new IP address and the Internet connection came back. The capture shows that although a lone DHCP request packet gets no response (see above), a DHCP release packet results in a DHCP NAK response. After that, a DHCP discover is sent and gets a correct DHCP offer, and the Internet connection is back. However, when I tried to reproduce this, it did not work: neither the DHCP release nor the DHCP discover got a response. I had issued the release and renew commands right after a renew command, so I am not sure whether the lack of response was caused by sending those packets too quickly.

  • After the Internet connection is lost, sending a DHCP discover and proceeding as if it were a first-time request restores the connection. The capture shows that when the connection is down, a DHCP request packet results in a DHCP NAK response. Combined with the previous item, both a DHCP request and a DHCP release result in a DHCP NAK. However, proceeding as if the SRX were requesting an IP address from the DHCP server for the first time (DHCP discover, then DHCP request) gets a positive result and restores Internet access. UPDATED: It seems the NAK is not always sent. Sometimes a DHCP request/release is answered with a DHCP NAK, sometimes with silence.

Best Answer

I finally solved this problem. By capturing and comparing the DHCP Discover packets sent by JunOS and Cisco, I found that Cisco sends Option 61 (Client Identifier) and Option 12 (Host Name) by default, whereas JunOS does not send them unless explicitly configured to.

I think my ISP has a filter or something similar on their side that makes these two options mandatory. Once I configured my SRX to send them, everything worked.
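The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tooling actually used: it assumes the DHCP options field (the bytes after the magic cookie) has already been extracted from each captured Discover, and the two payloads below are made-up placeholders rather than the real captures. The function names `parse_dhcp_options` and `missing_options` are hypothetical.

```python
def parse_dhcp_options(options: bytes) -> dict:
    """Parse a raw DHCP options field into {option_code: value_bytes}."""
    parsed = {}
    i = 0
    while i < len(options):
        code = options[i]
        if code == 255:          # End option: stop parsing
            break
        if code == 0:            # Pad option: single byte, no length field
            i += 1
            continue
        if i + 1 >= len(options):
            break                # truncated option header
        length = options[i + 1]
        parsed[code] = options[i + 2 : i + 2 + length]
        i += 2 + length
    return parsed


def missing_options(reference: bytes, candidate: bytes) -> list:
    """Option codes present in `reference` but absent from `candidate`."""
    ref = parse_dhcp_options(reference)
    cand = parse_dhcp_options(candidate)
    return sorted(code for code in ref if code not in cand)


# Made-up example payloads, NOT the real captures from this thread.
# Option 53 = DHCP Message Type (1 = DISCOVER)
# Option 61 = Client Identifier (type 1 = Ethernet, then a placeholder MAC)
# Option 12 = Host Name
cisco_like = (
    bytes([53, 1, 1])
    + bytes([61, 7, 1, 0x00, 0x11, 0x22, 0x33, 0x44, 0x55])
    + bytes([12, 6]) + b"Router"
    + bytes([255])
)
junos_like = bytes([53, 1, 1]) + bytes([255])

print(missing_options(cisco_like, junos_like))  # [12, 61]
```

Running the same diff on real captures lists the option codes that one client sends and the other does not, which is how a difference like the missing Client Identifier and Host Name shows up.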