IPsec VPN with racoon drops traffic on Phase 1 renegotiation

ipsec racoon

We are running racoon on Linux, connecting to a Checkpoint firewall. The connection comes up fine, but we see an interruption to traffic every 24 hours, corresponding to Phase 1 renegotiation.

Our setup is as follows:

Local side

racoon from ipsec-tools 0.8.0 installed from RPM on Amazon Linux.

Local IP: 10.130.0.253
Local subnet: 10.130.0.252/30

This is running inside an AWS VPC on a private subnet, so we have NAT traversal enabled. We bind the VPN to a sub-interface and use iptables both to translate the VPN connection to the primary address and to NAT traffic destined for the remote LAN to the sub-interface address. This allows the host to act as a VPN gateway for other hosts, which works well. Iptables rules:

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
SNAT       all  --  10.130.0.253         2.2.2.2       to:10.100.200.112
SNAT       all  --  0.0.0.0/0            10.128.80.0/24       to:10.130.0.253
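
To illustrate why the second rule lets other hosts use the tunnel, here is a minimal Python sketch. It models only that one SNAT rule and checks whether a source address falls inside the local subnet 10.130.0.252/30. The client address 10.100.200.50 and the destination 10.128.80.10 are hypothetical, chosen purely for illustration, and `source_after_snat` is a made-up helper, not part of iptables.

```python
import ipaddress

LOCAL_SUBNET = ipaddress.ip_network("10.130.0.252/30")   # local subnet from the post
REMOTE_SUBNET = ipaddress.ip_network("10.128.80.0/24")   # destination matched by the second rule
SNAT_SOURCE = ipaddress.ip_address("10.130.0.253")       # "to:" address of the second rule


def source_after_snat(src, dst):
    """Model only the second POSTROUTING rule.

    Anything headed for the remote subnet leaves with SNAT_SOURCE as its
    source address; everything else keeps its original source.
    """
    if ipaddress.ip_address(dst) in REMOTE_SUBNET:
        return SNAT_SOURCE
    return ipaddress.ip_address(src)


client = "10.100.200.50"   # hypothetical VPC host, not from the post
target = "10.128.80.10"    # hypothetical host on the remote LAN

translated = source_after_snat(client, target)

print(ipaddress.ip_address(client) in LOCAL_SUBNET)   # False: untranslated source is outside the /30
print(translated in LOCAL_SUBNET)                     # True: translated source is inside the /30
print([str(a) for a in LOCAL_SUBNET])                 # the four addresses covered by the /30
```

Under these assumptions, only the translated source address sits inside the /30, which is why traffic from other hosts has to be rewritten before it can be carried over the VPN.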

Remote side:

VPN Gateway IP: 2.2.2.2 (anonymised)
Remote subnet: 10.128.80.0/24

Our local configuration is as follows:

/etc/racoon/racoon.conf:

# Racoon IKE daemon configuration file.
# See 'man racoon.conf' for a description of the format and entries.

log debug2;

path include "/etc/racoon";
path pre_shared_key "/etc/racoon/psk.txt";
path certificate "/etc/racoon/certs";
path script "/etc/racoon/scripts";


# Listen on sub-interface - initial connection to establish tunnel is translated to primary IP by iptables
listen {
  isakmp 10.130.0.253 [500];
  isakmp_natt 10.130.0.253 [4500];
}

timer {
 natt_keepalive 1 minute ;
}


# CP VPN-1
remote 2.2.2.2
{
  exchange_mode main;
  lifetime time 24 hour;

  nat_traversal on;

  dpd_delay 20;


  proposal {
    encryption_algorithm 3des;
    hash_algorithm sha1;
    authentication_method pre_shared_key;
    dh_group 2;
  }
}

# net-to-net
sainfo address 10.130.0.252/30 any address 10.128.80.0/24 any
{
  pfs_group 2;
  lifetime time 1 hour;
  encryption_algorithm 3des;
  authentication_algorithm hmac_sha1;
  compression_algorithm deflate;
}

# gateway to gateway
sainfo address 10.130.0.253/32 any address 2.2.2.2/32 any
{
  lifetime time 1 hour;
  encryption_algorithm 3des;
  authentication_algorithm hmac_sha1;
  compression_algorithm deflate;
}

/etc/racoon/setkey.sh

#!/sbin/setkey -f

# First of all flush the SPD database
flush;
spdflush;

# Gateway to Gateway
spdadd 10.130.0.253 2.2.2.2 any -P out ipsec esp/tunnel/10.130.0.253-2.2.2.2/unique;
spdadd 2.2.2.2 10.130.0.253 any -P in  ipsec esp/tunnel/2.2.2.2-10.130.0.253/unique;


# Linux-racoon -> CP VPN-1
spdadd 10.130.0.252/30 10.128.80.0/24 any -P out ipsec esp/tunnel/10.130.0.253-2.2.2.2/unique;

# CP VPN-1 -> Linux-racoon
spdadd 10.128.80.0/24 10.130.0.252/30 any -P in  ipsec esp/tunnel/2.2.2.2-10.130.0.253/unique;

We've been getting some alerts on our monitoring of the VPN recently so I set up a more detailed monitoring script which connects to the remote server every minute. It seems that we get significant downtime every 24 hours. My script shows when the connection goes down and when it comes back:

Fri Jan 18 20:24:33 UTC 2013 Connection went down
Fri Jan 18 20:48:36 UTC 2013 Connection came up

Sat Jan 19 20:48:36 UTC 2013 Connection went down
Sat Jan 19 21:00:40 UTC 2013 Connection came up

Sun Jan 20 21:00:38 UTC 2013 Connection went down
Sun Jan 20 21:12:43 UTC 2013 Connection came up

As you can see, each outage begins 24 hours (to within a couple of seconds) after the connection last came up.
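
For anyone who wants to check the arithmetic, here is a small Python sketch that computes the outage lengths and the time between outages from the monitoring output above. The function names are made up for this illustration, and this is not the monitoring script itself.

```python
from datetime import datetime

# Events copied from the monitoring output above.
MONITOR_LOG = """\
Fri Jan 18 20:24:33 UTC 2013 Connection went down
Fri Jan 18 20:48:36 UTC 2013 Connection came up
Sat Jan 19 20:48:36 UTC 2013 Connection went down
Sat Jan 19 21:00:40 UTC 2013 Connection came up
Sun Jan 20 21:00:38 UTC 2013 Connection went down
Sun Jan 20 21:12:43 UTC 2013 Connection came up
"""


def parse_events(text):
    """Return a list of (timestamp, state) tuples, state being 'down' or 'up'."""
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        stamp, _, message = line.partition(" Connection ")
        # "UTC" is matched literally, so no time zone conversion takes place.
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S UTC %Y")
        events.append((when, "down" if message.endswith("down") else "up"))
    return events


def spans(events, start, end):
    """Seconds from each `start` event to the next `end` event."""
    result, began = [], None
    for when, state in events:
        if state == start:
            began = when
        elif state == end and began is not None:
            result.append(int((when - began).total_seconds()))
            began = None
    return result


events = parse_events(MONITOR_LOG)
print("outage lengths (s):", spans(events, "down", "up"))        # [1443, 724, 725]
print("uptime before next outage (s):", spans(events, "up", "down"))  # [86400, 86398]
```

The outages last about 24, 12 and 12 minutes, and each one starts 86400 seconds (or 2 seconds less) after the previous recovery.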

These interruptions seem to correspond to Phase 1 renegotiations in the VPN log, which makes sense as the Phase 1 lifetime is 24 hours:

Friday:

Jan 18 20:24:32 ip-10-100-200-112 racoon: INFO: ISAKMP-SA expired 10.130.0.253[500]-2.2.2.2[500] spi:13b2510d0bc467f9:ff649237b81a65b7
Jan 18 20:24:32 ip-10-100-200-112 racoon: INFO: ISAKMP-SA deleted 10.130.0.253[500]-2.2.2.2[500] spi:13b2510d0bc467f9:ff649237b81a65b7
Jan 18 20:36:34 ip-10-100-200-112 racoon: INFO: IPsec-SA expired: ESP/Tunnel 2.2.2.2[500]->10.130.0.253[500] spi=213727991(0xcbd3af7)
Jan 18 20:36:34 ip-10-100-200-112 racoon: INFO: IPsec-SA expired: ESP/Tunnel 10.130.0.253[500]->2.2.2.2[500] spi=3400029604(0xcaa855a4)
Jan 18 20:48:34 ip-10-100-200-112 racoon: INFO: IPsec-SA expired: ESP/Tunnel 2.2.2.2[500]->10.130.0.253[500] spi=213727991(0xcbd3af7)
Jan 18 20:48:34 ip-10-100-200-112 racoon: INFO: IPsec-SA expired: ESP/Tunnel 10.130.0.253[500]->2.2.2.2[500] spi=3400029604(0xcaa855a4)
Jan 18 20:48:34 ip-10-100-200-112 racoon: INFO: IPsec-SA request for 2.2.2.2 queued due to no phase1 found.
Jan 18 20:48:34 ip-10-100-200-112 racoon: INFO: initiate new phase 1 negotiation: 10.130.0.253[500]<=>2.2.2.2[500]
Jan 18 20:48:34 ip-10-100-200-112 racoon: INFO: begin Identity Protection mode.
Jan 18 20:48:34 ip-10-100-200-112 racoon: INFO: ISAKMP-SA established 10.130.0.253[500]-2.2.2.2[500] spi:c4978718cd291fde:01245a461d26cc34
Jan 18 20:48:35 ip-10-100-200-112 racoon: INFO: initiate new phase 2 negotiation: 10.130.0.253[500]<=>2.2.2.2[500]
Jan 18 20:48:35 ip-10-100-200-112 racoon: INFO: IPsec-SA established: ESP/Tunnel 10.130.0.253[500]->2.2.2.2[500] spi=264213233(0xfbf92f1)
Jan 18 20:48:35 ip-10-100-200-112 racoon: INFO: IPsec-SA established: ESP/Tunnel 10.130.0.253[500]->2.2.2.2[500] spi=919162535(0x36c94ea7)

Saturday:

Jan 19 20:48:34 ip-10-100-200-112 racoon: INFO: ISAKMP-SA expired 10.130.0.253[500]-2.2.2.2[500] spi:c4978718cd291fde:01245a461d26cc34
Jan 19 20:48:34 ip-10-100-200-112 racoon: INFO: ISAKMP-SA deleted 10.130.0.253[500]-2.2.2.2[500] spi:c4978718cd291fde:01245a461d26cc34
Jan 19 20:48:36 ip-10-100-200-112 racoon: INFO: IPsec-SA expired: ESP/Tunnel 2.2.2.2[500]->10.130.0.253[500] spi=229822093(0xdb2ce8d)
Jan 19 20:48:36 ip-10-100-200-112 racoon: INFO: IPsec-SA expired: ESP/Tunnel 10.130.0.253[500]->2.2.2.2[500] spi=2536548534(0x9730a8b6)
Jan 19 21:00:36 ip-10-100-200-112 racoon: INFO: IPsec-SA expired: ESP/Tunnel 2.2.2.2[500]->10.130.0.253[500] spi=229822093(0xdb2ce8d)
Jan 19 21:00:36 ip-10-100-200-112 racoon: INFO: IPsec-SA expired: ESP/Tunnel 10.130.0.253[500]->2.2.2.2[500] spi=2536548534(0x9730a8b6)
Jan 19 21:00:37 ip-10-100-200-112 racoon: INFO: IPsec-SA request for 2.2.2.2 queued due to no phase1 found.
Jan 19 21:00:37 ip-10-100-200-112 racoon: INFO: initiate new phase 1 negotiation: 10.130.0.253[500]<=>2.2.2.2[500]
Jan 19 21:00:37 ip-10-100-200-112 racoon: INFO: begin Identity Protection mode.
Jan 19 21:00:38 ip-10-100-200-112 racoon: INFO: ISAKMP-SA established 10.130.0.253[500]-2.2.2.2[500] spi:8b7e98a2cc9d55cb:0b4e8a4cbca2ada9
Jan 19 21:00:38 ip-10-100-200-112 racoon: INFO: initiate new phase 2 negotiation: 10.130.0.253[500]<=>2.2.2.2[500]
Jan 19 21:00:39 ip-10-100-200-112 racoon: INFO: IPsec-SA established: ESP/Tunnel 10.130.0.253[500]->2.2.2.2[500] spi=111999639(0x6acfa97)
Jan 19 21:00:39 ip-10-100-200-112 racoon: INFO: IPsec-SA established: ESP/Tunnel 10.130.0.253[500]->2.2.2.2[500] spi=577442054(0x226b1106)

Sunday:

Jan 20 21:00:38 ip-10-100-200-112 racoon: INFO: ISAKMP-SA expired 10.130.0.253[500]-2.2.2.2[500] spi:8b7e98a2cc9d55cb:0b4e8a4cbca2ada9
Jan 20 21:00:38 ip-10-100-200-112 racoon: INFO: ISAKMP-SA deleted 10.130.0.253[500]-2.2.2.2[500] spi:8b7e98a2cc9d55cb:0b4e8a4cbca2ada9
Jan 20 21:00:39 ip-10-100-200-112 racoon: INFO: IPsec-SA expired: ESP/Tunnel 2.2.2.2[500]->10.130.0.253[500] spi=131435403(0x7d58b8b)
Jan 20 21:00:39 ip-10-100-200-112 racoon: INFO: IPsec-SA expired: ESP/Tunnel 10.130.0.253[500]->2.2.2.2[500] spi=272995718(0x10459586)
Jan 20 21:12:39 ip-10-100-200-112 racoon: INFO: IPsec-SA expired: ESP/Tunnel 2.2.2.2[500]->10.130.0.253[500] spi=131435403(0x7d58b8b)
Jan 20 21:12:39 ip-10-100-200-112 racoon: INFO: IPsec-SA expired: ESP/Tunnel 10.130.0.253[500]->2.2.2.2[500] spi=272995718(0x10459586)
Jan 20 21:12:40 ip-10-100-200-112 racoon: INFO: IPsec-SA request for 2.2.2.2 queued due to no phase1 found.
Jan 20 21:12:40 ip-10-100-200-112 racoon: INFO: initiate new phase 1 negotiation: 10.130.0.253[500]<=>2.2.2.2[500]
Jan 20 21:12:40 ip-10-100-200-112 racoon: INFO: begin Identity Protection mode.
Jan 20 21:12:40 ip-10-100-200-112 racoon: INFO: ISAKMP-SA established 10.130.0.253[500]-2.2.2.2[500] spi:e6d2b9ccb25f4992:31807020144b9a1e
Jan 20 21:12:41 ip-10-100-200-112 racoon: INFO: initiate new phase 2 negotiation: 10.130.0.253[500]<=>2.2.2.2[500]
Jan 20 21:12:41 ip-10-100-200-112 racoon: INFO: IPsec-SA established: ESP/Tunnel 10.130.0.253[500]->2.2.2.2[500] spi=179370287(0xab0f92f)
Jan 20 21:12:41 ip-10-100-200-112 racoon: INFO: IPsec-SA established: ESP/Tunnel 10.130.0.253[500]->2.2.2.2[500] spi=1696204357(0x651a0645)
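
The gaps in the racoon log can be measured the same way. The Python sketch below uses only the "ISAKMP-SA expired" and "ISAKMP-SA established" lines copied from the log above. Syslog timestamps carry no year, so 2013 is an assumption taken from the monitoring output, and the helper names are made up for this illustration.

```python
import re
from datetime import datetime

LOG_YEAR = 2013  # assumption: syslog omits the year; taken from the monitoring output

# Phase 1 lines copied from the racoon log above.
RACOON_LOG = """\
Jan 18 20:24:32 ip-10-100-200-112 racoon: INFO: ISAKMP-SA expired 10.130.0.253[500]-2.2.2.2[500] spi:13b2510d0bc467f9:ff649237b81a65b7
Jan 18 20:48:34 ip-10-100-200-112 racoon: INFO: ISAKMP-SA established 10.130.0.253[500]-2.2.2.2[500] spi:c4978718cd291fde:01245a461d26cc34
Jan 19 20:48:34 ip-10-100-200-112 racoon: INFO: ISAKMP-SA expired 10.130.0.253[500]-2.2.2.2[500] spi:c4978718cd291fde:01245a461d26cc34
Jan 19 21:00:38 ip-10-100-200-112 racoon: INFO: ISAKMP-SA established 10.130.0.253[500]-2.2.2.2[500] spi:8b7e98a2cc9d55cb:0b4e8a4cbca2ada9
Jan 20 21:00:38 ip-10-100-200-112 racoon: INFO: ISAKMP-SA expired 10.130.0.253[500]-2.2.2.2[500] spi:8b7e98a2cc9d55cb:0b4e8a4cbca2ada9
Jan 20 21:12:40 ip-10-100-200-112 racoon: INFO: ISAKMP-SA established 10.130.0.253[500]-2.2.2.2[500] spi:e6d2b9ccb25f4992:31807020144b9a1e
"""

EVENT_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ racoon: INFO: ISAKMP-SA (expired|established) "
)


def phase1_events(text, year=LOG_YEAR):
    """Return (timestamp, kind) tuples for Phase 1 expiry and establishment lines."""
    found = []
    for line in text.splitlines():
        match = EVENT_RE.match(line)
        if match:
            when = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
            found.append((when, match.group(2)))
    return found


def seconds_between(found, start, end):
    """Seconds from each `start` event to the next `end` event."""
    result, began = [], None
    for when, kind in found:
        if kind == start:
            began = when
        elif kind == end and began is not None:
            result.append(int((when - began).total_seconds()))
            began = None
    return result


phase1 = phase1_events(RACOON_LOG)
print("time without a Phase 1 SA (s):", seconds_between(phase1, "expired", "established"))  # [1442, 724, 722]
print("Phase 1 SA lifetime (s):", seconds_between(phase1, "established", "expired"))         # [86400, 86400]
```

Each Phase 1 SA lives for exactly 86400 seconds, matching `lifetime time 24 hour`, and the host then has no Phase 1 SA for roughly 24, 12 and 12 minutes.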

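So it seems that a new Phase 1 is not negotiated until at least 12 minutes after the old one expires (24 minutes on Friday), even though the negotiation itself completes within a second or two once it starts. Does anyone know why this might be, and what we can do to fix it so that traffic over the VPN is uninterrupted?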

Best Answer

If you enable dead peer detection (DPD), racoon should detect the Phase 1 expiration and renegotiate automatically, assuming I've diagnosed the problem correctly.

By default, DPD is disabled:

dpd_delay 0; is the default.

Setting it to a reasonable interval, in seconds, between DPD checks enables it:

dpd_delay 30;

Now to try this in anger on my own IPsec VPN, which was doing the same thing.