Nova network DHCP not releasing ip with force_dhcp_release=True

dhcp dnsmasq nova-network openstack openstack-nova

The problem I am seeing is that when nova-network calls dhcp_release upon
instance termination (because force_dhcp_release=True), the address is not
always released: syslog shows no DHCPRELEASE request. If nova later assigns that same IP address to a new instance, the new instance's DHCP request is ignored, and syslog shows that dnsmasq saw the request and refused it because the IP address was already leased to a different MAC address (the one belonging to the old, terminated VM).
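One way to confirm whether a release ever reached dnsmasq is to pull every dnsmasq DHCP log line that mentions the address in question. The following is a minimal sketch, not a definitive tool: the helper name `dhcp_events_for_ip` is hypothetical, and the default log path `/var/log/syslog` is an assumption based on the Ubuntu setup described here.

```shell
#!/bin/sh
# Sketch: list dnsmasq DHCP log lines that mention one IP address.
# Assumptions: dnsmasq logs under the "dnsmasq-dhcp" tag and the log
# lives at /var/log/syslog unless another path is given.

dhcp_events_for_ip() {
    ip="$1"
    log="${2:-/var/log/syslog}"
    # -F matches the IP as a fixed string, so dots are not wildcards.
    # -w requires a whole-word match, so 10.0.0.5 does not match 10.0.0.50.
    grep 'dnsmasq-dhcp' "$log" | grep -Fw -- "$ip"
}
```

Calling `dhcp_events_for_ip 10.0.0.5` (a placeholder address) after terminating an instance should show a DHCPRELEASE line if dnsmasq processed the release; if only older DHCPACK lines appear, the release was probably never handled.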

Some details about my setup:

  • Juno release
  • legacy (nova-) network
  • Ubuntu 14.04
  • DHCP handled by DNSMASQ.

When hosts are able to get their IP address from the DHCP server, everything
appears to work perfectly fine. The error seems to occur only when an IP
fails to be released, which then blocks that address from being used by
future VMs.

I checked for any errors in my nova-* logs and don't see any. The only
errors are in my syslog when dnsmasq refuses to lease the IP address due to
the conflicting MAC addresses.

Any info or suggestions would be much appreciated.

Best Answer

I have not yet found a perfect solution for this issue, but I have identified the problem area and have a workaround.

1. Problem Area: The problem lies in dnsmasq, not in OpenStack. I have observed that OpenStack executes dhcp_release every time an instance is terminated, but dnsmasq responds to only a few of those release requests.

2. Workaround: The default lease time for any IP is 24 hours (86400 seconds), which means each instance must renew its lease within 24 hours. If an instance does not renew its lease, dnsmasq considers the lease expired and frees the IP held by it.
I have reduced the lease time to 3 minutes (180 seconds), so no lease can hold an IP for more than 3 minutes after its instance is terminated.
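For testing whether dnsmasq reacts to a release at all, the same call can be issued by hand. The sketch below assumes the `dhcp_release` utility (shipped with dnsmasq's utilities package) with its usual argument order of interface, address, MAC address. The wrapper name `release_lease`, the `DRY_RUN` variable, and the bridge name `br100` are illustrative assumptions, not taken from the setup above.

```shell
#!/bin/sh
# Sketch: build, and optionally run, a manual dhcp_release call.
# Real utility usage: dhcp_release <interface> <address> <MAC address>
# DRY_RUN defaults to 1, so the command is only printed unless DRY_RUN=0.

release_lease() {
    bridge="$1"   # bridge interface dnsmasq listens on (br100 is a guess)
    ip="$2"       # address still held by the stale lease
    mac="$3"      # MAC address of the terminated instance
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "dhcp_release $bridge $ip $mac"
    else
        dhcp_release "$bridge" "$ip" "$mac"
    fi
}
```

For example, `release_lease br100 10.0.0.5 fa:16:3e:00:00:01` prints the command for review, and `DRY_RUN=0 release_lease ...` run as root actually sends the release. Watching syslog afterwards shows whether dnsmasq logged a matching DHCPRELEASE.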

Steps to reduce lease time to 3 minutes:

Perform the following steps on each compute node, one at a time.

  1. Open the file /etc/nova/nova.conf.

vi /etc/nova/nova.conf

  2. In the [DEFAULT] section, set dhcp_lease_time.
    The value is in seconds.

[DEFAULT]
...
dhcp_lease_time = 180

  3. Save and quit the file.

  4. Kill the dnsmasq process for each bridge on the server. (Alternatively, use killall if dnsmasq is not used for any other purpose.)

kill <PID of each dnsmasq process>    (OR killall dnsmasq)

  5. Restart the services.

restart nova-api-metadata
restart nova-compute
restart nova-network
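After the restart, it may be worth checking that the respawned dnsmasq processes actually picked up the new value. This sketch assumes nova-network passes the lease time to dnsmasq as a `--dhcp-lease-time=<seconds>` command-line option; the helper name `lease_time_from_args` is hypothetical.

```shell
#!/bin/sh
# Sketch: extract the --dhcp-lease-time value from a dnsmasq command line.
# Assumption: nova-network starts dnsmasq with --dhcp-lease-time=<seconds>.

lease_time_from_args() {
    # Put each argument on its own line, then print only the value
    # that follows the option name.
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^--dhcp-lease-time=//p'
}

# On a compute node, print one value per running dnsmasq process:
# ps -C dnsmasq -o args= | while read -r args; do
#     lease_time_from_args "$args"
# done
```

If the loop prints 180 for every dnsmasq process, the setting took effect. Any other value suggests that process was not restarted or that nova.conf was not reread.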