Linux – DHCP troubleshooting – client timeouts

dhcp linux

I have set up two Ubuntu 13.10 servers. One of them is in a small test network with two more clients, separated from our company's main network but connected to the Internet. This server is my test box, where I test new software. It runs DHCP, PXE and DNS.

The other one is in our main network, currently serving as both DHCP server and DNS forwarder.

Problem:

In our big network, about 90% of the time DHCP requests time out before they arrive at the DHCP server, and I don't know why. DHCPDISCOVER entries do not appear in syslog on the Ubuntu server. In my small test network, by contrast, DHCPDISCOVER shows up very quickly.

Things I noticed

  • Windows 7 and XP clients, probably thanks to their persistence, do get their IP addresses, but only after a delay that I normally don't see on the test network (or my home network, for that matter).
  • The Intel Boot Agent shipped with Dell Optiplex 620, 755 or 780 machines times out more often than not.
  • CloneZilla, when launched from a USB stick, starts requesting an IP address from DHCP with a timeout of 3 seconds and does not succeed. It then retries several times with increasing timeouts, and only once the timeout is above 15 or 18 seconds is an IP obtained.
  • Other programs that try to get an IP with a timeout of 15 seconds or less do not succeed.

Environment

  • Our router is only a router: it neither serves DHCP nor acts as a DNS forwarder.
  • The Ubuntu box is a separate machine that currently does only these two things.
  • Between the machine I am testing and the DHCP server there is one switch, a gigabit Cisco SB SGE2010. Of course, there are machines that are two to four switches away.
  • The DHCP server's eth0 is 100 Mbit.
  • Responses from the same machine to DNS queries are immediate (I test by resolving addresses I haven't used in a while).
  • I can ping anything from this server, and the server from any machine, with no packet loss.
  • Our network comprises about 140 machines, including statically addressed network equipment.
  • We use only one subnet, 192.168.0.0/24, and no VLANs (yet).
  • My DHCP server has no pool to assign from, only static leases. I tried adding a pool too, but that didn't change a thing, so I'm back to square one.

EDIT: The Ubuntu server reports 9% memory, 1.23% processor and 2.5% HDD usage.

That's about all. I don't know whether I should install a sniffer on my Ubuntu box to see if the DHCP discovery packets are actually arriving. Please point me to how to troubleshoot this case.
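One way to check whether DHCPDISCOVER packets reach the server at all is to watch the wire on the server itself. The following is only a minimal sketch of such a sniffer, not a replacement for a real capture tool. It assumes Linux, root privileges, untagged IPv4 Ethernet frames, and an interface called eth0. The function names are made up for this illustration.

```python
import socket
import struct
import time

DHCP_MAGIC = b"\x63\x82\x53\x63"  # magic cookie that precedes the DHCP options
DHCP_TYPES = {1: "DHCPDISCOVER", 2: "DHCPOFFER", 3: "DHCPREQUEST", 4: "DHCPDECLINE",
              5: "DHCPACK", 6: "DHCPNAK", 7: "DHCPRELEASE", 8: "DHCPINFORM"}


def dhcp_payload_from_frame(frame):
    """Return the UDP payload of an untagged IPv4 frame using port 67 or 68, else None."""
    if len(frame) < 14 + 20 + 8 or frame[12:14] != b"\x08\x00":
        return None                       # too short, or not IPv4
    ihl = (frame[14] & 0x0F) * 4          # IPv4 header length in bytes
    if frame[14 + 9] != 17:               # IP protocol 17 = UDP
        return None
    udp = 14 + ihl
    if len(frame) < udp + 8:
        return None
    sport, dport = struct.unpack("!HH", frame[udp:udp + 4])
    if {sport, dport} & {67, 68}:
        return frame[udp + 8:]
    return None


def dhcp_message_type(payload):
    """Return the name of the DHCP message type (option 53), or None if absent."""
    if len(payload) < 240 or payload[236:240] != DHCP_MAGIC:
        return None
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:                   # end option
            break
        if code == 0:                     # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break
        length = payload[i + 1]
        if code == 53 and length == 1 and i + 2 < len(payload):
            return DHCP_TYPES.get(payload[i + 2])
        i += 2 + length
    return None


def client_mac(payload):
    """Format the first six bytes of chaddr (offset 28) as a MAC address."""
    return ":".join(f"{b:02x}" for b in payload[28:34])


def sniff(interface="eth0"):
    """Print every DHCP message seen on the interface (Linux only, needs root)."""
    # 0x0003 is ETH_P_ALL: receive frames of every protocol.
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(0x0003)) as s:
        s.bind((interface, 0))
        while True:
            payload = dhcp_payload_from_frame(s.recv(65535))
            mtype = dhcp_message_type(payload) if payload else None
            if mtype:
                print(time.strftime("%H:%M:%S"), mtype, client_mac(payload))
```

Calling `sniff("eth0")` as root while a client boots shows whether its DHCPDISCOVER arrives late or never. If nothing shows up during the client's first attempts, the packets are being lost before they reach the server rather than being ignored by it.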

Best Answer

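Enable PortFast on all the client-facing ports. Without PortFast, there is a fairly long delay between when a port comes up and when it can start passing traffic, because spanning tree first holds the port in its listening and learning states (about 30 seconds with the classic STP default timers). This is long enough that most DHCP implementations will time out.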
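To see why short client timeouts lose against this delay, here is a rough back-of-the-envelope model. It is a sketch under assumptions, not a measurement of this network: the port is assumed to drop everything for 30 seconds after link-up (the classic spanning tree default of 15 s listening plus 15 s learning), and the client is assumed to start at a 3-second timeout and double it on each retry. The function name and parameters are hypothetical.

```python
def first_successful_attempt(port_delay_s, first_timeout_s=3.0, backoff=2.0, max_attempts=6):
    """Return (attempt number, send time in seconds) of the first DHCPDISCOVER
    sent after the port has started forwarding, or None if the client gives up first."""
    elapsed = 0.0              # time since link-up at which the next attempt is sent
    timeout = first_timeout_s
    for attempt in range(1, max_attempts + 1):
        if elapsed >= port_delay_s:
            return attempt, elapsed   # the port forwards now, so this DISCOVER gets through
        elapsed += timeout            # the DISCOVER was dropped; the client waits out its timeout
        timeout *= backoff            # and retries with a longer one
    return None


for delay in (0.0, 30.0):
    print(delay, first_successful_attempt(delay))
```

Under these assumed numbers, a client on a PortFast port (delay 0) succeeds on its first attempt, while a client behind a 30-second delay only gets through on attempt 5, sent 45 seconds after link-up. A client that gives up after three attempts, such as a PXE boot ROM with a short retry budget, never gets an address.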