Bind9 recursive queries have intermittent very long response times

bindperformance-tuningtroubleshootingubuntu-12.04

I'm running BIND 9.8.1-P1 caching server on Ubuntu 12.04 LTS on kernel 3.2.0-86-generic-pae.

I've been banging my head against the wall trying to figure this out for weeks and I just have no idea what is going on and am having a hard time even recreating the problem as it is very intermittent.

What is happening is that when loading a page such as www.msn.com in Firefox with Firefox's Network Monitor open I occasionally see a few requests that have the DNS resolution part of their timing anywhere from 8000ms to as high as 37000ms resolution times. Of course once I reload the page it is cached and then works fine but even if I flush the cache on both the server and the client suddenly that domain is no longer a problem. I haven't been able to see any pattern in which domains have an issue. I just have no idea how to even being troubleshooting this because it is so hard to replicate for the same domain twice.

Here is all I have in named.conf.options

options {
        directory "/var/cache/bind";

        // If there is a firewall between you and nameservers you want
        // to talk to, you may need to fix the firewall to allow multiple
        // ports to talk.  See http://www.kb.cert.org/vuls/id/800113

        // If your ISP provided one or more IP addresses for stable
        // nameservers, you probably want to use them as forwarders.
        // Uncomment the following block, and insert the addresses replacing
        // the all-0's placeholder.

        forwarders {
                //OpenDNS
                208.67.222.222;
                208.67.220.220;
        };

        auth-nxdomain no;    # conform to RFC1035
        //listen-on-v6 { any; };
};

I'm really having a hard time figuring out what is going on let alone how to fix it so I hope someone here can help.

As requested here is the log records generated when turning query logging on and then loading www.wsj.com. Only 4 requests has slow DNS responses

tags.bluekai.com    11986ms

Jul 30 17:50:14 fs named[23539]: client 192.168.16.24#55215: query: tags.bluekai.com IN A + (192.168.16.3)
Jul 30 17:50:15 fs named[23539]: client 192.168.16.24#55215: query: tags.bluekai.com IN A + (192.168.16.3)
Jul 30 17:50:16 fs named[23539]: client 192.168.16.24#55215: query: tags.bluekai.com IN A + (192.168.16.3)
Jul 30 17:50:18 fs named[23539]: client 192.168.16.24#55215: query: tags.bluekai.com IN A + (192.168.16.3)
Jul 30 17:50:22 fs named[23539]: client 192.168.16.24#55215: query: tags.bluekai.com IN A + (192.168.16.3)

comcluster.cxense.com   24004ms

Jul 30 17:38:32 fs named[23539]: client 192.168.16.24#52469: query: comcluster.cxense.com IN A + (192.168.16.3)
Jul 30 17:50:14 fs named[23539]: client 192.168.16.24#56490: query: comcluster.cxense.com IN A + (192.168.16.3)
Jul 30 17:50:14 fs named[23539]: client 192.168.16.24#54692: query: comcluster.cxense.com IN A + (192.168.16.3)
Jul 30 17:50:15 fs named[23539]: client 192.168.16.24#54692: query: comcluster.cxense.com IN A + (192.168.16.3)    
Jul 30 17:50:15 fs named[23539]: client 192.168.16.24#56490: query: comcluster.cxense.com IN A + (192.168.16.3)
Jul 30 17:50:16 fs named[23539]: client 192.168.16.24#54692: query: comcluster.cxense.com IN A + (192.168.16.3)    
Jul 30 17:50:16 fs named[23539]: client 192.168.16.24#56490: query: comcluster.cxense.com IN A + (192.168.16.3)
Jul 30 17:50:18 fs named[23539]: client 192.168.16.24#54692: query: comcluster.cxense.com IN A + (192.168.16.3)    
Jul 30 17:50:18 fs named[23539]: client 192.168.16.24#56490: query: comcluster.cxense.com IN A + (192.168.16.3)
Jul 30 17:50:22 fs named[23539]: client 192.168.16.24#56490: query: comcluster.cxense.com IN A + (192.168.16.3)    
Jul 30 17:50:22 fs named[23539]: client 192.168.16.24#54692: query: comcluster.cxense.com IN A + (192.168.16.3)
Jul 30 17:50:26 fs named[23539]: client 192.168.16.24#56963: query: comcluster.cxense.com IN A + (192.168.16.3)
Jul 30 17:50:27 fs named[23539]: client 192.168.16.24#56963: query: comcluster.cxense.com IN A + (192.168.16.3)
Jul 30 17:50:28 fs named[23539]: client 192.168.16.24#56963: query: comcluster.cxense.com IN A + (192.168.16.3)
Jul 30 17:50:28 fs named[23539]: client 192.168.16.24#56963: query: comcluster.cxense.com IN A + (192.168.16.3)
Jul 30 17:50:30 fs named[23539]: client 192.168.16.24#56963: query: comcluster.cxense.com IN A + (192.168.16.3)
Jul 30 17:50:34 fs named[23539]: client 192.168.16.24#56963: query: comcluster.cxense.com IN A + (192.168.16.3)

cdn.cxpublic.com    23509ms

Jul 30 17:50:26 fs named[23539]: client 192.168.16.24#57469: query: cdn.cxpublic.com IN A + (192.168.16.3)
Jul 30 17:50:27 fs named[23539]: client 192.168.16.24#57469: query: cdn.cxpublic.com IN A + (192.168.16.3)
Jul 30 17:50:28 fs named[23539]: client 192.168.16.24#57469: query: cdn.cxpublic.com IN A + (192.168.16.3)
Jul 30 17:50:30 fs named[23539]: client 192.168.16.24#57469: query: cdn.cxpublic.com IN A + (192.168.16.3)
Jul 30 17:50:34 fs named[23539]: client 192.168.16.24#57469: query: cdn.cxpublic.com IN A + (192.168.16.3)

www.facebook.com    33666ms

Jul 30 17:39:37 fs named[23539]: client 192.168.16.24#54063: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:26 fs named[23539]: client 192.168.16.24#63809: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:26 fs named[23539]: client 192.168.16.24#61280: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:27 fs named[23539]: client 192.168.16.24#63809: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:27 fs named[23539]: client 192.168.16.24#61280: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:28 fs named[23539]: client 192.168.16.24#63809: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:28 fs named[23539]: client 192.168.16.24#61280: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:30 fs named[23539]: client 192.168.16.24#63809: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:30 fs named[23539]: client 192.168.16.24#61280: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:34 fs named[23539]: client 192.168.16.24#63809: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:34 fs named[23539]: client 192.168.16.24#61280: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:38 fs named[23539]: client 192.168.16.24#64685: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:39 fs named[23539]: client 192.168.16.24#64685: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:40 fs named[23539]: client 192.168.16.24#64685: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:42 fs named[23539]: client 192.168.16.24#64685: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:46 fs named[23539]: client 192.168.16.24#64685: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:50 fs named[23539]: client 192.168.16.24#57121: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:51 fs named[23539]: client 192.168.16.24#57121: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:52 fs named[23539]: client 192.168.16.24#57121: query: www.facebook.com IN A + (192.168.16.3)
Jul 30 17:50:54 fs named[23539]: client 192.168.16.24#57121: query: www.facebook.com IN A + (192.168.16.3)

Best Answer

Thank you to everyone who contributed to helping me solve this but I appear to have accidentally solved the problem. Replacing the router appears to have eliminated the problem entirely. I have done extensive testing loading various pages from the Alexa 500 in Firefox and I am no longer able to recreate the problem at all. The router I replaced was very old and I wasn't replacing it in an attempt to solve this problem but it appears that the old router was the source of the issue I was experiencing.

For the record the old router was a Netgear WGR614v5 and I replaced it with a TP-LINK Archer C7.

Related Topic