Linux – Losing connectivity with DNS

binddebiandomain-name-systemlinux

I have set up an 'internal' DNS at my work, basically we have an example.com domain name that is for internet, email etc and I have created on one of our linux network servers (debian) a DNS using bind9 with the domain example.inc.

So based on my files below and the symptoms I'm describing; What can I do to fix this?

These are the critical (I think) files I have modified:

named.conf.local

zone "example.inc" {
        type master;
        file "/etc/bind/zones/example.inc.db";
};
zone "201.168.192.in-addr.arpa" {
        type master;
        file "/etc/bind/zones/rev.201.168.192.in-addr.arpa";
};

named.conf.options

options {
        directory "/var/cache/bind";

        // If there is a firewall between you and nameservers you want
        // to talk to, you may need to fix the firewall to allow multiple
        // ports to talk.  See http://www.kb.cert.org/vuls/id/800113

        // If your ISP provided one or more IP addresses for stable
        // nameservers, you probably want to use them as forwarders.
        // Uncomment the following block, and insert the addresses replacing
        // the all-0's placeholder.

        forwarders {
                1.2.3.4; //IP of our external DNS provider
        };

        auth-nxdomain no;    # conform to RFC1035
        listen-on-v6 { any; };
};

example.inc.db

$TTL 86400
example.inc.      IN      SOA     ns1.ipower.com. admin.example.inc. (
                                                        2006081401
                                                        28800
                                                        3600
                                                        604800
                                                        38400
)
serv1                IN      A               192.168.201.223
serv2                IN      A               192.168.201.220
serv3         IN      A               192.168.201.219
ns1.ipower.com.      IN      A               1.2.3.4
ns2.ipower.com.      IN      A               1.2.3.5
@                    IN      NS              ns1.ipower.com.
@                    IN      NS              ns2.ipower.com.
svn                  IN      CNAME           serv1
docs                 IN      CNAME           serv2
jira                 IN      CNAME           serv3
confluence           IN      CNAME           serv3
fisheye              IN      CNAME           serv3

rev.201.168.192.in-addr.arpa

$TTL 86400
201.168.192.in-addr.arpa. IN SOA ns1.ipower.com. admin.example.inc. (
                        2006081401;
                        28800;
                        604800;
                        604800;
                        86400
)

223                    IN    PTR    serv1
@                      IN    NS     ns1.ipower.com.
@                      IN    NS     ns2.ipower.com.

named.conf

include "/etc/bind/named.conf.options";
include "/etc/bind/named.conf.local";
include "/etc/bind/named.conf.default-zones";

I then made our internal DNS my preferred DNS with the two external DNSs the next in-line. More the most part this seems to work, I can ping svn.example.inc and it resolves to the correct IP, I can also ping google.com and it also resolves no problem. So all seem good.

However, periodically (couple of times a day at least), I lose the ability to ping the svn.example.inc (and all others defined under the internal DNS). What seem to fix the issue temporarily is to make a change to the network adapter of the client machine and then revert the change. Then it works for a bit but will always fail again.

System Info

Internal DNS

Distributor ID: Debian
Description:    Debian GNU/Linux 6.0.6 (squeeze)
Release:        6.0.6
Codename:       squeeze

Linux 2.6.32-5-686 i686

BIND 9.7.3

PC

OS Name:                   Microsoft Windows 7 Professional
OS Version:                6.1.7601 Service Pack 1 Build 7601
System Type:               x64-based PC

Network Card(s):           2 NIC(s) Installed.
                           [01]: Realtek PCIe GBE Family Controller
                                 Connection Name: WORK LAN
                                 DHCP Enabled:    No
                                 IP address(es)
                                 [01]: the.ipv4.address
                                 [02]: the:ipv6:address

Result of dig +trace

; <<>> DiG 9.3.2 <<>> +trace
;; global options:  printcmd
.                       49341   IN      NS      h.root-servers.net.
.                       49341   IN      NS      k.root-servers.net.
.                       49341   IN      NS      i.root-servers.net.
.                       49341   IN      NS      g.root-servers.net.
.                       49341   IN      NS      a.root-servers.net.
.                       49341   IN      NS      e.root-servers.net.
.                       49341   IN      NS      f.root-servers.net.
.                       49341   IN      NS      d.root-servers.net.
.                       49341   IN      NS      j.root-servers.net.
.                       49341   IN      NS      c.root-servers.net.
.                       49341   IN      NS      b.root-servers.net.
.                       49341   IN      NS      l.root-servers.net.
.                       49341   IN      NS      m.root-servers.net.
;; Received 244 bytes from 192.168.201.223#53(192.168.201.223) in 3 ms

.                       518400  IN      NS      a.root-servers.net.
.                       518400  IN      NS      b.root-servers.net.
.                       518400  IN      NS      c.root-servers.net.
.                       518400  IN      NS      d.root-servers.net.
.                       518400  IN      NS      e.root-servers.net.
.                       518400  IN      NS      f.root-servers.net.
.                       518400  IN      NS      g.root-servers.net.
.                       518400  IN      NS      h.root-servers.net.
.                       518400  IN      NS      i.root-servers.net.
.                       518400  IN      NS      j.root-servers.net.
.                       518400  IN      NS      k.root-servers.net.
.                       518400  IN      NS      l.root-servers.net.
.                       518400  IN      NS      m.root-servers.net.
;; Received 492 bytes from 128.63.2.53#53(h.root-servers.net) in 478 ms

System log during bind9 restart

root@DET4A:~# tail -f /var/log/syslog
Oct 22 14:51:49 DET4A named[17248]: zone 255.in-addr.arpa/IN: loaded serial 1
Oct 22 14:51:49 DET4A named[17248]: /etc/bind/zones/dsasystems.inc.db:12: ignoring out-of-zone data (ns1.ipower.com)
Oct 22 14:51:49 DET4A named[17248]: /etc/bind/zones/dsasystems.inc.db:13: ignoring out-of-zone data (ns2.ipower.com)
Oct 22 14:51:49 DET4A named[17248]: zone example.inc/IN: loaded serial 2006081401
Oct 22 14:51:49 DET4A named[17248]: zone localhost/IN: loaded serial 2
Oct 22 14:51:49 DET4A named[17248]: managed-keys-zone ./IN: loading from master file managed-keys.bind failed: file not found
Oct 22 14:51:49 DET4A named[17248]: managed-keys-zone ./IN: loaded serial 0
Oct 22 14:51:49 DET4A named[17248]: zone example.inc/IN: sending notifies (serial 2006081401)
Oct 22 14:51:49 DET4A named[17248]: zone 201.168.192.in-addr.arpa/IN: sending notifies (serial 2006081401)
Oct 22 14:51:49 DET4A named[17248]: running
Oct 22 14:56:51 DET4A named[17248]: received control channel command 'stop -p'
Oct 22 14:56:51 DET4A named[17248]: shutting down: flushing changes
Oct 22 14:56:51 DET4A named[17248]: stopping command channel on 127.0.0.1#953
Oct 22 14:56:51 DET4A named[17248]: stopping command channel on ::1#953
Oct 22 14:56:51 DET4A named[17248]: no longer listening on ::#53
Oct 22 14:56:51 DET4A named[17248]: no longer listening on 127.0.0.1#53
Oct 22 14:56:51 DET4A named[17248]: no longer listening on 192.168.201.223#53
Oct 22 14:56:51 DET4A named[17248]: exiting
Oct 22 14:56:52 DET4A named[17303]: starting BIND 9.7.3 -u bind
Oct 22 14:56:52 DET4A named[17303]: built with '--prefix=/usr' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sysconfdir=/etc/bind' '--localstatedir=/var' '--enable-threads' '--enable-largefile' '--with-libtool' '--enable-shared' '--enable-static' '--with-openssl=/usr' '--with-gssapi=/usr' '--with-gnu-ld' '--with-dlz-postgres=no' '--with-dlz-mysql=no' '--with-dlz-bdb=yes' '--with-dlz-filesystem=yes' '--with-dlz-ldap=yes' '--with-dlz-stub=yes' '--with-geoip=/usr' '--enable-ipv6' 'CFLAGS=-fno-strict-aliasing -DDIG_SIGCHASE -O2' 'LDFLAGS=' 'CPPFLAGS='
Oct 22 14:56:52 DET4A named[17303]: adjusted limit on open files from 1024 to 1048576
Oct 22 14:56:52 DET4A named[17303]: found 2 CPUs, using 2 worker threads
Oct 22 14:56:52 DET4A named[17303]: using up to 4096 sockets
Oct 22 14:56:52 DET4A named[17303]: loading configuration from '/etc/bind/named.conf'
Oct 22 14:56:52 DET4A named[17303]: reading built-in trusted keys from file '/etc/bind/bind.keys'
Oct 22 14:56:52 DET4A named[17303]: using default UDP/IPv4 port range: [1024, 65535]
Oct 22 14:56:52 DET4A named[17303]: using default UDP/IPv6 port range: [1024, 65535]
Oct 22 14:56:52 DET4A named[17303]: listening on IPv6 interfaces, port 53
Oct 22 14:56:52 DET4A named[17303]: listening on IPv4 interface lo, 127.0.0.1#53
Oct 22 14:56:52 DET4A named[17303]: listening on IPv4 interface eth0, 192.168.201.223#53
Oct 22 14:56:52 DET4A named[17303]: generating session key for dynamic DNS
Oct 22 14:56:52 DET4A named[17303]: set up managed keys zone for view _default, file 'managed-keys.bind'
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: 254.169.IN-ADDR.ARPA
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: 2.0.192.IN-ADDR.ARPA
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: 100.51.198.IN-ADDR.ARPA
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: 113.0.203.IN-ADDR.ARPA
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: 255.255.255.255.IN-ADDR.ARPA
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.IP6.ARPA
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.IP6.ARPA
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: D.F.IP6.ARPA
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: 8.E.F.IP6.ARPA
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: 9.E.F.IP6.ARPA
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: A.E.F.IP6.ARPA
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: B.E.F.IP6.ARPA
Oct 22 14:56:52 DET4A named[17303]: automatic empty zone: 8.B.D.0.1.0.0.2.IP6.ARPA
Oct 22 14:56:52 DET4A named[17303]: command channel listening on 127.0.0.1#953
Oct 22 14:56:52 DET4A named[17303]: command channel listening on ::1#953
Oct 22 14:56:52 DET4A named[17303]: the working directory is not writable
Oct 22 14:56:52 DET4A named[17303]: zone 0.in-addr.arpa/IN: loaded serial 1
Oct 22 14:56:52 DET4A named[17303]: zone 127.in-addr.arpa/IN: loaded serial 1
Oct 22 14:56:52 DET4A named[17303]: zone 201.168.192.in-addr.arpa/IN: loaded serial 2006081401
Oct 22 14:56:52 DET4A named[17303]: zone 255.in-addr.arpa/IN: loaded serial 1
Oct 22 14:56:52 DET4A named[17303]: /etc/bind/zones/dsasystems.inc.db:12: ignoring out-of-zone data (ns1.ipower.com)
Oct 22 14:56:52 DET4A named[17303]: /etc/bind/zones/dsasystems.inc.db:13: ignoring out-of-zone data (ns2.ipower.com)
Oct 22 14:56:52 DET4A named[17303]: zone dsasystems.inc/IN: loaded serial 2006081401
Oct 22 14:56:52 DET4A named[17303]: zone localhost/IN: loaded serial 2
Oct 22 14:56:52 DET4A named[17303]: managed-keys-zone ./IN: loading from master file managed-keys.bind failed: file not found
Oct 22 14:56:52 DET4A named[17303]: managed-keys-zone ./IN: loaded serial 0
Oct 22 14:56:52 DET4A named[17303]: zone dsasystems.inc/IN: sending notifies (serial 2006081401)
Oct 22 14:56:52 DET4A named[17303]: running
Oct 22 14:56:52 DET4A named[17303]: zone 201.168.192.in-addr.arpa/IN: sending notifies (serial 2006081401)

resolve.conf on DNS

search example.inc
nameserver 209.253.113.18 //This is the IP of the external DNS provider

To be honest, with regards to the resolve.conf file, I am not sure what sort of role it plays on the DNS side.

Best Answer

Big thanks to smithian for ultimately providing the answer to this one.

The issue seems to be caused by the DNS server priorities not being cleared all the time. It seems that sometimes the preferred DNS is not used and thus the link can't be resolved.

This link on Microsoft Support Site details the issue and provides the solution also.

The Fix

  1. Open registry editor in windows - enter regedit in search window under start menu'=
  2. Navigate to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Dnscache\Parameters
  3. Add a new REG_DWORD called ServerPriorityTimeLimit and assign the value 0

This will ensure that the DNS server priorities are reset before deciding what DNS to use.