Caching DNS server (bind9.2): CPU usage is extremely high

bind, cache, cpu-usage, domain-name-system

I have a caching-only DNS server that gets ~3k queries per second. Here are the specs:

Xeon dual-core 2.8 GHz, 4 GB of RAM
CentOS 5.x (kernel 2.6.18-164.15.1.el5PAE)
BIND 9.4.2

rndc status:
recursive clients: 666/4900/5000
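
For reference, the three numbers on that line can be pulled apart with a short script. This is only a minimal sketch: it assumes the usual reading of the fields as current clients / soft limit / hard limit, and the function name `parse_recursive_clients` is made up for illustration.

```python
import re


def parse_recursive_clients(line):
    """Split an 'rndc status' line such as 'recursive clients: 666/4900/5000'.

    Assumption: the fields are current / soft limit / hard limit.
    """
    match = re.search(r"recursive clients:\s*(\d+)/(\d+)/(\d+)", line)
    if match is None:
        raise ValueError("not a 'recursive clients' line: %r" % line)
    current, soft, hard = (int(group) for group in match.groups())
    return {
        "current": current,
        "soft": soft,
        "hard": hard,
        # Fraction of the hard limit currently in use.
        "utilization": current / hard,
    }


stats = parse_recursive_clients("recursive clients: 666/4900/5000")
print(stats["current"], stats["soft"], stats["hard"])
print("{:.1%} of the hard limit".format(stats["utilization"]))
```

Read this way, the server is well below its recursive client limits, so the limit itself does not look like the constraint.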

About 300 new queries (not in cache) per second.

In the single-threaded build, BIND always uses 100% of one core. After I recompiled it with multi-threading, it uses nearly 200% across the two cores 🙁 There is no iowait, only sys and user time. I searched around but couldn't find any information about how BIND uses the CPU. Why does it become the bottleneck?

One more thing, here is RAM usage:

cat /proc/meminfo 
MemTotal:      4147876 kB
MemFree:       1863972 kB
Buffers:        143632 kB
Cached:         372792 kB
SwapCached:          0 kB
Active:        1916804 kB
Inactive:       276056 kB
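
As a rough cross-check of these figures, here is a minimal sketch that subtracts free memory, buffers and page cache from the total. It covers the whole system, not only named, and the helper name `parse_meminfo` is made up for illustration.

```python
MEMINFO = """\
MemTotal:      4147876 kB
MemFree:       1863972 kB
Buffers:        143632 kB
Cached:         372792 kB
SwapCached:          0 kB
Active:        1916804 kB
Inactive:       276056 kB
"""


def parse_meminfo(text):
    """Return a dict mapping each /proc/meminfo key to its value in kB."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


info = parse_meminfo(MEMINFO)
# Memory held by processes: total minus free, buffers and page cache.
used_kb = info["MemTotal"] - info["MemFree"] - info["Buffers"] - info["Cached"]
print("used by processes: %.2f GiB" % (used_kb / 1024 / 1024))
```

On the numbers above this comes to about 1.69 GiB for all processes together.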

I've set max-cache-size to 0 to make sure BIND can use as much RAM as it wants, but it always stops at ~2GB. Since we get uncached queries every second, RAM should theoretically be exhausted eventually, but it isn't.

Do you have any idea?

TIA,

-Gk

Best Answer

Which version of BIND are you using? Versions before BIND 9.5 have known scalability problems under high load; see https://www.dns-oarc.net/files/dnsops-2007/Graff-BIND9-cache.pdf.

Besides:

  • never set max-cache-size to 0 unless you want to expose your server to DoS
  • the maximum size your cache can reach is always bounded by the TTLs of the records it holds
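
To illustrate the second point, here is a back-of-the-envelope sketch of why a cache fed at a steady rate levels off instead of growing forever. It uses the arrival rate times lifetime rule (Little's law). The mean TTL and the per-entry size below are pure assumptions picked for illustration, not measured BIND values, and the function name is hypothetical.

```python
def steady_state_cache_bytes(misses_per_second, mean_ttl_seconds, bytes_per_entry):
    """Estimate the size at which a TTL-bounded cache levels off.

    Entries arrive at `misses_per_second` and each one lives for about
    `mean_ttl_seconds`, so roughly rate * TTL entries are alive at once.
    """
    entries = misses_per_second * mean_ttl_seconds
    return entries * bytes_per_entry


# 300 misses/s comes from the question; the other two values are assumptions.
MEAN_TTL_SECONDS = 3600   # assumed average TTL of one hour
BYTES_PER_ENTRY = 300     # assumed memory cost per cached record

estimate = steady_state_cache_bytes(300, MEAN_TTL_SECONDS, BYTES_PER_ENTRY)
print("estimated steady-state cache: %.2f GiB" % (estimate / 1024 ** 3))
```

Once the oldest records expire as fast as new ones arrive, memory use stops climbing, whatever max-cache-size is set to.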

I recommend you run a side-by-side test with dnscache from djbdns. It takes 10 minutes to install, is extremely simple to tune and maintain, and has predictable performance.