If the name server is only authoritative (i.e. it's not also providing recursive service for your network), simply remove the "root hints" section from /etc/named.conf.
This typically looks something like this:
    zone "." IN {
        type hint;
        file "named.ca";
    };
Authoritative servers don't need this zone.
Doing this should result in the server returning REFUSED rather than a copy of the root name servers to external clients.
Also, as your server is authoritative only, you should add:
    recursion no;
in the main configuration block (the options block of named.conf).
The whole thing kind of reeks of a "not my problem" scenario that's not really your fault and should/could be 100% resolved by taking the appropriate action, regardless of how "difficult" or "hard" it is, and that's terminating your open recursive server.
Phase it out: tell the customers that this server is going away as of X date. After that time, they need to install a patch (assuming you have one) to stop it from using your DNS server. This is done all the time. Sysadmins, network admins, helpdesk guys, programmers? We get it; this end-of-life thing happens all the time, because it's standard operating procedure for a vendor/service provider/partner to tell us to stop using something after X date. We don't always like it, but it's a fact of life in IT.
You say you don't have this issue on the current devices, so I'm assuming you've resolved this issue with a firmware update or patch. I know you said you can't touch the device, but surely they can? I mean, if they're allowing these boxes to essentially phone home to you, they can't really be that anal about who's doing what to their devices; you could have a reverse proxy set up for all they know, so why not have them install a patch that fixes this, or tell them to use their own DNS servers? Surely your device supports DHCP; I can't think of a network device (no matter how old/frail/odd) that doesn't.
If you can't do that, the next thing to do is control who can access your recursive server: you say that it's "hard to tell" who's using it and how, but it's time to find out for certain and start dropping traffic that's not legitimate.
These are "quasi-military/government" organizations, right? Well, they likely are part of a legitimate netblock that they own; these devices aren't home routers behind dynamic IPs. Find out. Contact them, explain the problem and how you are saving them a lot of money by not forcing a firmware or product replacement if only they can confirm the netblock/IP address that the device will be using to access your DNS server.
This is done all the time: I have several customers who restrict extranet access or HL7 listeners to healthcare partners in this way; it's not that hard to get them to fill out a form and provide the IP and/or netblock I should be expecting traffic from: if they want access to the extranet, they have to give me an IP or subnet. And this is rarely a moving target, so it's not like you're going to get inundated with hundreds of IP change requests every day: big campus hospital networks that own their own netblocks with hundreds of subnets and thousands and thousands of host IPs routinely give me a handful of IP addresses or a subnet I should be expecting; again, these aren't laptop users wandering all around campus all the time, so why would I expect to see UDP source packets from an ever-changing IP address? Clearly I'm making an assumption here, but I'll bet it's not as much as you think for < 100s of devices. Yes, it'll be a lengthy ACL, and yes, it requires some maintenance and communication (gasp!) but it's the next best thing outside of shutting it down completely.
If for some reason the channels of communication are not open (or somebody's too afraid or can't be bothered to contact these legacy device owners and do this properly), you need to establish a baseline of normal usage/activity so you can formulate some other strategy that will reduce (but not prevent) your participation in DNS amplification attacks.
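As a rough illustration of the allow-list idea above, here is a minimal Python sketch of the membership check an ACL performs. The netblocks are documentation-only placeholder ranges and `is_allowed` is a hypothetical helper name; in practice you would express the same list in your firewall or in the DNS server's own ACL syntax rather than in a script.

```python
import ipaddress

# Hypothetical allow-list: documentation-only netblocks standing in for the
# ranges your customers would confirm on their forms.
ALLOWED_NETBLOCKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
    ipaddress.ip_network("2001:db8:42::/48"),
]


def is_allowed(source_ip: str) -> bool:
    """Return True if source_ip falls inside any confirmed netblock."""
    addr = ipaddress.ip_address(source_ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so a mixed list is safe to scan as-is.
    return any(addr in net for net in ALLOWED_NETBLOCKS)
```

Running captured source addresses through a check like this before you enforce anything is a cheap way to see how much legitimate traffic a proposed ACL would drop.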
A long-running tcpdump should work, filtering on incoming UDP 53, along with verbose logging on the DNS server application. I would also want to start collecting source IP addresses/netblocks/geoIP information (are all your clients in the US? Block everything else) because, as you say, you're not adding any new devices, you're merely providing a legacy service to existing installations.
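To make the baseline concrete, here is a small Python sketch that tallies queries by source /24, record type and parent domain. It assumes you have already turned the capture or the server's query log into `(source_ip, qtype, qname)` tuples (that parsing step is not shown), that sources are IPv4, and that grouping on the last two labels is a good enough stand-in for a real public-suffix lookup. `summarize` is a hypothetical helper name.

```python
from collections import Counter
import ipaddress


def summarize(queries):
    """Tally queries; `queries` is an iterable of (source_ip, qtype, qname)."""
    by_netblock = Counter()
    by_qtype = Counter()
    by_domain = Counter()
    for src, qtype, qname in queries:
        # Group IPv4 sources by /24 so one busy site shows up as one row.
        net = ipaddress.ip_network(f"{src}/24", strict=False)
        by_netblock[str(net)] += 1
        by_qtype[qtype.upper()] += 1
        # Crude "last two labels" grouping, e.g. www.example.com -> example.com.
        labels = qname.rstrip(".").lower().split(".")
        by_domain[".".join(labels[-2:])] += 1
    return by_netblock, by_qtype, by_domain
```

A few days of these counters should show which netblocks, record types and domains are normal for the legacy devices, and anything far outside that set becomes a candidate for blocking.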
The capture will also help you understand what record types are being requested, for what domains, by whom, and how often: for DNS amplification to work as intended, the attacker needs to be able to request a large record type (1) for a functioning domain (2).
"large record type": do your devices even need TXT or SOA records to be able to be resolved by your recursive DNS server? You may be able to specify which record types are valid on your DNS server; I believe it's possible with BIND and perhaps Windows DNS, but you'd have to do some digging. If your DNS server responds with SERVFAIL to any TXT or SOA queries, at least that response is an order of magnitude (or two) smaller than the payload that was intended. Obviously you're still "part of the problem" because the spoofed victim would still be getting those SERVFAIL responses from your server, but at least you're not hammering them, and perhaps your DNS server gets "delisted" over time from the harvested list(s) the bots use for not "cooperating".
"functioning domain": you may be able to whitelist only domains that are valid. I do this on my hardened data center setups where the server(s) only need Windows Update, Symantec, etc. to function. However, you're just mitigating the damage you're causing at this point: the victim would still get bombarded with NXDOMAIN or SERVFAIL responses from your server because your server would still respond to the forged source IP. Again, the bot script might also automatically update its open server list based on results, so this could get your server removed.
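The two filters above boil down to a simple per-query decision. The Python sketch below only illustrates that logic; it is not how BIND or Windows DNS implement it. The allowed record types and the domain suffixes are assumptions you would replace with whatever your baseline shows the devices actually need, and `decide` is a hypothetical helper name.

```python
# Assumption: the legacy devices only ever need address records.
ALLOWED_QTYPES = {"A", "AAAA"}
# Hypothetical whitelist of domains the devices legitimately look up.
ALLOWED_SUFFIXES = ("example.com", "example.net")


def decide(qname: str, qtype: str) -> str:
    """Return "resolve" for whitelisted queries, "reject" for everything else."""
    name = qname.rstrip(".").lower()
    if qtype.upper() not in ALLOWED_QTYPES:
        return "reject"  # large record types never get a large answer
    # Match the domain itself or any name underneath it, but not look-alikes
    # such as "evilexample.com".
    if not any(name == s or name.endswith("." + s) for s in ALLOWED_SUFFIXES):
        return "reject"
    return "resolve"
```

Whatever error the server sends for a rejected query is still sent to the spoofed address, so this shrinks the amplification factor rather than removing your server from the attack.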
I'd also use some form of rate limiting, as others have suggested, either at the application level (i.e. message size, requests per client limitations) or the firewall level (see the other answers), but again, you're going to have to do some analysis to ensure you're not killing legitimate traffic.
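For a feel of what "requests per client" limiting means, here is a minimal per-client token bucket in Python. The token bucket is my choice of technique for the illustration, not something the other answers prescribe, and the rate and burst values are placeholders you would derive from your baseline. Real deployments would normally use the DNS server's or firewall's built-in rate limiting instead of custom code.

```python
import time


class ClientRateLimiter:
    """Per-client token bucket: `rate_per_sec` sustained, `burst` allowed at once."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = float(rate_per_sec)
        self.burst = float(burst)
        self.clock = clock      # injectable so the limiter can be tested
        self.buckets = {}       # client -> (tokens, last_seen_timestamp)

    def allow(self, client) -> bool:
        now = self.clock()
        tokens, last = self.buckets.get(client, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[client] = (tokens - 1.0, now)
            return True
        self.buckets[client] = (tokens, now)
        return False
```

Keying the buckets on a netblock rather than a single address may suit this case better, since spoofed queries for one victim all carry the same source address while legitimate devices are spread across sites.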
An Intrusion Detection System that's been tuned and/or trained (again, need a baseline here) should be able to detect abnormal traffic over time by source or volume as well, but would likely take regular babysitting/tuning/monitoring to prevent false positives and/or see if it's actually preventing attacks.
At the end of the day, you have to wonder if all this effort is worth it or if you should just insist that the right thing is done and that's eliminating the problem in the first place.
Best Answer
With the remote packets having a source port of 53, one of four things is usually the case:
Lots of people mistake #1 for an attack on their server (#3). You have to look at the amount of incoming traffic to gauge that, and since you aren't complaining about your bandwidth being choked by this, it's unlikely. Let's rule #3 out.
Our next hint is the query name: tnczmz.x99moyu.net. Names like this are familiar to anyone who has operated high volume recursive DNS servers in the last few years (and has been paying attention): this is a "Pseudo Random Subdomain" aka "Water Torture" attack. I won't get into exhaustive detail here, but the idea is to generate thousands of uncacheable random queries under a domain so that all the requests are sent up to the nameservers for that domain, the true victim of the attack. In this case they're the nameservers for Cloudflare:
Since I'm pretty sure you aren't a server admin for Cloudflare, we now need to consider the direction of the packets in order to rule out #1. Here's what we have to work with:
A Chinese IP is sending you DNS reply packets. We know they're reply packets because they contain a DNS answer section. Since neither you nor this remote IP is owned by Cloudflare (their nameservers manage the domain), we can assume that one of these IPs is spoofed. Given the victim of the attack (Cloudflare) and how the attack operates, the address most likely being spoofed here is yours. That would make the Chinese IP a server that is receiving the recursive queries.
My conclusion is that you are neither the target of the attack nor participating in the attack (which is the case more often than not; you got lucky). This is a case of #4: your source IP is just getting spoofed as a part of someone else covering their trail while performing an attack against Cloudflare.
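If you want to confirm the direction of a packet from the raw bytes rather than from the capture tool's summary, the fixed 12-byte DNS header carries a QR flag (set for responses) and an answer count. The Python sketch below decodes just those fields from a UDP payload; `parse_dns_header` is a hypothetical helper, and getting the payload out of your capture file is not shown.

```python
import struct


def parse_dns_header(payload: bytes) -> dict:
    """Decode the fixed 12-byte DNS header at the start of a UDP payload."""
    if len(payload) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    # Six big-endian 16-bit fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!6H", payload[:12]
    )
    return {
        "id": ident,
        "is_response": bool(flags & 0x8000),  # QR bit: 0 = query, 1 = response
        "rcode": flags & 0x000F,              # response code in the low 4 bits
        "questions": qdcount,
        "answers": ancount,
    }
```

A packet with `is_response` set that your host never sent a matching query for is consistent with the spoofing explanation above.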
In hindsight, #2 is still a possibility. Even if the server holding your IP does not provide a DNS listener, your server could be compromised and running malware that is generating the queries. This assumes that your packet capture is overlooking the outbound queries, of course. (it dawned on me that it's bad to assume that this would have been noticed)