Background:
I have an environment with two different AD domains, each in its own forest, each with two Windows Server 2008 R2 domain controllers acting as DNS servers. There is no trust between the domains.
Each DNS server manages the main DNS zone for its AD domain, plus some other zones, including the reverse lookup zone for its IP subnets. All zones are AD-integrated, and all DNS servers that manage a zone are correctly listed as authoritative name servers for that zone.
So, the situation is like this (using fake names and IP addresses):
Domain A:
DNS domain: domainA.dom
IP subnet: 192.168.1
DCs/DNS Servers: serverA1.domainA.dom (192.168.1.1), serverA2.domainA.dom (192.168.1.2)
Authoritative zones: domainA.dom, 1.168.192.in-addr.arpa, somezone.local
Domain B:
DNS domain: domainB.dom
IP subnet: 10.0.0
DCs/DNS Servers: serverB1.domainB.dom (10.0.0.1), serverB2.domainB.dom (10.0.0.2)
Authoritative zones: domainB.dom, 0.0.10.in-addr.arpa, someotherzone.local
DNS servers in domain A have conditional forwarders defined for each zone managed by DNS servers in domain B, forwarding to both domain B's DNS servers; DNS servers in domain B have the opposite configuration. All forwarders are stored in Active Directory.
All is working perfectly, and computers in each domain can resolve forward and reverse DNS queries for both domains, using their domain's DNS servers.
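To make the setup above concrete, here is a minimal Python sketch of how a name or an IP address from domain B ends up matching one of the conditional forwarders defined in domain A. It only models the name matching; it does not talk to any DNS server, and the helper names (`ptr_name`, `matching_forwarder`) are made up for illustration.

```python
import ipaddress

# Zones that domain A's DNS servers forward to domain B (from the setup above).
FORWARDED_ZONES = ["domainB.dom", "0.0.10.in-addr.arpa", "someotherzone.local"]

def ptr_name(ip):
    """Return the name that is queried for a reverse (PTR) lookup of `ip`."""
    return ipaddress.ip_address(ip).reverse_pointer

def matching_forwarder(name, zones=FORWARDED_ZONES):
    """Return the most specific forwarded zone containing `name`, or None."""
    name = name.rstrip(".").lower()
    matches = [z for z in zones
               if name == z.lower() or name.endswith("." + z.lower())]
    return max(matches, key=len) if matches else None

print(ptr_name("10.0.0.1"))
print(matching_forwarder(ptr_name("10.0.0.1")))
print(matching_forwarder("serverB1.domainB.dom"))
print(matching_forwarder("serverA1.domainA.dom"))
```

A reverse lookup for 10.0.0.1 is really a PTR query for a name inside 0.0.10.in-addr.arpa, which is why that zone needs its own conditional forwarder.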
The problem:
I have SCOM 2012 deployed in domain A, with the SCOM agent installed on both DCs; the management packs for Active Directory and DNS Server are installed and up-to-date.
I have a series of alerts like the following ones on both domain controllers; one alert is generated for each forwarded zone and for each server it is forwarded to:
Forwarder someotherzone.local (10.0.0.1) cannot resolve the host name 192.168.1.1,someotherzone.local for serverA1.domainA.dom
Forwarder someotherzone.local (10.0.0.2) cannot resolve the host name 192.168.1.1,someotherzone.local for serverA1.domainA.dom
Forwarder someotherzone.local (10.0.0.1) cannot resolve the host name 192.168.1.2,someotherzone.local for serverA2.domainA.dom
Forwarder someotherzone.local (10.0.0.2) cannot resolve the host name 192.168.1.2,someotherzone.local for serverA2.domainA.dom
Forwarder 0.0.10.in-addr.arpa (10.0.0.1) cannot resolve the host name 192.168.1.1,0.0.10.in-addr.arpa for serverA1.domainA.dom
Forwarder 0.0.10.in-addr.arpa (10.0.0.2) cannot resolve the host name 192.168.1.1,0.0.10.in-addr.arpa for serverA1.domainA.dom
Forwarder 0.0.10.in-addr.arpa (10.0.0.1) cannot resolve the host name 192.168.1.2,0.0.10.in-addr.arpa for serverA2.domainA.dom
Forwarder 0.0.10.in-addr.arpa (10.0.0.2) cannot resolve the host name 192.168.1.2,0.0.10.in-addr.arpa for serverA2.domainA.dom
The only exception is the main AD DNS zone managed by domain B's DNS servers ("domainB.dom"): for that conditional forwarder, no alert is generated and the forwarder availability monitor is green.
Ok, what does this mean?
What are those monitors trying to tell me?
What are they checking?
What's actually wrong?
And why is there no error for the "domainB.dom" zone, which is configured in exactly the same way as the other ones, both as a zone on domain B's DNS servers and as a forwarder on domain A's DNS servers?
Best Answer
Answer found, and it's a bit unpleasant (at least if you were expecting the people creating management packs to actually know what they are doing).
Extracted from the monitor's description:
What this means is that the SCOM agent will try to perform queries like these ones:
This would work for the main AD domain's DNS zone (because DCs automatically register themselves as empty A records for that zone), but would fail for "someotherzone.local", which didn't have any empty A record (I can confirm that, after manually creating one, the alert disappeared and the monitor returned to green).
The third query, of course, would always fail, because it just doesn't make any sense at all to look for an A record in a reverse lookup zone.
Resolution: override the DNS Forwarder Availability Monitor to perform an NS or SOA query for the forwarded zone instead of an A query.
Which is what it should have been doing from the very beginning.
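To compare the two kinds of check by hand, a small Python sketch like the following can print equivalent `nslookup` commands for each forwarded zone and forwarder. This is only an assumption about how to reproduce the behaviour manually, not the monitor's actual implementation; `probe_commands` is a hypothetical helper, and the zone names and addresses are the fake ones from the question.

```python
def probe_commands(zones, forwarders, qtype="A"):
    """Build one nslookup command per (zone, forwarder) pair."""
    return [f"nslookup -type={qtype} {zone} {ip}"
            for zone in zones
            for ip in forwarders]

zones = ["domainB.dom", "someotherzone.local", "0.0.10.in-addr.arpa"]
forwarders = ["10.0.0.1", "10.0.0.2"]

# A queries for the zone name itself: these only succeed where the zone
# has an empty A record.
for cmd in probe_commands(zones, forwarders, "A"):
    print(cmd)

# SOA queries: every authoritative zone has an SOA record, so these are a
# more sensible availability check, including for the reverse lookup zone.
for cmd in probe_commands(zones, forwarders, "SOA"):
    print(cmd)
```

Running the printed commands from one of domain A's DCs should show the A queries failing for every zone without an empty A record, while the SOA queries answer for all three zones as long as the forwarder is reachable.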