Ubuntu – SSSD fails service discovery for AD Global Catalog despite explicitly defined URIs


I'm running into problems integrating SSSD with Active Directory. My AD environment has 9 Domain Controllers, some of which are firewalled and inaccessible for various security reasons, so service discovery won't work with SSSD. That should be fine, since I should be able to define the URIs explicitly.

The relevant section of my sssd.conf that defines these URIs looks like this:

ldap_uri = ldap://myserver.domain
ldap_backup_uri = ldap://backup-server.domain
krb5_server = myserver.domain

However, when I test this configuration with getent group mygroup@domain.com, I see these messages at the start of the activity for that lookup in the SSSD domain log (debug_level = 9):

(Tue Nov  7 17:20:26 2017) [sssd[be[mydomain]]] [sdap_id_op_connect_step] (0x4000): beginning to connect
(Tue Nov  7 17:20:26 2017) [sssd[be[mydomain]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD_GC'
(Tue Nov  7 17:20:26 2017) [sssd[be[mydomain]]] [ad_get_dc_servers_send] (0x0400): Looking up domain controllers in domain mydomain
(Tue Nov  7 17:20:26 2017) [sssd[be[mydomain]]] [resolv_discover_srv_next_domain] (0x0400): SRV resolution of service 'ldap'. Will use DNS discovery domain 'mydomain'

It eventually finds a DC that is inaccessible, and fails the service resolution:

(Tue Nov  7 17:20:26 2017) [sssd[be[mydomain]]] [sdap_connect_host_resolv_done] (0x0400): Connecting to ldap://inaccessible-host.mydomain:389
(Tue Nov  7 17:20:56 2017) [sssd[be[mydomain]]] [fo_resolve_service_timeout] (0x0080): Service resolving timeout reached
(Tue Nov  7 17:20:56 2017) [sssd[be[mydomain]]] [sss_ldap_init_state_destructor] (0x0400): closing socket [26]
(Tue Nov  7 17:20:56 2017) [sssd[be[mydomain]]] [sdap_handle_release] (0x2000): Trace: sh[0xc861f0], connected[0], ops[(nil)], ldap[(nil)], destructor_lock[0], release_memory[0]
(Tue Nov  7 17:20:56 2017) [sssd[be[mydomain]]] [be_resolve_server_done] (0x1000): Server resolution failed: 14
(Tue Nov  7 17:20:56 2017) [sssd[be[mydomain]]] [sdap_id_op_connect_done] (0x0400): Failed to connect to server, but ignore mark offline is enabled.
(Tue Nov  7 17:20:56 2017) [sssd[be[mydomain]]] [sdap_id_op_connect_done] (0x4000): notify error to op #1: 5 [Input/output error]
(Tue Nov  7 17:20:56 2017) [sssd[be[mydomain]]] [acctinfo_callback] (0x0100): Request processed. Returned 3,5,Group lookup failed

I've scoured the man pages for an option either to disable service discovery (which, it seems, should already be accomplished by the URIs defined in my config) or to control the behavior of this 'AD_GC' service (which I infer is the AD Global Catalog service, based on this article: https://community.hortonworks.com/articles/92314/sssd-1.html).

I'm at a loss, however, as to how to control this behavior: either disable it or point it at the appropriate server.

I'm running SSSD version 1.13.4 on Ubuntu 16.04.3, if that helps at all.

Best Answer

I don't know SSSD, but I do know the relationship between AD and DNS. The log line "SRV resolution of service 'ldap'", combined with the fact that you have purposely isolated some Domain Controllers, leads me to believe that the lookup errors out when it tries to query an isolated DC taken from the domain's LDAP SRV records. If that's the case, prevent those DCs from registering their SRV records in DNS by setting this registry value on each isolated DC:

Registry key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Netlogon\Parameters
Value name: DnsAvoidRegisterRecords
Data type: REG_MULTI_SZ
Data value: Ldap
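
To work out which DCs are the ones tripping up discovery, something like the following could be run from the Ubuntu client. It is only a rough sketch: the host names passed in are placeholders for whatever the domain's LDAP SRV records actually return, ports 389 and 3268 are the standard LDAP and Global Catalog ports, and the function names (port_open, unreachable_dcs) are made up for illustration.

```python
import socket

# Standard AD ports: plain LDAP and the Global Catalog.
PORTS = {"ldap": 389, "gc": 3268}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def unreachable_dcs(hosts, ports=PORTS, timeout=3.0, probe=port_open):
    """Map each host to the service names that could not be reached.

    Hosts that answer on every port are left out of the result.
    `probe` is injectable so the logic can be exercised without a network.
    """
    result = {}
    for host in hosts:
        failed = [name for name, port in ports.items()
                  if not probe(host, port, timeout)]
        if failed:
            result[host] = failed
    return result
```

Calling unreachable_dcs(["myserver.domain", "backup-server.domain", "inaccessible-host.mydomain"]) returns a dict keyed by the hosts that failed; any DC that shows up there is a candidate for the DnsAvoidRegisterRecords setting above.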