Linux – Getting ‘ads_connect: No logon servers’ at irregular intervals

active-directory linux networking samba ubuntu

I'm currently setting up a Samba 4 AD server (on Ubuntu Server 16.04), with about 10 Linux/Windows members planned. After successfully provisioning the domain controller, I joined the first Xubuntu 16.04 client to the domain.

At first I was able to log in on the client with a Samba user account, and wbinfo -u and getent passwd both listed all Samba accounts. A few minutes later I tried to log in again, but the logon screen only displayed the Kerberos warning that my password is about to expire in 41 days.

getent passwd now lists only the local users. wbinfo -u switches inconsistently between an empty list and the Samba users.

net ads info -d 3 returns the following:

ads_connect: No logon servers
ads_connect: No logon servers
Didn't find the ldap server!

Deleting /var/cache/samba/gencache.tdb and /var/run/samba/gencache_notrans.tdb often changes the output to:

LDAP server: 10.230.44.1
LDAP server name: dc1.samdom.com # not the original domain
Realm: SAMDOM.COM
Bind Path: dc=SAMDOM,dc=COM
LDAP port: 389
Server time: Sa, 15 Okt 2016 18:01:33 CEST
KDC server: 10.230.44.1
Server time offset: 0

But after some time it falls back to the output above. Sometimes simply waiting also does the trick.

I've got the same problem on a second client but not at the same time.

The server is inside a university network and also serves as a NAT router for the Samba clients. The clients can, however, also get internet access if they use a non-private IP address.

smb.conf of the server:

[global] 
    workgroup = SAMDOM
    realm = SAMDOM.COM 
    netbios name = DC1
    server role = active directory domain controller 
    dns forwarder = xxx.yyy.xxx.yyy
    idmap_ldb:use rfc2307 = Yes 

    # Only listen to the internal network 
    interfaces = eno2 
    bind interfaces only = Yes 

[netlogon] 
    path = /var/lib/samba/sysvol/samdom.com/scripts 
    read only = No 

[sysvol] 
    path = /var/lib/samba/sysvol 
    read only = No

smb.conf on client:

[global] 
    netbios name = M1
    security = ADS
    workgroup = SAMDOM
    realm = SAMDOM

    log file = /var/log/samba/%m.log 
    log level = 1 

    # Default idmap config used for BUILTIN and local windows accounts/groups 
    idmap config *:backend = tdb 
    idmap config *:range = 2000-9999 

    # idmap config for domain SAMDOM 
    idmap config SAMDOM:backend = rid 
    idmap config SAMDOM:range = 10000-99999 

    # Use template settings for login shell and home directory 
    winbind nss info = template 
    template shell = /sbin/bash 
    template homedir = /home/%U 

    winbind enum users = Yes 
    winbind enum groups = Yes 
    winbind use default domain = Yes 
    encrypt passwords = Yes

The same setup without the server acting as NAT router, but with normal IP addresses, shows the same behaviour.

Best Answer

The client realm is SAMDOM instead of SAMDOM.COM.
This may just be a typo in the question. Also, the DC's interfaces setting ought to include the loopback interface lo.
The net ads info output shown contains no debug messages, although it is said to have been run with -d 3.
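
As a quick sanity check for the realm mismatch mentioned above, the realm value can be pulled out of an smb.conf and compared with the expected one. The following is a minimal sketch, not a Samba tool: smbconf_get and realm_matches are hypothetical helper names, and the parser ignores section headers and simply returns the first matching parameter, which is enough for a [global]-only client config like the one in the question.

```shell
# Hypothetical helper (not part of Samba): print the value of the first
# "key = value" line in an smb.conf-style file. Sections are ignored.
smbconf_get() {
    # $1 = path to the config file, $2 = parameter name, e.g. "realm"
    awk -v key="$2" '
        index($0, "=") {
            k = substr($0, 1, index($0, "=") - 1)
            v = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim the value
            if (tolower(k) == tolower(key)) { print v; exit }
        }' "$1"
}

# Succeeds only if the realm in the file equals the expected realm
# (compared case-insensitively).
realm_matches() {
    # $1 = path to the config file, $2 = expected realm, e.g. SAMDOM.COM
    have=$(smbconf_get "$1" realm | tr '[:lower:]' '[:upper:]')
    want=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
    [ "$have" = "$want" ]
}
```

Assuming the usual /etc/samba/smb.conf location, realm_matches /etc/samba/smb.conf SAMDOM.COM would fail on the client config from the question, where realm is SAMDOM. Where Samba is installed, testparm -s prints the configuration as Samba itself parses it and is the more reliable check.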

The client and server realms have to match (and should resolve to a DNS domain). net ads info attempts to resolve various DNS names, including:

_ldap._tcp.Default-First-Site-Name._sites.dc._msdcs.<your realm>
_kerberos._tcp.Default-First-Site-Name._sites.dc._msdcs.<your realm>

If these cannot be resolved, it falls back to NetBIOS name resolution on <your workgroup>#1c.
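
To see whether those lookups succeed, the two SRV names listed above can be generated for a concrete realm and queried by hand. This is a small sketch: dc_srv_names is a hypothetical helper name, the site defaults to Default-First-Site-Name as in the names above, and lowercasing the realm is only cosmetic since DNS names are case-insensitive. The commented-out query assumes dig is installed and uses the DC address from the question.

```shell
# Print the site-specific SRV names quoted above for a given realm.
# $1 = realm (e.g. SAMDOM.COM), $2 = AD site (optional)
dc_srv_names() {
    realm=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    site=${2:-Default-First-Site-Name}
    for svc in _ldap _kerberos; do
        printf '%s._tcp.%s._sites.dc._msdcs.%s\n' "$svc" "$site" "$realm"
    done
}

# Example (needs dig and network access, so it is left commented out):
#   for name in $(dc_srv_names SAMDOM.COM); do
#       dig +short SRV "$name" @10.230.44.1
#   done
```

An empty answer from the DC for these names would suggest that the client never gets past DNS and ends up in the NetBIOS fallback.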

When I delete the caches, the <your workgroup>#1c entry is resolved to the IP of the DC and stays valid for 660 seconds.
Otherwise the <your workgroup>#1c value is discarded because its timeout is negative (equal to 0 - <seconds since epoch>).
The 660-second timeout is the NAMECACHETIMEOUT value.
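
The arithmetic behind the negative timeout can be illustrated with a small sketch. It assumes, as described above, that the remaining lifetime is the stored expiry time minus the current time in seconds since the epoch. cache_ttl_left is a hypothetical helper for illustration only, not Samba code, and the timestamp in the comments is just an example value.

```shell
# Hypothetical illustration (not Samba code): remaining lifetime of a cache
# entry, given its absolute expiry time and the current time, both in
# seconds since the epoch.
cache_ttl_left() {
    # $1 = expiry timestamp, $2 = current timestamp (defaults to now)
    now=${2:-$(date +%s)}
    echo $(( $1 - now ))
}

# An entry stored with the 660-second timeout still has 660 seconds left:
#   cache_ttl_left $((1476547293 + 660)) 1476547293    # prints 660
# An entry whose expiry is 0 yields 0 - <seconds since epoch>, which is
# negative, so the entry counts as expired:
#   cache_ttl_left 0 1476547293                        # prints -1476547293
```

On a client, net cache list should show the cached entries together with their timeouts, and net cache flush may be a less invasive alternative to deleting the tdb files by hand.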

All in all, this should get you a step further, but not out of the No logon servers error. Even if the server is resolved, the client will issue a CLDAP netlogon request, which will fail.
You can check with:

ldapsearch -LLL -h <your server ip> -x -b '' -s base "(&(NtVer=\06\00\00\00)(DnsDomain=<your realm>))" NetLogon

Debug logs are missing to sort this out.
If it is intermittent, likely another DNS server has stale entries, and at times the DC is queried for domain names while at other times the other DNS server is.
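
One way to look for such a second DNS server on a client is to list every nameserver entry that is not the DC. This is a rough sketch under the assumption that the resolver configuration lives in /etc/resolv.conf. non_dc_nameservers is a hypothetical helper name.

```shell
# Hypothetical helper: print nameserver entries that differ from the DC's IP.
non_dc_nameservers() {
    # $1 = IP of the DC, $2 = resolver config (defaults to /etc/resolv.conf)
    awk -v dc="$1" '$1 == "nameserver" && $2 != dc { print $2 }' \
        "${2:-/etc/resolv.conf}"
}

# Example: non_dc_nameservers 10.230.44.1
```

Any address printed here is a resolver that may answer instead of the DC. On systems with a local caching resolver the file may list only a loopback address, in which case the upstream servers have to be checked in that resolver's own configuration.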

NB: <your realm>, <your workgroup> and <your server ip> are placeholders.