Systems are migrating to systemd-resolved.
systemd-resolved can resolve the local hostname (including the FQDN) and localhost, so it becomes possible to run with an empty /etc/hosts file.
systemd-resolved is dbus based, so getaddrinfo calls will end up in dbus calls.
dbus in turn will call nss_winbind if the connection uid is unknown (dbus auth EXTERNAL).
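To illustrate why an unknown uid reaches nss_winbind, here is a minimal sketch of a passwd lookup by uid. It assumes the lookup goes through getpwuid_r; the helper name uid_is_known is hypothetical and is not taken from dbus. With "passwd: compat winbind", a uid that is missing from /etc/passwd falls through to the winbind module.

```c
#define _POSIX_C_SOURCE 200112L
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not dbus code.
 * Returns 1 if some "passwd:" module in nsswitch.conf knows the uid,
 * 0 otherwise. Each configured module is asked in order, so with
 * "compat winbind" a miss in /etc/passwd results in a call to winbindd.
 */
int uid_is_known(uid_t uid)
{
	struct passwd pwd;
	struct passwd *result = NULL;
	char buf[16384];

	if (getpwuid_r(uid, &pwd, buf, sizeof(buf), &result) != 0) {
		/* Lookup error: treat the uid as unknown. */
		return 0;
	}
	return result != NULL;
}
```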
On Ubuntu 18.04 I discovered a case where this happens:
1) /etc/hosts is empty
2) /etc/nsswitch.conf is configured like:
passwd: compat winbind
group: compat winbind
hosts: files resolve
3) smb.conf is configured with
kerberos method = secrets and keytab
winbind offline logon = Yes
4) pam_winbind is configured with krb5_auth
To me it seems the following happens:
1. pam_winbind connects to winbindd and sends user/password login request
2. winbind checks the credentials and setuid's to the user's uid, which is used for creating the kerberos cache
3. fill_mem_keytab_from_system_keytab will call name_to_fqdn
4. name_to_fqdn will call getaddrinfo
5. nss_resolve will connect to dbus
6. dbus wants to know which user this uid belongs to
7. since compat (/etc/passwd) does not contain a matching entry, dbus connects to winbind
8. somehow in this situation winbind is unable to reply to dbus -> deadlock.
9. as soon as pam_winbind gives up, the deadlock seems to get released. dbus gets an answer and caches it (the next time the deadlock will not happen).
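Steps 3 to 5 can be sketched as follows. This is only an approximation of the kind of lookup name_to_fqdn performs, assuming it uses getaddrinfo with AI_CANONNAME; the function lookup_fqdn is a hypothetical stand-in and not Samba's code.

```c
#define _POSIX_C_SOURCE 200112L
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for name_to_fqdn, not Samba's code.
 * Returns 0 and copies the canonical name into fqdn on success,
 * -1 on invalid arguments, lookup failure, or a too-small buffer.
 */
int lookup_fqdn(const char *name, char *fqdn, size_t len)
{
	struct addrinfo hints;
	struct addrinfo *res = NULL;
	int ret = -1;

	if (name == NULL || fqdn == NULL || len == 0) {
		return -1;
	}

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = AI_CANONNAME;

	/*
	 * getaddrinfo walks the "hosts:" modules from nsswitch.conf.
	 * With "hosts: files resolve" and an empty /etc/hosts, the
	 * request reaches nss_resolve and therefore dbus.
	 */
	if (getaddrinfo(name, NULL, &hints, &res) != 0) {
		return -1;
	}

	if (res->ai_canonname != NULL && strlen(res->ai_canonname) < len) {
		strcpy(fqdn, res->ai_canonname);
		ret = 0;
	}
	freeaddrinfo(res);
	return ret;
}
```

When this runs inside winbindd under the user's uid, the dbus side then has to resolve that uid, which leads back to winbindd (steps 6 and 7).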
The real problem is that it is highly unsafe to call dbus from within winbind as long as there are such locks.
As far as I checked this is the only place where getaddrinfo is called when getuid()!=0.
Since it makes no sense to call fill_mem_keytab_from_system_keytab if getuid()!=0 (/etc/krb5.keytab should have mode 600, so only root can read it), I added a check for this in a test build and so far I have not had any deadlock.
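The check from the test build could look roughly like the sketch below. This is not the actual patch: should_fill_mem_keytab and fill_mem_keytab_guarded are hypothetical names, and fill_fn stands in for fill_mem_keytab_from_system_keytab, whose real signature is not shown here.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not a Samba function.
 * Assumption: /etc/krb5.keytab has mode 600, so only root can read it
 * and filling the memory keytab is pointless for any other uid.
 */
bool should_fill_mem_keytab(uid_t uid)
{
	return uid == 0;
}

/*
 * Hypothetical wrapper showing where the check would sit.
 * fill_fn stands in for fill_mem_keytab_from_system_keytab.
 */
int fill_mem_keytab_guarded(int (*fill_fn)(void))
{
	if (fill_fn == NULL || !should_fill_mem_keytab(getuid())) {
		/* Not root: skip, so no getaddrinfo and no dbus call happens. */
		return 0;
	}
	return fill_fn();
}
```

Skipping the call for non-root uids avoids the getaddrinfo lookup in the setuid'ed context, which is the only place where the dbus round trip was observed.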