The Samba-Bugzilla – Bug 7452
Winbind children not dying after service stop
Last modified: 2014-07-24 23:16:05 UTC
Something quite weird kept me up late last night. I resolved it today, and it may be connected to a bug:
I was trying to get idmap to work, and kept getting "*" as domain in log.winbindd-idmap. (This behaviour has been reported several times on the mailing list and has been fixed with at least two different modifications.)
Searching led me to try the idmap config syntax in smb.conf and to delete gencache.tdb, but to no avail.
It seems this was due to leftover winbindd processes.
This could be related to the nss_wins bug, which I previously encountered.
Notably these winbindd processes would lag a moment in a ps aux, before appearing on-screen. Killing them was possible without using the -9 force switch though.
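For anyone hitting the same symptom, a check for leftover winbindd processes after a service stop might look like the Python sketch below. It is only an illustration: it assumes a procps-style `ps -eo pid,comm` as found on Linux, and the function names `winbindd_pids` and `running_winbindd_pids` are made up here and are not part of Samba.

```python
import subprocess


def winbindd_pids(ps_output):
    """Extract PIDs of winbindd processes from `ps -eo pid,comm` style output."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skip the header line and anything that is not "<pid> <command>".
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        if parts[1].strip() == "winbindd":
            pids.append(int(parts[0]))
    return pids


def running_winbindd_pids():
    """Ask ps for the live process list (assumes a procps-style ps)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,comm"], capture_output=True, text=True, check=True
    )
    return winbindd_pids(result.stdout)
```

After stopping the service, `running_winbindd_pids()` would be expected to return an empty list. Any PIDs it still reports are candidates for a plain kill, which was enough in the case described here.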
Sadly I am at the moment not able to produce exact reproduction instructions/conditions.
It might be helpful for future releases to wipe caches in case they appear to be inconsistent, and testparm should be somewhat more critical of idmap config options. Also the case of the "*" domain should be thrown as an error right away...
Sorry, but I don't quite understand what the bug is.
What was the problem? Invalid idmap config or leftover processes?
or a combination of both?
Could you please add your smb.conf file and the log file
that contains those idmap messages?
Cheers - Michael
(In reply to comment #1)
> Sorry, but I don't quite understand what the bug is.
> What was the problem? Invalid idmap config or leftover processes?
> or a combination of both?
There was no problem in smb.conf; the idmap config worked without any changes after I killed the leftover processes (unless the previous winbindd processes were still running on the old idmap smb.conf syntax, which is quite possibly the case).
> Could you please add your smb.conf file and the log file
> that contains those idmap messages?
> Cheers - Michael
smb.conf is a pretty bog-standard domain-client/idmap enabled setup:
passdb backend = tdbsam
idmap config RICKNET : backend = ad
idmap config RICKNET : readonly = yes
idmap config RICKNET : schema_mode = rfc2307
idmap config RICKNET : range = 10000-20000
idmap uid = 10000-20000
idmap gid = 10000-20000
winbind nss info = rfc2307
winbind enum users = yes
winbind enum groups = yes
is the relevant section.
But, I don't believe the error to lie there at all.
I don't think I have the logs anymore, but the error that was reported (aside from an occasional small note saying "winbindd is already running") told me that no connection to the domain "*" could be established.
For some reason the old winbindd children had the domain set to *, which is clearly wrong.
I don't think the bug itself is so bad - but the behaviour of winbind should be somewhat more informative. The case of the asterisk domain should throw a hard error. Also, new instances should fail more obviously if old instances are detected.
The idmap configuration issue which I addressed concerns the new "idmap config xxxx" syntax, which testparm apparently doesn't enforce. As this was another source of the asterisk-domain error (according to the mailing list), checks should be added.
So in other words: the bug is the lack of urgency with which winbind complains about some errors. The message that another instance was running was in a logfile that otherwise gave a lot of unproblematic info (log.winbind), and the errors were in another logfile (-idmap) that didn't point in any way to the source of the problem. And all the while the winbind service would start and not obviously fail in any way, except that idmap was not working.
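As a rough idea of the kind of check meant here, the Python sketch below flags an smb.conf that mixes old-style "idmap uid" / "idmap gid" lines with the new "idmap config DOMAIN : option" syntax. It is not how testparm works and not a Samba API. The function name `check_idmap_syntax` is hypothetical, and whether such a mix actually causes the "*" domain error is an assumption based on the mailing list reports mentioned above.

```python
import re

# New-style line, e.g. "idmap config RICKNET : backend = ad"
NEW_STYLE = re.compile(r"^idmap\s+config\s+([^\s:]+)\s*:\s*[^=]+=", re.IGNORECASE)
# Old-style line, e.g. "idmap uid = 10000-20000"
OLD_STYLE = re.compile(r"^idmap\s+(uid|gid)\s*=", re.IGNORECASE)


def check_idmap_syntax(conf_text):
    """Return a list of warning strings for mixed idmap syntax."""
    domains = set()
    old_params = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Ignore blank lines and smb.conf comments ("#" or ";").
        if not line or line[0] in "#;":
            continue
        new_match = NEW_STYLE.match(line)
        if new_match:
            domains.add(new_match.group(1))
            continue
        old_match = OLD_STYLE.match(line)
        if old_match:
            old_params.append("idmap " + old_match.group(1).lower())
    warnings = []
    if domains and old_params:
        warnings.append(
            "old-style %s mixed with 'idmap config' for %s"
            % (", ".join(old_params), ", ".join(sorted(domains)))
        )
    return warnings
```

Run against the smb.conf excerpt quoted in this report, the function returns a single warning naming RICKNET, since that excerpt contains both styles.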
I hope this is a little more clear now.
I have never seen this. As you mention kill -9, maybe you did that to the parent of those children. Anyway, if you encounter the problem again and have log files, feel free to reopen the bug. For now we close it. Thanks for reporting ...