Created attachment 10120 [details]
excerpt from testparm output
In our new Samba 3.6.23 (and also 3.6.24) installation, SMB connections sometimes panic; the panics can accumulate and eventually stop the whole server from working. The problem is not reproducible and happens about every 2 or 3 days. I'm not absolutely sure, but it seems more likely to happen during user logins (and under higher load, so possibly a timing problem?).
We never had this problem before under previous versions.
The backend is simple LDAP; winbind is not used.
Configure options: --enable-socket-wrapper --enable-cups --enable-nss-wrapper --with-ldap --with-acl-support --without-ads --enable-pthreadpool --enable-debug --without-wbclient --without-winbind
Attached are an excerpt from the testparm output and two backtraces, both of which start at util_pw.c:82.
Created attachment 10121 [details]
Created attachment 10122 [details]
Stared at the code closely. I don't see how this can happen. Do you have any further hints towards a reproducer? What were the users doing? Is the user with rid 1108 in any way special in LDAP?
(In reply to comment #3)
> Stared at the code closely. I don't see how this can happen. Do you have any
> further hints towards a reproducer? What were the users doing? Is the user with
> rid 1108 in any way special in LDAP?
I cannot find anything special about the users in LDAP (all IDs are unique). But I noticed a pattern in the log excerpt "samba-panictimes.txt", which I will attach here: the panics either happen several times within a short period (one or a few minutes, as in "log.b020.old"), or they recur regularly every one or every two(!) hours (as in "/var/log/samba/log.a174"). I couldn't find a correspondingly regular entry in the Windows clients' logs.
I get the "ldapsam_getsampwsid: Unable to locate SID..." message (also in pdbedit calls), but I don't think it is important. I will also attach a typical stack trace as written to a log file.
One question that comes to mind: could this be a problem with parallel logons of one user on several machines (or in different threads), combined with a timing problem? That would be hard to trace.
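To make the two patterns above easier to check, here is a minimal Python sketch that labels the gaps between consecutive panics as bursts or as roughly hourly repeats. It is only an illustration: it assumes one timestamp per line in the form "YYYY/MM/DD HH:MM:SS", the function names, the 10-minute burst threshold and the 2-minute tolerance are arbitrary choices of mine, and the sample times are made up, not taken from the attached list.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data; real input would be the timestamps
# listed in samba-panictimes.txt (format assumed, not verified).
SAMPLE = [
    "2014/07/14 08:00:05",
    "2014/07/14 09:00:07",
    "2014/07/14 11:00:06",
    "2014/07/14 11:00:40",
    "2014/07/14 11:01:10",
]


def parse_times(lines, fmt="%Y/%m/%d %H:%M:%S"):
    """Parse one timestamp per non-blank line and return them sorted."""
    return sorted(datetime.strptime(s.strip(), fmt) for s in lines if s.strip())


def classify_gaps(times, burst=600, tolerance=120):
    """Count gaps between consecutive panics by kind.

    "burst": less than `burst` seconds apart
    "<n>h":  within `tolerance` seconds of a whole number of hours
    "other": anything else
    """
    labels = []
    for earlier, later in zip(times, times[1:]):
        gap = (later - earlier).total_seconds()
        hours = round(gap / 3600)
        if gap < burst:
            labels.append("burst")
        elif hours >= 1 and abs(gap - hours * 3600) <= tolerance:
            labels.append(f"{hours}h")
        else:
            labels.append("other")
    return Counter(labels)


if __name__ == "__main__":
    for kind, count in sorted(classify_gaps(parse_times(SAMPLE)).items()):
        print(kind, count)
```

If most gaps for a given client fall into the "1h" or "2h" buckets, that would point to something periodic on the client side rather than to login load.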
Created attachment 10131 [details]
list of times when connections panicked
Created attachment 10132 [details]
small log excerpt with panic stacktrace
I also have a lot of broken nmbd processes whenever the panics happen (about 600 the last time!).
The newest release notes (for Samba 4.1.11 and 4.0.21) mention that bug 10735 was fixed. I grepped for the function named there, "unstrcpy", in the source3 directory, and it is used only in files named nmbd_* within the nmbd directory. It may be a rather naive idea, but could there be a connection between that bug and the broken nmbd processes (and the connection panics)?