Bug 9827 - smbd segfault randomly while client login.
Status: RESOLVED WORKSFORME
Alias: None
Product: Samba 3.6
Classification: Unclassified
Component: User & Group Accounts
Version: 3.6.6
Hardware: x64 Linux
Importance: P5 normal
Target Milestone: ---
Assignee: Samba Bugzilla Account
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-04-24 10:47 UTC by Serge Coulon
Modified: 2022-01-05 17:00 UTC

See Also:


Attachments
gdb b (121.86 KB, application/octet-stream)
2013-04-24 10:47 UTC, Serge Coulon
smbd backtrace for client pccom1 (7.85 KB, text/plain)
2013-04-29 09:24 UTC, Serge Coulon
smbd log level 10 for client pccom1 (900.02 KB, text/plain)
2013-04-29 09:25 UTC, Serge Coulon

Description Serge Coulon 2013-04-24 10:47:08 UTC
Created attachment 8810 [details]
gdb b

Hello,

I'm using Samba as a PDC with an LDAP backend:
- Debian Wheezy
- Samba 3.6.6
- OpenLDAP 2.4.23

From time to time, when a client logs into the domain, I get a segfault (see attachments).
The only thing I can do is stop Samba, kill the remaining smbd processes, and restart the Samba service; then the client can log in again.

I can confirm I have no duplicates in my LDAP tree (my first guess).
I get the segfaults with both Windows XP and Windows 7 clients.

The segfault always occurs in the 'tcopy_passwd' function (see backtrace).

I can't really reproduce this behavior on demand (except that it is often the first client to connect in the morning that crashes the server).

I would be glad to get some pointers on how I can debug these crashes further.

The relevant host account configuration for this report is:
# getent group
Domain Computers:*:515:

# getent passwd
pccom3$:*:1075:515:Computer:/nonexistent:/bin/false

# LDAP entry for host pccom3
uid=pccom3$,ou=machines,dc=webdealauto,dc=com
objectClass: top
objectClass: account
objectClass: posixAccount
objectClass: sambaSamAccount
cn: pccom3$
uid: pccom3$
uidNumber: 1075
gidNumber: 515
homeDirectory: /nonexistent
loginShell: /bin/false
description: Computer
gecos: Computer
sambaSID: S-1-5-21-2380245508-1587309507-2390072590-1017
displayName: PCCOM3$
sambaAcctFlags: [W          ]
sambaNTPassword: E2327DC351F8151CC2D127A89490D701
sambaPwdLastSet: 1366369777


Thanks, regards,

Serge.
Comment 1 Volker Lendecke 2013-04-25 07:13:42 UTC
Looked at the backtrace and the code -- did not see anything special. Can you run with debug level 10 for this client for a while and send us the debug level 10 output of smbd leading to such a crash? See

https://wiki.samba.org/index.php/Client_specific_Log

for how to set up a per-client log. max log size = 100000 should be more than sufficient for this.

Thanks,

Volker
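
A per-client log of the kind requested above could be set up roughly as follows. This is only a sketch: the include file name, the log path, and the client address 192.0.2.10 are assumptions chosen for illustration, and the wiki page linked above remains the authoritative description. The idea is that the main smb.conf includes a file whose name depends on the client address (%I), so that only the client being traced picks up the verbose settings.

```ini
# /etc/samba/smb.conf
[global]
    # %I expands to the client's IP address; the file name pattern is an assumption
    include = /etc/samba/smb.conf.client-%I

# /etc/samba/smb.conf.client-192.0.2.10  (hypothetical file for the one client to trace)
[global]
    log level = 10
    log file = /var/log/samba/log.%I
    max log size = 100000
```

With this layout, other clients keep logging at the normal level, so the debug level 10 output stays limited to the client under investigation.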
Comment 2 Serge Coulon 2013-04-29 09:24:41 UTC
Created attachment 8826 [details]
smbd backtrace for client pccom1
Comment 3 Serge Coulon 2013-04-29 09:25:41 UTC
Created attachment 8827 [details]
smbd log level 10 for client pccom1
Comment 4 Serge Coulon 2013-04-29 09:30:13 UTC
I have attached another backtrace and an smbd log for a particular client (pccom1, Windows 7).

LDAP entry for pccom1:

71 uid=pccom1$,ou=machines,dc=webdealauto,dc=com
objectClass: top
objectClass: account
objectClass: posixAccount
objectClass: sambaSamAccount
cn: pccom1$
uid: pccom1$
uidNumber: 1015
gidNumber: 515
homeDirectory: /dev/null
loginShell: /bin/false
description: Computer
gecos: Computer
sambaSID: S-1-5-21-2380245508-1587309507-2390072590-1011
displayName: PCCOM1$
sambaAcctFlags: [W          ]
sambaNTPassword: 58F3E04BDB8BBDB231F45CF9F7BE6C01
sambaPwdLastSet: 1357301900

Same behavior as before, plus a funny '*** stack smashing detected ***'.

I don't know where to go from here.


Any ideas?


Thanks!

Regards,

Serge.
Comment 5 Alistair Leslie-Hughes 2013-09-24 09:34:40 UTC
Is this still an issue with the latest 3.6 version?
Comment 6 Alistair Leslie-Hughes 2013-09-24 10:00:21 UTC
This looks like a duplicate of 
https://bugzilla.samba.org/show_bug.cgi?id=9686

Fixed in Samba 3.6.13 - March 18, 2013
Comment 7 Tsukasa HAMANO 2013-09-24 11:26:35 UTC
(In reply to comment #6)
> This looks like a duplicate of 
> https://bugzilla.samba.org/show_bug.cgi?id=9686
> 
> Fixed in Samba 3.6.13 - March 18, 2013

Hmm, bug 9686 seems unrelated to this bug.
He has set passdb backend = ldapsam:..., so mod_smbfilepwd_entry() will not be called.
I have received nearly identical core dumps with Samba 3.6.12, but unfortunately they cannot try the latest 3.6.
I have asked for a reproduction attempt with the latest Samba, but it will take time.

Thank you.
Comment 8 Naoto Ishida 2013-11-19 02:37:55 UTC
Is there any news on this topic?

I'm using Samba with an LDAP backend, and the same problem occurred,
but I can't reproduce the issue on demand.
The issue occurred in the early morning.

Samba server
- CentOS 6.4
- Samba 3.6.18 (source compiled)

Ldap server
- OpenLDAP 2.3.43-12.el5_6.7 (yum provided package)

The problem does not seem to occur when "passdb backend=tdbsam" is set in the smb.conf file.
Since I set "passdb backend=tdbsam" in the smb.conf file, I have not seen the crash again.
Comment 9 Serge Coulon 2013-11-19 10:42:38 UTC
Hello,

 I've followed all Debian upgrades up to Samba 3.6.19 with no luck.
The problem still occurs:



 Core was generated by `/usr/sbin/smbd -D'.
Program terminated with signal 6, Aborted.
#0  0x00007fa8d62b01e5 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007fa8d62b01e5 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fa8d62b3398 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fa8d9b50afb in dump_core () at lib/fault.c:391
#3  0x00007fa8d9b5f27e in smb_panic (why=why@entry=0x7fa8d9f3495e "internal error") at lib/util.c:1133
#4  0x00007fa8d9b50424 in fault_report (sig=<optimized out>) at lib/fault.c:53
#5  sig_fault (sig=<optimized out>) at lib/fault.c:76
#6  <signal handler called>
#7  tcopy_passwd (mem_ctx=mem_ctx@entry=0x7fa8dbd69280, from=0x1f0) at ../lib/util/util_pw.c:82
#8  0x00007fa8d9afc946 in pdb_copy_sam_account (dst=dst@entry=0x7fa8dbd69280, src=0x7fa8dbd6e0d0) at passdb/passdb.c:2114
#9  0x00007fa8d9affb25 in pdb_getsampwsid (sam_acct=sam_acct@entry=0x7fa8dbd69280, sid=sid@entry=0x7fff892ba8f0)
    at passdb/pdb_interface.c:414
#10 0x00007fa8d9a36fa1 in _samr_OpenUser (p=p@entry=0x7fa8dbd685b0, r=r@entry=0x7fa8dbd6f070) at rpc_server/samr/srv_samr_nt.c:2204
#11 0x00007fa8d9a47185 in api_samr_OpenUser (p=0x7fa8dbd685b0) at librpc/gen_ndr/srv_samr.c:2745
#12 0x00007fa8d9a57ad6 in rpcint_dispatch (out_data=0x7fa8dbd68e80, in_data=0x7fa8dbd68e70, opnum=<optimized out>, 
    mem_ctx=0x7fa8dbd68e70, p=0x7fa8dbd685b0) at rpc_server/rpc_ncacn_np.c:210

I'm currently evaluating Samba 4.


Regards, 

Serge.
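
A note on frame #7 of the backtrace above: tcopy_passwd is entered with from=0x1f0, which is not a plausible heap address (compare the 0x7fa8... values in the other frames), so any read through it faults. The following C sketch illustrates the failure mode with a deep copy of struct passwd that rejects such a pointer before touching it. All names here (pointer_is_plausible, dup_string, copy_passwd_checked, free_passwd_copy) are hypothetical and use plain malloc; this is not Samba's tcopy_passwd and not a proposed fix.

```c
#include <pwd.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Samba: reject NULL and any address in the
 * first 4 KiB, a range that is normally never mapped on Linux. */
static int pointer_is_plausible(const void *p)
{
    return (uintptr_t)p >= 4096u;
}

/* Duplicate a string with malloc; a NULL input is copied as "". */
static char *dup_string(const char *s)
{
    size_t len;
    char *copy;

    if (s == NULL) {
        s = "";
    }
    len = strlen(s) + 1;
    copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, s, len);
    }
    return copy;
}

/* Release a copy made by copy_passwd_checked(). */
static void free_passwd_copy(struct passwd *pw)
{
    if (pw == NULL) {
        return;
    }
    free(pw->pw_name);
    free(pw->pw_passwd);
    free(pw->pw_gecos);
    free(pw->pw_dir);
    free(pw->pw_shell);
    free(pw);
}

/* Hypothetical deep copy of struct passwd that refuses an implausible
 * source pointer before dereferencing it. */
static struct passwd *copy_passwd_checked(const struct passwd *from)
{
    struct passwd *to;

    if (!pointer_is_plausible(from)) {
        return NULL;
    }
    to = calloc(1, sizeof(*to));
    if (to == NULL) {
        return NULL;
    }
    to->pw_name = dup_string(from->pw_name);
    to->pw_passwd = dup_string(from->pw_passwd);
    to->pw_gecos = dup_string(from->pw_gecos);
    to->pw_dir = dup_string(from->pw_dir);
    to->pw_shell = dup_string(from->pw_shell);
    to->pw_uid = from->pw_uid;
    to->pw_gid = from->pw_gid;
    if (to->pw_name == NULL || to->pw_passwd == NULL || to->pw_gecos == NULL ||
        to->pw_dir == NULL || to->pw_shell == NULL) {
        free_passwd_copy(to);
        return NULL;
    }
    return to;
}
```

A guard like this would only turn the crash into an error return. It says nothing about how the source structure ended up holding 0x1f0 in the first place, which is the actual open question in this report.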
Comment 10 Volker Lendecke 2013-11-19 11:14:54 UTC
Are you able to run smbd under valgrind just for a test? This looks very, very spooky to me.

A quick workaround would be to disable the memcache completely by applying this patch:

--- a/source3/smbd/globals.c
+++ b/source3/smbd/globals.c
@@ -136,7 +136,7 @@ struct memcache *smbd_memcache(void)
                 * children exiting.
                 */
                smbd_memcache_ctx = memcache_init(NULL,
-                                                 lp_max_stat_cache_size()*1024);
+                                                 1);
        }
        if (!smbd_memcache_ctx) {
                smb_panic("Could not init smbd memcache");
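
The patch above replaces the configured budget, lp_max_stat_cache_size()*1024, with a budget of 1. The toy cache below is a sketch of why that effectively disables caching, under the assumption that the cache evicts its oldest entries until the total size fits the budget. The names toy_cache, toy_cache_add, toy_cache_trim and toy_cache_lookup are hypothetical; this is not Samba's memcache implementation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct toy_entry {
    struct toy_entry *next;   /* next older entry */
    size_t size;              /* bytes charged against the budget */
    char *key;
    char *value;
};

struct toy_cache {
    struct toy_entry *head;   /* newest entry first */
    size_t size;              /* bytes currently charged */
    size_t max_size;          /* budget in bytes */
};

/* Duplicate a string with malloc. */
static char *toy_dup(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy != NULL) {
        memcpy(copy, s, len);
    }
    return copy;
}

/* Evict the oldest entries until the cache fits its budget or is empty. */
static void toy_cache_trim(struct toy_cache *c)
{
    while (c->size > c->max_size && c->head != NULL) {
        struct toy_entry **pp = &c->head;
        struct toy_entry *victim;

        while ((*pp)->next != NULL) {
            pp = &(*pp)->next;      /* walk to the last (oldest) entry */
        }
        victim = *pp;
        *pp = NULL;                 /* unlink it */
        c->size -= victim->size;
        free(victim->key);
        free(victim->value);
        free(victim);
    }
}

/* Add an entry as the newest one, then trim. Returns 0 on success, -1 on
 * allocation failure. */
static int toy_cache_add(struct toy_cache *c, const char *key, const char *value)
{
    struct toy_entry *e = calloc(1, sizeof(*e));

    if (e == NULL) {
        return -1;
    }
    e->key = toy_dup(key);
    e->value = toy_dup(value);
    if (e->key == NULL || e->value == NULL) {
        free(e->key);
        free(e->value);
        free(e);
        return -1;
    }
    e->size = strlen(key) + strlen(value) + 2;
    e->next = c->head;
    c->head = e;
    c->size += e->size;
    toy_cache_trim(c);
    return 0;
}

/* Return the cached value for key, or NULL if it is not cached. */
static const char *toy_cache_lookup(const struct toy_cache *c, const char *key)
{
    const struct toy_entry *e;

    for (e = c->head; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            return e->value;
        }
    }
    return NULL;
}
```

In this sketch, a cache initialised with max_size = 1 accepts every add but evicts the entry straight away, so each lookup misses and the caller always falls back to the backend. If the crashes stop with such a setting, that would point at stale or corrupted cached data rather than at the backend lookup itself.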
Comment 11 Björn Jacke 2022-01-05 17:00:14 UTC
no feedback, and very old already, closing