Good day. I get Samba3.0.6 from samba.org site (src.rpm package), before i used Samba3.0.4 After upgrade to 3.0.6 I have trouble with LOCKING database (I Think so...) There is a listing of smbstatus on my host. Look at the pid 15913... My Linux system (via ps -A) has no such pid but smbstatus shows that process 15913 exist and has open files... (see open files because top view of smbstatus has no pid 15913... ) The process 15913 was died... but smbstatus show it and user can not get full acces to his files. Short listing of SMSTATUS: Samba version 3.0.6 PID Username Group Machine ------------------------------------------------------------------- 15916 tany domusers bu2 (10.0.0.28) 16961 alex domusers zal2 (10.0.0.17) 15904 buh3 domusers zal1 (10.0.0.12) 16903 buh2 domusers zimba (10.0.0.57) 10823 elena domusers elena (10.0.0.27) 15890 vit domusers sales (10.0.0.15) 16458 serg domusers sergey (10.0.0.26) 15972 acc1 domusers wwworker (10.0.0.56) 15907 dina domusers rc (10.0.0.25) 15909 buh4 domusers buhbuh (10.0.0.11) 10823 root root elena (10.0.0.27) Service pid machine Connected at ------------------------------------------------------- docs 15904 zal1 Mon Aug 23 09:50:00 2004 alex 16961 zal2 Mon Aug 23 16:29:32 2004 docs 16458 sergey Mon Aug 23 13:10:33 2004 tany 15916 bu2 Mon Aug 23 08:43:59 2004 base 15909 buhbuh Mon Aug 23 08:37:10 2004 tany 15916 bu2 Mon Aug 23 16:38:03 2004 ----/////////--------------------- 15909 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC5847.DBF Mon Aug 23 15:30:01 2004 15904 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC5847.DBF Mon Aug 23 09:24:27 2004 15907 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC5847.DBF Mon Aug 23 08:52:06 2004 15916 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC5847.DBF Mon Aug 23 08:45:10 2004 15909 DENY_NONE 0x2019f RDWR NONE /data/base/160404/CL. DBF Mon Aug 23 15:30:08 2004 15913 DENY_NONE 0x2019f RDWR NONE /data/base/160404/CL. DBF Mon Aug 23 10:31:43 2004 15904 DENY_NONE 0x2019f RDWR NONE /data/base/160404/CL. DBF Mon Aug 23 09:24:40 2004 15907 DENY_NONE 0x2019f RDWR NONE /data/base/160404/CL. DBF Mon Aug 23 08:52:17 2004 15916 DENY_NONE 0x2019f RDWR NONE /data/base/160404/CL. DBF Mon Aug 23 08:45:21 2004 15909 DENY_NONE 0x2019f RDWR NONE /data/base/160404/DT6704.DBF Mon Aug 23 15:30:06 2004 15904 DENY_NONE 0x2019f RDWR NONE /data/base/160404/DT6704.DBF Mon Aug 23 09:24:37 2004 15907 DENY_NONE 0x2019f RDWR NONE /data/base/160404/DT6704.DBF Mon Aug 23 08:52:14 2004 15916 DENY_NONE 0x2019f RDWR NONE /data/base/160404/DT6704.DBF Mon Aug 23 08:45:18 2004 15909 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC6065.DBF Mon Aug 23 15:30:02 2004 15904 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC6065.DBF Mon Aug 23 09:24:29 2004 15907 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC6065.DBF Mon Aug 23 08:52:07 2004 15916 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC6065.DBF Mon Aug 23 08:45:11 2004 16903 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC5771.CDX Mon Aug 23 16:38:31 2004 15909 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC5771.CDX Mon Aug 23 15:30:00 2004 15904 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC5771.CDX Mon Aug 23 09:24:26 2004 15907 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC5771.CDX Mon Aug 23 08:52:05 2004 15916 DENY_NONE 0x2019f RDWR NONE /data/base/160404/SC5771.CDX Mon Aug 23 08:45:09 2004 15909 DENY_NONE 0x2019f RDWR NONE /data/base/160404/DT25407.CDX Mon Aug 23 15:30:07 2004 15913 DENY_NONE 0x2019f RDWR NONE /data/base/160404/DT25407.CDX Mon Aug 23 10:31:40 2004 15904 DENY_NONE 0x2019f RDWR NONE /data/base/160404/DT25407.CDX Mon Aug 23 09:24:38 2004 15907 DENY_NONE 0x2019f RDWR NONE /data/base/160404/DT25407.CDX Mon Aug 23 08:52:15 2004 15916 DENY_NONE 0x2019f RDWR NONE /data/base/160404/DT25407.CDX Mon Aug 23 08:45:18 2004 15909 DENY_NONE 0x2019f RDWR NONE /data/base/160404/DT2998.DBF Mon Aug 23 15:30:06 2004 History of process 15913 [2004/08/23 11:10:51, 1] smbd/service.c:close_cnum(837) zimba (10.0.0.57) closed connection to service HpLj1100 [2004/08/23 13:09:17, 0] lib/smbldap.c:smbldap_connect_system(796) failed to bind to server with dn= cn=Manager,dc=nego,dc=ru Error: Can't contact LDAP server (unknown) [2004/08/23 13:09:17, 0] lib/smbldap.c:smbldap_search_suffix(1101) smbldap_search_suffix: Problem during the LDAP search: (unknown) (Timed out) [2004/08/23 13:09:33, 0] lib/smbldap.c:smbldap_search_suffix(1101) smbldap_search_suffix: Problem during the LDAP search: (unknown) (Timed out) [2004/08/23 13:09:33, 0] rpc_server/srv_netlog_nt.c:get_md4pw(244) get_md4pw: Workstation ZIMBA$: no account in domain [2004/08/23 13:09:48, 0] lib/smbldap.c:smbldap_search_suffix(1101) smbldap_search_suffix: Problem during the LDAP search: (unknown) (Timed out) [2004/08/23 13:09:48, 0] rpc_server/srv_netlog_nt.c:get_md4pw(244) get_md4pw: Workstation ZIMBA$: no account in domain [2004/08/23 14:55:42, 1] smbd/service.c:make_connection_snum(648) zimba (10.0.0.57) connect to service base initially as user buh2 (uid=1011, gid=513) (pid 15913) [2004/08/23 14:56:39, 1] smbd/service.c:close_cnum(837) zimba (10.0.0.57) closed connection to service base [2004/08/23 15:17:24, 1] smbd/service.c:make_connection_snum(648) zimba (10.0.0.57) connect to service base initially as user buh2 (uid=1011, gid=513) (pid 15913) [2004/08/23 15:18:16, 1] smbd/service.c:close_cnum(837) zimba (10.0.0.57) closed connection to service base [2004/08/23 15:22:00, 0] lib/smbldap.c:smbldap_connect_system(796) failed to bind to server with dn= cn=Manager,dc=nego,dc=ru Error: Can't contact LDAP server (unknown) [2004/08/23 15:22:00, 0] lib/smbldap.c:smbldap_search_suffix(1101) smbldap_search_suffix: Problem during the LDAP search: (unknown) (Timed out) [2004/08/23 15:22:00, 1] smbd/service.c:make_connection_snum(648) zimba (10.0.0.57) connect to service HpLj1100 initially as user buh2 (uid=1011, gid=513) (pid 15913) [2004/08/23 15:23:23, 1] smbd/service.c:close_cnum(837) zimba (10.0.0.57) closed connection to service HpLj1100 [2004/08/23 16:06:56, 0] lib/fault.c:fault_report(36) =============================================================== [2004/08/23 16:06:56, 0] lib/fault.c:fault_report(37) INTERNAL ERROR: Signal 11 in pid 15913 (3.0.6) Please read the appendix Bugs of the Samba HOWTO collection [2004/08/23 16:06:56, 0] lib/fault.c:fault_report(39) =============================================================== [2004/08/23 16:06:56, 0] lib/util.c:smb_panic2(1385) PANIC: internal error [2004/08/23 16:06:56, 0] lib/util.c:smb_panic2(1393) BACKTRACE: 38 stack frames: #0 smbd(smb_panic2+0x111) [0x81d4ea1] #1 smbd(smb_panic+0x1a) [0x81d4d8a] #2 smbd [0x81c2c88] #3 /lib/tls/libc.so.6 [0x192b88] #4 /usr/lib/libldap.so.2(ldap_int_open_connection+0x253) [0xf84163] #5 /usr/lib/libldap.so.2(ldap_new_connection+0x89) [0xf96019] #6 /usr/lib/libldap.so.2(ldap_open_defconn+0x41) [0xf83ae1] #7 /usr/lib/libldap.so.2(ldap_send_initial_request+0x8f) [0xf95b8f] #8 /usr/lib/libldap.so.2(ldap_sasl_bind+0x177) [0xf8bb07] #9 /usr/lib/libldap.so.2(ldap_sasl_bind_s+0x9b) [0xf8bddb] #10 /usr/lib/libldap.so.2(ldap_simple_bind_s+0x6c) [0xf8c60c] #11 smbd [0x82445bd] #12 smbd [0x824487a] #13 smbd [0x8244bcb] #14 smbd(smbldap_search+0x12a) [0x8244d8a] #15 smbd(smbldap_search_suffix+0x44) [0x8245494] #16 smbd [0x81ac81c] #17 smbd [0x81afd74] #18 smbd [0x81afe1c] #19 smbd [0x81a6d85] #20 smbd(pdb_getsampwsid+0x35) [0x81a8ac5] #21 smbd(make_server_info_guest+0x89) [0x82124c9] #22 smbd [0x8210382] #23 smbd [0x820afaf] #24 smbd [0x82140c9] #25 smbd [0x8105261] #26 smbd(ntlmssp_update+0x18b) [0x81047fb] #27 smbd(auth_ntlmssp_update+0x3d) [0x82143ed] #28 smbd [0x80ad841] #29 smbd [0x80adb14] #30 smbd(reply_sesssetup_and_X+0xe6b) [0x80aeb0b] #31 smbd [0x80d1a66] #32 smbd [0x80d1cf0] #33 smbd(process_smb+0x8c) [0x80d1efc] #34 smbd(smbd_process+0x168) [0x80d2c18] #35 smbd(main+0x4dd) [0x824796d] #36 /lib/tls/libc.so.6(__libc_start_main+0xf0) [0x1808a0] #37 smbd(ldap_msgfree+0x89) [0x80787b1] This problem with samba3.0.6... with 3.0.4 no such problem. (I am Rusian and excuse my English) Best regards. Denis Kirin
I see this bug with Samba 3.0.7 as well, here is my smbstatus output: Samba version 3.0.7 PID Username Group Machine ------------------------------------------------------------------- <snipped several entries, none obviously 31972> <snipped irrelevant tables> Pid DenyMode Access R/W Oplock Name -------------------------------------------------------------- 31972 DENY_NONE 0x3 RDWR NONE /mnt/dir2/foobar0.tgz Tue Feb 15 14:00:37 2005 31972 DENY_NONE 0x3 RDWR NONE /mnt//dir2/foobar1.tgz Tue Feb 15 13:44:14 2005 31972 DENY_NONE 0x3 RDWR NONE /mnt//dir2/foobar2.tgz Tue Feb 15 13:29:34 2005 31972 DENY_NONE 0x3 RDWR NONE /mnt//dir2/foobar3.tgz Tue Feb 15 13:58:07 2005 <snipped several other locks by this same pid> I also see the filenames when I run strings on the locking.tdb
Can this be reproduced against 3.0.11 ? If so, please reopen the bug and update the version
I think I see this problem with printer queus as well (in printers/printername.tdb) I see the queue from a windows client full of print jobs which are not active anymore. The TDB shows those entries as well.
Bug exist...
(In reply to comment #4) > Bug exist... > Bug exist in Samba 3.0.11. Smbstatus shows Pid of SMBD process whitch was crasded and killed. To repair problem I need to restart samba because I can not kill unexisten proccess. Exuse my english.
Just adding to this bugreport, as I'm seeing similar problems. I've got 2 Samba servers serving as (1) PDC and print server and (2) file server. The hardware in question is IBM HS20 bladeservers (2x2.8GHz Xeon, 4G ram, SAN storage). The OS is SLES 9 fully patched, the samba packages comes from SerNet.de. The clients are Windows 2000 PC, 3300 of them. The servers have frequent lockups, mainly at peak hours. The PDC more than the fileserver. As they are production servers one has to press "the reset button" asap to get the servers alive again. But once the PDC went down off hours and one could see that several nmdb prosesses were taking all the resources leaving the server with 100+ in load. "#killall -9 nmbd" returned the server to normal state. No logs during or after the lockup are available, as nothing is writen in the logs after lockup. The lockup comes abruptly, a top session grinds to a halt showing load of ~1.75. Logs right before the halting are filled with: rumba smbd[19007]: [2005/04/04 11:33:57, 0] tdb/tdbutil.c:tdb_log(725) rumba smbd[19007]: tdb(/var/lib/samba/locking.tdb): tdb_lock failed on list 78 ltype=1 (Resource deadlock avoided) smbd[19007]: [2005/04/04 11:33:57, 0] smbd/oplock.c:oplock_break(708) smbd[19007]: oplock_break: unable to lock share entry for file directory/batchfile.bat smbd[19007]: [2005/04/04 11:33:57, 0] smbd/oplock.c:process_local_message(451) smbd[19007]: process_local_message: oplock break failed. smbd[19007]: [2005/04/04 11:33:57, 0] tdb/tdbutil.c:tdb_log(725) smbd[19007]: tdb(/var/lib/samba/locking.tdb): tdb_lock failed on list 78 ltype=1 (Resource deadlock avoided) smbd[19007]: [2005/04/04 11:33:57, 0] tdb/tdbutil.c:tdb_log(725) smbd[19007]: tdb(/var/lib/samba/locking.tdb): tdb_lock failed on list 78 ltype=0 (Resource deadlock avoided) smbd[19007]: [2005/04/04 11:33:57, 0] tdb/tdbutil.c:tdb_log(725) smbd[19007]: tdb(/var/lib/samba/locking.tdb): tdb_unlock: count is 0 for different smbd prosesses. This behavior has been experienced with 3.0.11 and 3.0.13 versions from build by sernet.de Time for a downgrade :-)
Ok, if this is seen when nmbd is spinning, it's probably not a smbd issue but one of resource starvation when nmbd is undergoing problems. Can you attach to the spinnign nmbd with gdb and get us a backtrace ? Thanks, Jeremy.
no feedback from reporter in over a month. closing.