Config:
Samba server: a Solaris 8 system serving drive shares, authenticating against a W2K Active Directory, and hosting the OpenLDAP server for SID=>uid/gid mappings (smbd, nmbd, winbindd, slapd).
Samba clients: Solaris 8 systems authenticating against the W2K AD (nmbd, winbindd).

Today we encountered a repeatable winbindd core dump on all Samba systems. It occurs soon after every startup. Unfortunately, this behavior is exhibited on a network on which we have limited debugging capabilities. I was able to obtain a level 10 debug log with a "panic action" backtrace, but without symbol table info, on one of the Samba server systems. I'll continue trying to reproduce the problem in our development lab, but other than a gut feeling that this was caused by some configuration in the AD, we have no leads at the moment.

Here's an excerpt of the log and backtrace I have:

...
params.c:pm_process() - Processing configuration file "/SMBSVR/smb.conf"
...
idmap_init: using 'ldap' as remote backend
...
The LDAP server is succesful connected
...
Connected to LDAP server <IP>
got ldap server name dns1@<REALM>, using bind path: <bind path>
...
connecting to DNS1 from SAMBA with kerberos principal [SAMBA$@<REALM>]
Doing spnego session setup (blob length=117)
...
Doing kerberos session setup
...
000000 ds_io_g_getprimdominfo
    0000 level: 0001
create_rpc_request: opnum: 0x0 data_len: 0x1a
create_rpc_request: data_len: 1a auth_len: 0 alloc_hint: a
000000 smb_io_rpc_hdr hdr
    0000 major : 05
    0001 minor : 00
    0002 pkt_type : 00
    0003 flags : 03
    0004 pack_type0: 10
    0005 pack_type1: 00
    0006 pack_type2: 00
    0007 pack_type3: 00
    0008 frag_len : 001a
    000a auth_len : 0000
    000c call_id : 00000002
000010 smb_io_rpc_hdr_req rpc_hdr_req
    0010 alloc_hint: 0000000a
    0014 context_id: 0000
    0016 opnum : 0000
rpc_api_pip: fnum:401c
size=108
smb_com=0x25 smb_rcls=0 smb_reh=0 smb_err=0 smb_flg=8 smb_flg2=51201 smb_tid=24576 smb_pid=14373 smb_uid=40963 smb_mid=6 smb_wct=16
smb_vwv[ 0]= 0 (0x0)
smb_vwv[ 1]= 26 (0x1a)
smb_vwv[ 2]= 0 (0x0)
smb_vwv[ 3]= 4280 (0x10B8)
smb_vwv[ 4]= 0 (0x0)
smb_vwv[ 5]= 0 (0x0)
smb_vwv[ 6]= 0 (0x0)
smb_vwv[ 7]= 0 (0x0)
smb_vwv[ 8]= 0 (0x0)
smb_vwv[ 9]= 0 (0x0)
smb_vwv[10]= 82 (0x52)
smb_vwv[11]= 26 (0x1A)
smb_vwv[12]= 82 (0x52)
smb_vwv[13]= 2 (0x2)
smb_vwv[14]= 38 (0x26)
smb_vwv[15]=16412 (0x401C)
smb_bcc=41
[000] 00 5C 00 50 00 49 00 50 00 45 00 5C 00 00 00 05 .\.P.I.P .E.\....
[010] 00 00 03 10 00 00 00 1A 00 00 00 02 00 00 00 0A ........ ........
[020] 00 00 00 00 00 00 00 01 00 ........ .
write_socket(12,112)
write_socket(12,112) wrote 112
got smb length of 88
size=88
smb_com=0x25 smb_rcls=0 smb_reh=0 smb_err=0 smb_flg=136 smb_flg2=51201 smb_tid=24576 smb_pid=14373 smb_uid=40963 smb_mid=6 smb_wct=10
smb_vwv[ 0]= 0 (0x0)
smb_vwv[ 1]= 32 (0x20)
smb_vwv[ 2]= 0 (0x0)
smb_vwv[ 3]= 0 (0x0)
smb_vwv[ 4]= 56 (0x38)
smb_vwv[ 5]= 0 (0x0)
smb_vwv[ 6]= 32 (0x20)
smb_vwv[ 7]= 56 (0x38)
smb_vwv[ 8]= 0 (0x0)
smb_vwv[ 9]= 0 (0x0)
smb_bcc=33
[000] 00 05 00 02 03 10 00 00 00 20 00 00 00 02 00 00 ........ ........
[010] 00 08 00 00 00 00 00 00 00 00 00 00 00 05 00 00 ........ ........
[020] 00 .
size=88
smb_com=0x25 smb_rcls=0 smb_reh=0 smb_err=0 smb_flg=136 smb_flg2=51201 smb_tid=24576 smb_pid=14373 smb_uid=40963 smb_mid=6 smb_wct=10
smb_vwv[ 0]= 0 (0x0)
smb_vwv[ 1]= 32 (0x20)
smb_vwv[ 2]= 0 (0x0)
smb_vwv[ 3]= 0 (0x0)
smb_vwv[ 4]= 56 (0x38)
smb_vwv[ 5]= 0 (0x0)
smb_vwv[ 6]= 32 (0x20)
smb_vwv[ 7]= 56 (0x38)
smb_vwv[ 8]= 0 (0x0)
smb_vwv[ 9]= 0 (0x0)
smb_bcc=33
[000] 00 05 00 02 03 10 00 00 00 20 00 00 00 02 00 00 ........ ........
[010] 00 08 00 00 00 00 00 00 00 00 00 00 00 05 00 00 ........ ........
[020] 00 .
rpc_check_hdr: rdata->data_size = 32
000000 smb_io_rpc_hdr rpc_hdr
    0000 major : 05
    0001 minor : 00
    0002 pkt_type : 02
    0003 flags : 03
    0004 pack_type0: 10
    0005 pack_type1: 00
    0006 pack_type2: 00
    0007 pack_type3: 00
    0008 frag_len : 0020
    000a auth_len : 0000
    000c call_id : 00000002
000010 smb_io_rpc_hdr_resp rpc_hdr_resp
    0010 alloc_hint: 00000008
    0014 context_id: 0000
    0016 cancel_ct : 00
    0017 reserved : 00
rpc_api_pipe: len left: 0 smbtrans read: 32
rpc_api_pipe: fragment first and last both set
000018 ds_io_r_getprimdominfo
    0018 ptr: 00000000
    001c status: NT code 0x00000005
===============================================================
INTERNAL ERROR: Signal 11 in pid 14373 (3.0.1)
Please read the appendix Bugs of the Samba HOWTO collection
===============================================================
smb_panic(): calling panic action [/tmp/JKoutsmb/backtrace 14373]
...
#6 <signal handler called>
#7 0xff090874 in memcpy () from /usr/platform/SUNW,Sun-Fire-480R/lib/libc_psr.so.1
#8 ... in cli_ds_getprimarydominfo ()
#9 ... in cm_check_for_native_mode_win2k ()
#10 ... in free_domain_list ()
#11 ... in init_domain_list ()
#12 ... in domain_list ()
#13 ... in find_domain_from_name ()
#14 ... in rescan_trusted_domains ()
#15 ... in winbind_client_read ()
#16 ... in main ()
Created attachment 361: check pointer before calling memcpy()
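The attachment itself is not reproduced in this report. As a rough illustration of the kind of guard its title describes, here is a minimal sketch in C. The struct and function names (`struct reply`, `copy_reply_info`) are hypothetical stand-ins, not Samba's actual types or the actual patch; the assumption is only that a reply with `ptr: 00000000` leaves the payload pointer NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a decoded RPC reply; NOT Samba's real struct. */
struct reply {
    unsigned int status;        /* status code returned by the server */
    const unsigned char *info;  /* NULL when the reply carried no payload */
    size_t info_len;            /* payload length in bytes */
};

/*
 * Copy the reply payload into out only when a payload exists and fits.
 * Returns 0 on success and -1 otherwise; out is left untouched on failure.
 */
int copy_reply_info(const struct reply *r, unsigned char *out, size_t out_len)
{
    assert(out != NULL);             /* caller must supply a destination */

    if (r == NULL || r->info == NULL)
        return -1;                   /* nothing to copy: skip memcpy() */
    if (r->info_len > out_len)
        return -1;                   /* payload would overflow out */

    memcpy(out, r->info, r->info_len);
    return 0;
}
```

With a reply like the one in the log (no payload), `copy_reply_info` returns -1 instead of handing a NULL source pointer to `memcpy()`.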
Fixed for 3.0.2rc1
Any insight as to why this code was hit with the bad ptr? I ask only because after doing nothing on the system for about 4 hours, restarting winbindd was then successful. When the SEGFAULT occurred, we had been restarting repeatedly for about 3 hours, with the core dump occurring every time. I'd like to know what could have changed in our system (without our involvement) over that time.
The RPC request was returning "access denied". If the server had decided to answer with NT_STATUS_OK, then the crash would not have occurred. Maybe related to wbinfo --set-auth-user credentials that were invalid for a time?
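To make the reasoning concrete: the reply in the log carries `ptr: 00000000` together with status `0x00000005`, a value that matches the Win32 error code ERROR_ACCESS_DENIED (5). A caller that checks the status and the pointer before touching the payload avoids the crash whichever way the server answers. The sketch below uses hypothetical names (`STATUS_OK`, `STATUS_ACCESS_DENIED`, `reply_has_payload`, `describe_status`) and is not the Samba client code.

```c
#include <stdint.h>

/* Illustrative constants; the names are assumptions, not Samba macros. */
#define STATUS_OK            0x00000000u
#define STATUS_ACCESS_DENIED 0x00000005u  /* same value as Win32 ERROR_ACCESS_DENIED */

/* Map the two status values discussed in this report to readable text. */
const char *describe_status(uint32_t status)
{
    switch (status) {
    case STATUS_OK:            return "ok";
    case STATUS_ACCESS_DENIED: return "access denied";
    default:                   return "other error";
    }
}

/*
 * Return 1 only when the server reported success AND the wire pointer is
 * non-zero, meaning there is a payload the caller may safely read.
 */
int reply_has_payload(uint32_t wire_ptr, uint32_t status)
{
    return status == STATUS_OK && wire_ptr != 0;
}
```

For the reply shown in the log, `reply_has_payload(0x00000000, 0x00000005)` is 0, so the caller would skip the payload and report "access denied" rather than crash.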
Sorry for the spam; cleaning up the database to prevent unnecessary reopens of bugs.