Samba 3.2.1 on Sun Sparc Solaris 10 built with gcc 3.4.3
AD Member server using idmap_rid
attempts to access resources from client PCs result in "access denied ... the network path was not found" or "The specified network name is no longer available"
Smblog file for the client shows an internal error for an smbd process.
using smbclient to test access to the server worked once, then subsequent attempts yielded
"Receiving SMB: Server stopped responding
session setup failed: Call timed out: server did not respond after 20000 milliseconds"
I am not seeing this problem on a server running Samba 3.2.1 under Sparc Solaris 9 built with gcc 3.4.6
smb.conf, smblog, and backtrace from gdb available at http://urban.csuohio.edu/~bob/samba_3.2.1
It's dying at line 104 in smbd/session.c
#9 0x0007b5b0 in session_claim (vuser=0x65d130) at smbd/session.c:104
        sess_pid = {pid = 0}
        key = {dptr = 0xffbfdff8 "ID/1", dsize = 5}
        data = {dptr = 0x0, dsize = 0}
        i = 1
        sessionid = {uid = 0, gid = 0, username = '\0' <repeats 255 times>, hostname = '\0' <repeats 255 times>, netbios_name = '\0' <repeats 255 times>, remote_machine = '\0' <repeats 255 times>, id_str = '\0' <repeats 255 times>, id_num = 0, pid = {pid = 0}, ip_addr_str = '\0' <repeats 255 times>, connect_start = 0}
        pid = {pid = 22767}
        keystr = "ID/1", '\0' <repeats 251 times>
        hostname = 0x634a24 ""
        ctx = (struct db_context *) 0x64c178
        rec = (struct db_record *) 0x6951e8
        status = {v = 2420}
        addr = '\0' <repeats 45 times>
        __FUNCTION__ = "session_claim"
That line is quite simple:
104 rec = ctx->fetch_locked(ctx, NULL, key);
Can you reproduce it and then dump out the contents of the ctx struct pointer? I can't see what on that line might cause a panic other than a bad pointer for ctx->fetch_locked.
Jeremy.
(In reply to comment #1)
> It's dying at line 104 in smbd/session.c
> ..
> can you reproduce it and then dump out the contents of the ctx struct pointer.
> I can't see what on that line might cause a panic other than a bad pointer for
> ctx->fetch_locked.
Sorry to be so clueless. I should be able to reproduce the problem, but don't know how to dump out the contents of the pointer - so a pointer on that would be appreciated.
Thanks, Bob
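For reference, one way to dump the struct with gdb is sketched below. It assumes smbd was built with debug symbols and that either a core file or the still-running process is available. The file name dump_ctx.gdb, the smbd path and the PID are placeholders, not values from this report. Frame 9 is the session_claim frame in the backtrace quoted above and may have a different number in a new run, so check the "bt" output first.

```shell
# Write the gdb commands to a file so the same steps can be replayed.
# "frame 9" selects session_claim in the backtrace from this report;
# adjust the number if a new backtrace numbers the frames differently.
cat > dump_ctx.gdb <<'EOF'
bt
frame 9
print ctx
print *ctx
print ctx->fetch_locked
EOF

# Hypothetical invocation (path and PID/core are placeholders):
#   gdb -batch -x dump_ctx.gdb /path/to/smbd <pid-or-core-file>
```

"print *ctx" shows every member of the db_context struct, and "print ctx->fetch_locked" shows where the function pointer used on line 104 actually points.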
(In reply to comment #0)
> Samba 3.2.1 on Sun Sparc Solaris 10 built with gcc 3.4.3
>
> AD Member server using idmap_rid
Finally getting back to this project/issue. Seeing the same problem with Samba 3.2.2.
smb.conf, smblog, and backtrace from gdb available at http://urban.csuohio.edu/~bob/samba_3.2.2
Can you try recompiling with Sun Studio? Volker
(In reply to comment #4)
> Can you try recompiling with Sun Studio?
Not right away. Any particular version of Sun Studio I should try? I have a machine with Sun Studio 11,REV=2005.10.13 on it currently. I will have to make sure the other packages (openssl, openldap, etc) are installed and the same version as the ones on my test machine.
-Bob
(In reply to comment #4)
> Can you try recompiling with Sun Studio?
Built 3.2.2 with Sun Studio 11 on Solaris 10. Right now I am double-checking my install b/c when accessing the server as an AD user authentication fails. I think Samba "sees" who I am from AD b/c I have seen my real name in the error messages associated w/ my AD login name.
Rebuilt kerberos5, sasl, openldap, and Samba all with gcc 3.4.6 under Solaris 10. Still seeing the same problem with PANIC messages in the samba logs. With Sun Studio 11 under Solaris 10 on a different machine I was unable to get authentication working even this far - nothing but "NT_STATUS_LOGON_FAILURE" messages.
Can you please upload the debug level 10 log of the Sun Studio compile leading to the NT_STATUS_LOGON_FAILURE? Thanks, Volker
Machine with Samba built with Sun Studio 11 joined to AD. Log from the server attached, it and other log files at http://urban.csuohio.edu/~bob/samba_3.2.2/studio11/
Tried the following:
# wbinfo -t
checking the trust secret via RPC calls succeeded
# wbinfo -a 1001362%*********
plaintext password authentication succeeded
challenge/response password authentication succeeded
# smbclient -U 1001362 -L austin
Enter 1001362's password:
session setup failed: NT_STATUS_LOGON_FAILURE
Unable to connect from a windows client - prompted for password over and over.
Ok, now this time username 1001362 is not valid on your system. Can you create a user that does not begin with a digit? Volker
(In reply to comment #10)
> Ok, now this time username 1001362 is not valid on your system. Can you create
> a user that does not begin with a digit?
Sorry to say I cannot do that. My group does not have any control over the Active Directory server (and they don't help us out much, either) so all my user accounts will be in the form of seven digits. I have access to a test account that does not have any digits in its user name which I can try for testing.
So far I've not had any of these issues on the Solaris 9 test server running Samba 3.2.2. I wonder why, when built with Sun Studio on Solaris 10, it ends up not liking the account names, when samba fails in a different place when built with gcc.
----------
I tried to access the server using the test account "martel-test" and it looks like I am back to the earlier failure mode - I can't access the server and see a panic message in the log file.
Running smbclient -L techops -Umartel-test from the command line works ONCE, but then fails on later attempts.
If I list a directory with files owned by AD users, I see the AD users listed under user and group:
#techops# ls -l
total 4
-rwxrw-rw- 1 1001362 10002 0 May 23 12:01 may22file.txt
-rw-r--r-- 1 1001362 10002 0 May 23 12:04 may22file.txt-2
drwxr-xr-x 2 1001362 10002 512 May 23 12:01 may22folder
-rwxrw-rw- 1 1001362 10002 159 Apr 23 11:57 new text document.txt
-rw-r--r-- 1 1001362 domain users 0 Aug 13 09:37 samba_3.2.1_test.txt
Logs from this attempt are at http://urban.csuohio.edu/~bob/samba_3.2.2/studio11/try2/
Thank you.
> -rwxrw-rw- 1 1001362 10002 0 May 23 12:01 may22file.txt
> -rw-r--r-- 1 1001362 10002 0 May 23 12:04 may22file.txt-2
> drwxr-xr-x 2 1001362 10002 512 May 23 12:01 may22folder
> -rwxrw-rw- 1 1001362 10002 159 Apr 23 11:57 new text document.txt
> -rw-r--r-- 1 1001362 domain users 0 Aug 13 09:37 samba_3.2.1_test.txt
Do you use "winbind use default domain = yes"? The problem here is that it is not clear whether 1001362 is a user name or a numeric uid, and something in the NSS system is confused. You might work around this problem by removing "winbind use default domain = yes".
> Logs from this attempt found are at
> http://urban.csuohio.edu/~bob/samba_3.2.2/studio11/try2/
This looks like a pretty normal access to me. Anything that did not work here?
The line
Get_Pwnam_internals did find user [CSUNET\martel-test]!
shows that you do not use "winbind use default domain", so something in your NSS system is confused about 1001362, this should show a user name. The other log showed that the user CSUNET\10011362 (or so, the numeric one) can not be found. Is it possible that your libnss_winbind/libwbclient does not match the winbind you installed?
Volker
> > Logs from this attempt found are at
> > http://urban.csuohio.edu/~bob/samba_3.2.2/studio11/try2/
>
> This looks like a pretty normal access to me. Anything that did not work here?
Yes - I cannot access the server via Samba: No access and Samba panics
> The line
>
> Get_Pwnam_internals did find user [CSUNET\martel-test]!
>
> shows that you do not use "winbind use default domain",...Is it possible that your libnss_winbind/libwbclient does not match the
> winbind you installed?
unlikely, but I will check that.
"1001362" is the user login name...the directory listing above is actually what I expected (and wanted) to see on the AD member server. The idmap_rid gave AD user "1001362" the unix UID of 10513 - the same UID number used on my Solaris 10 and Solaris 9 test servers. The Solaris 9 server is not displaying these issues and I can access it from client PCs using AD accounts
A section from the smb.conf - I *am* using "use default domain"
...
idmap domains = CSUNET
template homedir = /home/%U
template shell = /usr/bin/bash
winbind use default domain = Yes
idmap config CSUNET:range = 10000-100000000
idmap config CSUNET:base_rid = 0
idmap config CSUNET:backend = rid
idmap config CSUNET:default = yes
What happened on this latest attempt was when I used the "martel-test" AD account to attempt access to the Samba server on the Solaris 10 box built with Sun Studio it failed - in a manner that looked about the same as the failure mode on the version of samba built with gcc on this platform: Samba panics.
From the log file smblog.137.148.92.196:
[2008/10/01 08:38:33, 3] smbd/password.c:register_existing_vuid(314)
  register_existing_vuid: User name: CSUNET\ur20-02$ Real name: UR20-02$
[2008/10/01 08:38:33, 3] smbd/password.c:register_existing_vuid(326)
  register_existing_vuid: UNIX uid 211082 is UNIX user CSUNET\ur20-02$, and will be vuid 104
[2008/10/01 08:38:33, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(100)
  Locking key 49442F3100
[2008/10/01 08:38:33, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(129)
  Allocated locked data 0x694170
[2008/10/01 08:38:33, 0] lib/fault.c:fault_report(40)
  ===============================================================
[2008/10/01 08:38:33, 0] lib/fault.c:fault_report(41)
  INTERNAL ERROR: Signal 10 in pid 23624 (3.2.2)
  Please read the Trouble-Shooting section of the Samba3-HOWTO
[2008/10/01 08:38:33, 0] lib/fault.c:fault_report(43)
  From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
[2008/10/01 08:38:33, 0] lib/fault.c:fault_report(44)
  ===============================================================
[2008/10/01 08:38:33, 0] lib/util.c:smb_panic(1663)
  PANIC (pid 23624): internal error
[2008/10/01 08:38:33, 0] lib/util.c:log_stack_trace(1817)
  unable to produce a stack trace on this platform
It looks to me that building Samba with Sun Studio under Solaris 10 added a new problem: not functioning with the all-digit AD user account names - a problem I did not see before - and the original issue with not being able to access the shares using an AD account from client machines persists.
From a client PC logged in as an AD user I can browse to the Samba server. It will show me the shares available, but as soon as I attempt to access one of those shares I receive an error message on the client and see the panic messages in the samba logs on the server.
Thank you
Bob
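The key in the "Locking key 49442F3100" log line can be decoded by hand: it is the hex form of the 5-byte key that the gdb backtrace earlier in this report shows in session_claim (key = {dptr = "ID/1", dsize = 5}), that is, the four characters of "ID/1" plus the terminating NUL. A minimal shell check, assuming a POSIX printf, od and tr (the variable name key_hex is just for illustration):

```shell
# Hex-dump the 5-byte key: the four characters of "ID/1" plus the trailing NUL.
# od prints lowercase hex separated by spaces; tr strips the spacing and
# upper-cases the digits to match the Samba log format.
key_hex=$(printf 'ID/1\0' | od -An -tx1 | tr -d ' \n' | tr 'a-f' 'A-F')
echo "$key_hex"    # prints 49442F3100
```

So the log and the backtrace agree that the process is locking the session record "ID/1" just before the fault.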
Sorry, had missed the panic. Now I'm completely lost how this line might get a SIGBUS. I need ssh access to a box that shows this behaviour to figure out more. Sorry, Volker
Please contact me via Email so I can set up this access for you. -Bob
Today I tried to install Samba 3.2.4 on a Solaris 9 machine - it was working on a test machine, so I decided to try it on a production machine that does not normally run samba. When I try to access it from a client PC I get "Network name no longer available" messages. The log file seems to indicate that Samba is stopping in the same place:
[2008/10/03 14:13:42, 3] smbd/password.c:register_existing_vuid(326)
  register_existing_vuid: UNIX uid 101888 is UNIX user CSUNET\1001362, and will be vuid 101
[2008/10/03 14:13:42, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(100)
  Locking key 49442F3100
[2008/10/03 14:13:42, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(129)
  Allocated locked data 0x6d0730
[2008/10/03 14:13:42, 0] lib/fault.c:fault_report(40)
  ===============================================================
[2008/10/03 14:13:42, 0] lib/fault.c:fault_report(41)
  INTERNAL ERROR: Signal 10 in pid 11819 (3.2.4)
  ...
  ===============================================================
[2008/10/03 14:13:42, 0] lib/util.c:smb_panic(1663)
  PANIC (pid 11819): internal error
There are some differences between the Solaris 9 box I was using for testing and this "production" box as far as installed software goes (versions of OpenSSL, for example).
Add the line :
panic action = /bin/sleep 999999
to the [global] section of your smb.conf and reproduce the panic. That should allow you to attach to the panic'ed process and get a backtrace with symbols.
Jeremy.
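A rough sketch of the attach step follows, assuming gdb is installed and smbd has debug symbols. The helper name panic_pids and the log and binary paths are placeholders made up for illustration; they do not come from this report. The helper just pulls the PID out of the "PANIC (pid N)" line that smb_panic writes to the log, as seen in the excerpts above.

```shell
# Print the PID from each "PANIC (pid N)" line in a Samba log file.
panic_pids() {
    sed -n 's/.*PANIC (pid \([0-9][0-9]*\)).*/\1/p' "$1"
}

# Hypothetical usage (log and binary paths are placeholders):
#   pid=$(panic_pids /path/to/smblog.CLIENT | tail -n 1)
#   gdb -batch -ex "bt full" /path/to/smbd "$pid"
```

With the panic action set, the panicked smbd stays alive while /bin/sleep runs, so gdb can attach to that PID and "bt full" prints the backtrace with local variables.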
Created attachment 3668: samba 3.2.4 full backtrace on Solaris 9 AD member server
At the risk of muddying the waters still further, attached is a backtrace from a Solaris 9 machine running Samba 3.2.4 which is an Active Directory member server. It seems to be failing the same way I've been seeing on my Solaris 10 server. Seeing "the specified network name is no longer available" on the client PC.
From the log file:
[2008/10/07 14:57:59, 3] smbd/password.c:register_existing_vuid(326)
  register_existing_vuid: UNIX uid 211082 is UNIX user CSUNET\ur20-02$, and will be vuid 104
[2008/10/07 14:57:59, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(100)
  Locking key 49442F3100
[2008/10/07 14:57:59, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(129)
  Allocated locked data 0x72cd18
[2008/10/07 14:57:59, 0] lib/fault.c:fault_report(40)
  ===============================================================
[2008/10/07 14:57:59, 0] lib/fault.c:fault_report(41)
  INTERNAL ERROR: Signal 10 in pid 22077 (3.2.4)
  Please read the Trouble-Shooting section of the Samba3-HOWTO
[2008/10/07 14:57:59, 0] lib/fault.c:fault_report(43)
  From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
[2008/10/07 14:57:59, 0] lib/fault.c:fault_report(44)
  ===============================================================
[2008/10/07 14:57:59, 0] lib/util.c:smb_panic(1663)
  PANIC (pid 22077): internal error
Attached backtrace is from PID 22077.
Okay, *that* backtrace is different than the others. This one I can fix. Expect a patch very soon.
Volker
Created attachment 3669: patch
Can you try the attached patch?
Thanks,
Volker
I tried the patch on both my Solaris 10 test server and the Solaris 9 server that were exhibiting problems. From the preliminary checking it looks like the patch corrected the problem I was seeing. I plan on doing some additional testing tomorrow.
From what I can see thus far Samba is operating the way I'd expect it to on both my Solaris 9 and Solaris 10 test servers. Thanks very much for the patch!
-Bob
Thanks for testing. Checked in the patch. Volker