Brief version:

    # net -U sysadmin user info root
    Password: xxxxx
    Segmentation fault

More info: Samba is set up as a PDC with an LDAP backend on OpenLDAP 2.1, using the new samba schema.

What works: Logging in and adding machines on the fly both work well.

What doesn't: the net user info command and, maybe related, the usrmgr.exe Windows tool, which complains of "Stub received bad data".

More details: So far I haven't recompiled the net tool or the libraries with -g, but this is gdb's output so far (trimmed to remove unneeded stuff, e.g. the copyright mention):

    # gdb net
    GNU gdb 2002-04-01-cvs
    <<<<<<snip>>>>>>>
    (gdb) run -U root user info root
    Starting program: /usr/bin/net -U root user info root
    (no debugging symbols found)...(no debugging symbols found)...
    <<<<snip more of these>>>>
    (no debugging symbols found)...(no debugging symbols found)...
    Password: xxxxx
    (no debugging symbols found)...
    Program received signal SIGSEGV, Segmentation fault.
    0x0806e1d6 in net_rpc_getsid ()
    (gdb)
    (gdb) bt
    #0 0x0806e1d6 in net_rpc_getsid ()
    #1 0x0806d8df in net_rap ()
    #2 0x0806e273 in net_rpc_getsid ()
    #3 0x0806ac02 in net_run_function ()
    #4 0x0806e513 in net_rpc_user ()
    #5 0x0806b20e in net_make_ipc_connection ()
    #6 0x0806ac02 in net_run_function ()
    #7 0x0806bc05 in main ()
    #8 0x400d5a51 in __libc_start_main () from /lib/libc.so.6
    (gdb) quit

Ok, so unless this is pretty obvious, you'll need debugging symbols. I'm a coder, so I'll take a gander at the gdb bt and src once I've recompiled, but for now I'm hoping someone will say: "Oi! It's this simple" or "oops, forgot one tiny thing, here's a patch"...

If desired, I'll attach my (edited for security reasons) smb.conf file, and anything else.

My suspicion, though I'm not a samba coder, is that the problem may lie in the samba server rather than the net client (though obviously there is a bug in the net client/library too, since it shouldn't crash on bad input, but that's a trivial bug compared to samba producing bad output).
If I can confirm samba is producing bad output (see above re usrmgr.exe), then I'll switch this bug report to the daemon rather than the net client.

I've set the severity to normal since, afaik, I'm the only person to experience this problem, and as much as I'd like to believe I'm the centre of the universe, my problems are not the be-all and end-all of samba *grin*.

Any help would be appreciated. You are dealing with a seasoned coder here who would like to see this problem sorted, so any requests for more info will be answered. Thanks.
Ok, recompiled net with debugging and non-stripped. Here's the gdb backtrace of the problem:

    Starting program: /usr/local/src/samba-3.0.0beta3/source/bin/net -U sysadmin user info root
    Password:
    Program received signal SIGSEGV, Segmentation fault.
    0x0806e1d6 in rpc_user_info_internals (domain_sid=0x8228b30, cli=0x8208240, mem_ctx=0x821ae30, argc=1, argv=0x817a514) at utils/net_rpc.c:740
    740 rids[i] = user_gids[i].g_rid;
    (gdb) bt
    #0 0x0806e1d6 in rpc_user_info_internals (domain_sid=0x8228b30, cli=0x8208240, mem_ctx=0x821ae30, argc=1, argv=0x817a514) at utils/net_rpc.c:740
    #1 0x0806d8df in run_rpc_command (cli_arg=0x0, pipe_idx=2, conn_flags=0, fn=0x806e080 <rpc_user_info_internals>, argc=1, argv=0x817a514) at utils/net_rpc.c:151
    #2 0x0806e273 in rpc_user_info (argc=1, argv=0x817a514) at utils/net_rpc.c:771
    #3 0x0806ac02 in net_run_function (argc=2, argv=0x817a510, table=0xbffff098, usage_fn=0x806de0c <rpc_user_usage>) at utils/net.c:131
    #4 0x0806e513 in net_rpc_user (argc=2, argv=0x817a510) at utils/net_rpc.c:878
    #5 0x0806b20e in net_user (argc=2, argv=0x817a510) at utils/net.c:315
    #6 0x0806ac02 in net_run_function (argc=3, argv=0x817a50c, table=0x8164058, usage_fn=0x806c090 <net_help>) at utils/net.c:131
    #7 0x0806bc05 in main (argc=6, argv=0xbffff3c4) at utils/net.c:681
    (gdb)

Core file is available on request, but only by direct email (due to possible security problems).
Ok, yeah, following the gdb backtrace, I've found the following.

The failure line (net_rpc.c:740):

    rids[i] = user_gids[i].g_rid;

crashes because user_gids is pointing to NULL (that'll do it). That's caused because...

    result = cli_samr_query_usergroups(cli, mem_ctx, &user_pol, &num_rids, &user_gids);

is supposed to fill user_gids, but doesn't (it leaves it as NULL). num_rids is set to 1, so that makes the loop at line 739 run. result is 0, and it appears that 0 is a successful return.

Ok, unless I'm missing something, it appears that the fault is with the RPC server rather than the client (as I initially suspected). The Samba daemon is returning success and setting num_rids to 1, but not filling user_gids in. *MAYBE* the problem is in the client-side RPC code, but I suspect that if it was, it would have been found by now.

Ok... so off to the samba internals. (btw, I think at this stage this bug may need to be reassigned to someone else... not sure who.)

Interesting... rpc_server/srv_samr_nt.c:2050, function _samr_query_usergroups:

    /* construct the response. lkclXXXX: gids are not copied! */
    init_samr_r_query_usergroups(r_u, num_groups, gids, r_u->status);

Does that comment mean anything, e.g. is it perhaps the problem? I'm not sure.

lkcl is Luke Leighton <lkcl@switchboard.net>. I'm going to email him and bring his attention to this bug report, and see if I'm on the right track...

Actually, from this point I'm lost. Need a developer to come have a look and figure out what's going on.
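For illustration, the client-side half of the problem (crashing on an inconsistent reply) could be guarded against along these lines. This is only a sketch, not the actual Samba fix: struct group_entry and copy_group_rids are hypothetical names invented here, and only the g_rid member is taken from the failing line above. The idea is to treat "non-zero count, NULL array" as an error instead of walking the array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the element type of user_gids in net_rpc.c.
 * Only the g_rid member appears in the backtrace; attr is an assumption. */
struct group_entry {
    uint32_t g_rid;
    uint32_t attr;
};

/* Copy num_rids RIDs from user_gids into rids (capacity max_rids).
 * Returns the number of RIDs copied, or -1 if the reply is inconsistent
 * (a non-zero count with no array behind it) or does not fit. */
static int copy_group_rids(const struct group_entry *user_gids,
                           uint32_t num_rids,
                           uint32_t *rids, size_t max_rids)
{
    size_t i;

    if (num_rids == 0)
        return 0;               /* nothing to copy, not an error */
    if (user_gids == NULL || rids == NULL)
        return -1;              /* the case that caused the SIGSEGV */
    if (num_rids > max_rids)
        return -1;              /* would overflow the output buffer */

    for (i = 0; i < num_rids; i++)
        rids[i] = user_gids[i].g_rid;

    return (int)num_rids;
}
```

With a check like this, a reply of num_rids == 1 and user_gids == NULL would produce an error return that the caller can report, rather than a crash at the assignment.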
Bingo. Ok, in my test compile, I changed the backend from ldap to smbpasswd, and it works. So the problem has been isolated to something to do with the ldap backend. (That explains why not everyone has this problem.)

If you need any help setting up an ldap backend, I have done it, so I can give LDIFs of the ou=People tree. Just to note: I'm using ldap for samba and for general authentication (i.e. nothing is stored in /etc/passwd or /etc/samba/smbpasswd).

So who's in charge of the ldap backend?

This could be a configuration problem. I never did anything about the groups in ldap (there are normal posix groups, but I never did anything about samba groups). I couldn't find any consistent information on what I'm supposed to do. Do I attach groups, etc.?

Even if this is a configuration problem, it still indicates a bug in the code somewhere (programs shouldn't segv), though not such a high-priority one.

I'll go look through as much documentation as I can find and see if there is anything I missed. In the meantime, does anyone have any other ideas?
Ok, found some more info, finally. Removed the ldap group suffix entry in the smb.conf file, and that seemed to clear up some problems... I think. Things start to work better.

Interesting though, `net user info root` now just outputs nothing. (No apparent errors in the logs/debug output though.)

Confirmed, Windows's usrmgr.exe now works.

Ok, in summary (and I'll just point out, without even a developer commenting on any of this), there are a few bugs:

1) `net user info` SEGVs if samba gives out rubbish.
2) Samba gives out rubbish if incorrectly configured (not a bug, kinda).
3) Samba does not seem to use the "ldap group suffix" entry correctly (when set to ou=Group it gives out rubbish; when set to nothing, it works).
4) Documentation on this is very hard to find (or nonexistent).

If anyone wants any test cases, LDIFs from my ldap server, copies of the smb.conf file I used for testing, cores from crashes, etc., please feel free to email me. Otherwise, let this bug report serve as a warning to other users regarding groups. (Also, imho, a good discussion on how to diagnose and report a bug beyond "It doesn't work".)
checked in 3.0.2a and this is ok
Originally reported against 3.0.0beta3. Cleaning out non-production release versions.
Sorry for the spam, cleaning up the database to prevent unnecessary reopens of bugs.
database cleanup