Bug 5305 - 3.2.0-pre2 won't join ad
Status: RESOLVED FIXED
Product: Samba 3.2
Classification: Unclassified
Component: Client tools
Version: 3.2.0
Hardware: x86 Solaris
Importance: P3 normal
Target Milestone: ---
Assigned To: Jeremy Allison
QA Contact: Samba QA Contact
Depends on:
Blocks:
Reported: 2008-03-05 17:23 UTC by mchugh19@yahoo.com
Modified: 2008-04-01 12:45 UTC (History)

Attachments
debug 10 logs (14.21 KB, application/x-gzip)
2008-03-05 17:24 UTC, mchugh19@yahoo.com
wireshark capture (11.89 KB, application/octet-stream)
2008-03-06 09:11 UTC, mchugh19@yahoo.com
smb.conf used (925 bytes, text/plain)
2008-03-06 17:51 UTC, mchugh19@yahoo.com
requested net output (29.23 KB, text/plain)
2008-03-06 17:54 UTC, mchugh19@yahoo.com
output of running net command from new 3.2 build (28.85 KB, text/plain)
2008-03-07 09:31 UTC, mchugh19@yahoo.com
Potential fix for the crash (446 bytes, patch)
2008-03-14 17:30 UTC, Volker Lendecke
/usr/local/samba/bin/net ads join -U mmchugh -d 10 2> /tmp/samba.out (17.99 KB, application/gzip)
2008-03-16 15:19 UTC, mchugh19@yahoo.com
Output of /usr/local/samba/bin/net ads join -U mmchugh -d 10 -v 2> /tmp/samba.out (18.58 KB, application/x-gzip)
2008-03-18 09:06 UTC, mchugh19@yahoo.com
Wireshark capture of previous net command (4.33 KB, application/octet-stream)
2008-03-18 09:07 UTC, mchugh19@yahoo.com

Description mchugh19@yahoo.com 2008-03-05 17:23:34 UTC
I seem to be unable to join an active directory domain.


net ads join -U mmchugh
Enter mmchugh's password:
Failed to join domain: failed to set machine spn: Constraint violation
 net rpc join -S students.froot.nau.edu -U mmchugh
Enter mmchugh's password:
[2008/03/06 05:02:47,  0] utils/net_rpc_join.c:net_rpc_join_newstyle(393)
  Error in domain join verification (credential setup failed): 
NT_STATUS_INVALID_COMPUTER_NAME

Unable to join domain NAU-STUDENTS.
 
This is on a machine that was previously joined with 3.0.28
Comment 1 mchugh19@yahoo.com 2008-03-05 17:24:07 UTC
Created attachment 3159 [details]
debug 10 logs
Comment 2 Volker Lendecke 2008-03-05 23:42:05 UTC
log.nmbd and log.smbd don't really help in this case. What we need is the output of the net command failing to join and a network trace of it. For the network trace look at http://wiki.samba.org/index.php/Capture_Packets.

Thanks,

Volker
Comment 3 mchugh19@yahoo.com 2008-03-06 09:11:08 UTC
Created attachment 3162 [details]
wireshark capture

wireshark capture of
root@egr214-01:/usr/local/src/samba-3.2.0pre2/source$ net ads join -U mmchugh
Enter mmchugh's password:
Failed to join domain: failed to set machine spn: Constraint violation
root@egr214-01:/usr/local/src/samba-3.2.0pre2/source$ net rpc join -S students.froot.nau.edu -U mmchugh
Enter mmchugh's password:
[2008/03/06 21:39:50,  0] utils/net_rpc_join.c:net_rpc_join_newstyle(393)
  Error in domain join verification (credential setup failed): NT_STATUS_INVALID_COMPUTER_NAME

Unable to join domain NAU-STUDENTS.
Comment 4 Volker Lendecke 2008-03-06 10:25:21 UTC
For some reason only the incoming packets have been captured. Is it possible that your outgoing traffic uses a different interface than the incoming?

Volker
Comment 5 Gerald (Jerry) Carter 2008-03-06 14:56:19 UTC
In addition to Volker's request for a complete network trace (of both sides
of the communication), please attach your complete smb.conf file.  The log files
raise some questions about whether all parameters have been set correctly.
And include the output of 'net ads join -U Administrator -d 10'.  Thanks.
Comment 6 mchugh19@yahoo.com 2008-03-06 17:51:13 UTC
Created attachment 3164 [details]
smb.conf used

I'm afraid I don't have much of an answer as to why you would only have half the traffic. The command used was "tshark -p -w net-fail.log port 445 or port 139" I've just managed to get git installed in solaris, so I'll try pulling the samba-3.2 branch to grab those few compile updates and try it again.
Comment 7 Gerald (Jerry) Carter 2008-03-06 17:53:51 UTC
What does `hostname` return on your system?  You currently cannot join
AD using a NetBIOS name different from your hostname.  That's a bug
I know how to fix, but it will take a bit of work to implement, I think.

Comment 8 mchugh19@yahoo.com 2008-03-06 17:54:55 UTC
Created attachment 3165 [details]
requested net output

Output from running "net ads join -U mmchugh -d 10"
Comment 9 mchugh19@yahoo.com 2008-03-06 17:56:06 UTC
Hostname returns:
root@egr214-01:/usr/local/src/samba-3.2.0pre2/source$ hostname
egr214-01

The account exists in ad with the same name. I had compiled 3.0.28 on this machine with the same smb.conf and it worked then. 
Comment 10 mchugh19@yahoo.com 2008-03-07 09:31:51 UTC
Created attachment 3169 [details]
output of running net command from new 3.2 build

Downloaded git source with: git-clone git://git.samba.org/samba.git samba-3.2

Configured with:
./configure --prefix=/usr/local/samba --disable-shared-libs --with-shared-modules=rfc2307,nss_rfc2307 --with-winbind --with-acl-support --with-pam --with-krb5=/opt/csw/bin --with-ads --disable-pie

The output looks similar.
Comment 11 mchugh19@yahoo.com 2008-03-07 22:24:12 UTC
Is there anything else I can provide to assist?
Comment 12 Volker Lendecke 2008-03-08 03:34:19 UTC
Just looked at the output -- it segfaults, that's different :-). Can you recompile with -g, run the join under valgrind and post the output?

Volker
Comment 13 mchugh19@yahoo.com 2008-03-08 08:46:25 UTC
I'm running on solaris, which I don't believe is supported for valgrind. Are there any alternatives?
Comment 14 Volker Lendecke 2008-03-08 13:34:03 UTC
Run it under gdb and, when the segfault happens, do a backtrace.

Volker
Comment 15 mchugh19@yahoo.com 2008-03-09 21:47:25 UTC
Maybe I'm just really bad at this, but I've recompiled with '-g' and there is no trace:

root@egr214-01:/usr/local/src/samba-3.2/source$ gdb --args /usr/local/samba/bin/net ads join
GNU gdb 6.6
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-pc-solaris2.8"...
(gdb) run
Starting program: /usr/local/samba/bin/net ads join -U mmchugh
warning: Temporarily disabling breakpoints for unloaded shared library "/usr/lib/ld.so.1"
warning: Lowest section in /usr/lib/libthread.so.1 is .dynamic at 00000074
Enter mmchugh's password:
Failed to join domain: failed to set machine spn: Constraint violation

Program exited with code 0377.
(gdb) bt
No stack.
(gdb)   



any ideas?
Comment 16 Volker Lendecke 2008-03-10 03:48:33 UTC
That's really strange. In the logfile the last lines are

  [010] 02 00 00 00                                       ....
[2008/03/07 22:27:39,  5] rpc_parse/parse_prs.c:prs_debug(88)
Segmentation Fault (core dumped)

indicating a crash. Under gdb it does not crash. Are you 100% certain that you're running the same binaries when running with and without gdb?

Volker
Comment 17 mchugh19@yahoo.com 2008-03-10 09:21:44 UTC
Hmmm. Running it with -d 10 does cause the segfault. Here's the last bit of the output and the backtrace.

[2008/03/10 21:19:49,  5] rpc_client/cli_pipe.c:valid_pipe_name(1648)
  Bind Abstract Syntax: [000] 78 57 34 12 34 12 CD AB  EF 00 01 23 45 67 89 AB  xW4.4... ...#Eg..
  [010] 00 00 00 00                                       ....
[2008/03/10 21:19:49,  5] rpc_client/cli_pipe.c:valid_pipe_name(1651)
  Bind Transfer Syntax: [000] 04 5D 88 8A EB 1C C9 11  9F E8 08 00 2B 10 48 60  .]...... ....+.H`
  [010] 02 00 00 00                                       ....
[2008/03/10 21:19:49,  5] rpc_parse/parse_prs.c:prs_debug(88)

Program received signal SIGSEGV, Segmentation fault.
format_debug_text (msg=0x0) at lib/debug.c:909
909             for( i = 0; msg[i]; i++ ) {
(gdb) bt
#0  format_debug_text (msg=0x0) at lib/debug.c:909
#1  0x081727f8 in dbgtext (format_str=0x8398bdf "%*s") at lib/debug.c:1070
#2  0x081876d2 in tab_depth (level=0, depth=1) at lib/util.c:2186
#3  0x081006b0 in prs_debug (ps=0x1, depth=0, desc=0x839af21 "hdr", fn_name=0x8393358 "smb_io_rpc
#4  0x081189f7 in smb_io_rpc_hdr (desc=0x839af21 "hdr", rpc=0x8045e1c, ps=0x8045f44, depth=0) at
#5  0x081b6363 in create_bind_or_alt_ctx_internal (pkt_type=RPC_BIND, rpc_out=0x8045f44, rpc_call
    transfer=0x8045f60, phdr_auth=0x8045eb4, pauth_info=0x8045e98) at rpc_client/cli_pipe.c:1130
#6  0x081b67ce in create_rpc_bind_req (cli=0x846fd18, rpc_out=0x8045f44, rpc_call_id=1, abstract=
    auth_type=PIPE_AUTH_TYPE_NONE, auth_level=PIPE_AUTH_LEVEL_NONE) at rpc_client/cli_pipe.c:1229
#7  0x081b83fb in rpc_pipe_bind (cli=0x846fd18, auth_type=PIPE_AUTH_TYPE_NONE, auth_level=PIPE_AU
    at rpc_client/cli_pipe.c:2057
#8  0x081b8cba in cli_rpc_pipe_open_noauth (cli=0x844ea58, pipe_idx=0, perr=0x8046918) at rpc_cli
#9  0x0837221b in libnet_join_joindomain_rpc (mem_ctx=0x844b010, r=0x844b4c8) at libnet/libnet_jo
#10 0x083736a5 in libnet_DomainJoin (mem_ctx=0x844b010, r=0x844b4c8) at libnet/libnet_join.c:1500
#11 0x08373844 in libnet_Join (mem_ctx=0x844b010, r=0x844b4c8) at libnet/libnet_join.c:1545
#12 0x0809c7d6 in net_ads_join (argc=0, argv=0x8444654) at utils/net_ads.c:1186
#13 0x08097cc4 in net_run_function (argc=1, argv=0x8444650, table=0x8046b30, usage_fn=0x8099a80 <
#14 0x0809e99b in net_ads (argc=1, argv=0x8444650) at utils/net_ads.c:2199
#15 0x08097cc4 in net_run_function (argc=2, argv=0x844464c, table=0x84363f0, usage_fn=0x809ef10 <
#16 0x080999c8 in main (argc=7, argv=0x8047028) at utils/net.c:1168
(gdb)
Comment 18 Volker Lendecke 2008-03-10 09:42:22 UTC
Jeremy, this looks like yours :-)

Volker
Comment 19 Jeremy Allison 2008-03-10 10:04:04 UTC
Ah, looks like solaris libc doesn't support the "%*s" syntax in printf. Bugger.
Bloody Sun and their medieval libraries. Removing statics did it :-(. I'll look into a portable construct, but will probably mean more malloc :-(.
Jeremy.
Comment 20 Volker Lendecke 2008-03-14 17:30:01 UTC
Created attachment 3173 [details]
Potential fix for the crash

Can you try the attached patch and re-run the debug level 10 log? This is just a wild guess what might cause the crash.

Thanks,

Volker
Comment 21 Jeremy Allison 2008-03-14 17:43:17 UTC
Volker, if this doesn't fix it, the "correct" fix will have to be a configure test that finds broken libcs that don't support the "%*s" format string, plus a for loop to emit the right number of ' ' characters on those systems. It'll be horribly slow, but tab_depth() is only called at debug level 5 or greater.

Jeremy.

Comment 22 Volker Lendecke 2008-03-15 03:51:00 UTC
I know, but the crash I saw on my Solaris box was different: Our replacement of asprintf returns NULL in the target string when the returned string size is 0. Not good, I think it should at least allocate the trailing 0, at least format_debug_msg and quite a bit of other code depends on this. The reported crash might also be it, at least it is a very similar call stack.

Volker
Comment 23 mchugh19@yahoo.com 2008-03-16 15:18:06 UTC
Looks like the patch solved the segfault. I'll attach the debug output of the failing net command.
Comment 24 mchugh19@yahoo.com 2008-03-16 15:19:33 UTC
Created attachment 3176 [details]
/usr/local/samba/bin/net ads join -U mmchugh -d 10 2> /tmp/samba.out

Debug output from samba 3.2-pre2's net command. Domain join fails, where 3.0.28 worked.
Comment 25 Volker Lendecke 2008-03-18 03:30:25 UTC
Günther, to me it looks as if at least libnet_join_set_error_string() does not work as expected, otherwise we should get a bit more error information....

Volker
Comment 26 Guenther Deschner 2008-03-18 05:26:04 UTC
Ok, thanks for the info. But we still need more :) Can you re-run the net command with the "-v" switch added? And please provide a network sniff if possible.

Thanks
Comment 27 mchugh19@yahoo.com 2008-03-18 09:06:21 UTC
Created attachment 3188 [details]
Output of /usr/local/samba/bin/net ads join -U mmchugh -d 10 -v 2> /tmp/samba.out
Comment 28 mchugh19@yahoo.com 2008-03-18 09:07:10 UTC
Created attachment 3189 [details]
Wireshark capture of previous net command

Let me know if you need anything else
Comment 29 Guenther Deschner 2008-03-18 16:11:41 UTC
Are you using a non-OpenLDAP LDAP library ?
Comment 30 Guenther Deschner 2008-03-18 17:58:21 UTC
By the way, the core of your problem is that your host (egr214-01) does not resolve to a fully qualified domain name locally.
Comment 31 mchugh19@yahoo.com 2008-03-19 09:03:57 UTC
Samba will not compile against the sun ldap libraries, so I've had to install the openldap bits. So yes, it should be compiled against openldap.

I find the hostname solution confusing. This process worked fine with previous samba versions before 3.2. Furthermore, /etc/hosts contains
<ip addr>    egr214-01 egr214-01.egr.nau.edu

and it still does not work.

As an additional note, I must still remove all instances of "-z text" from the Makefile after the configure to get samba to build on solaris. 
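
On the name-resolution point, one way to see what the local resolver reports as the canonical name for this host is a small test program like the one below. It is a sketch only: it does not claim to reproduce how Samba determines the name, and looks_fully_qualified() is a hypothetical helper using a simple "contains an inner dot" heuristic. In the hosts file format the first name after the address is the official name and later names are aliases, so an entry that lists the short name first may be worth checking; whether that is the cause here is not established. On Solaris the program would probably need to be linked with -lsocket -lnsl.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical heuristic (not Samba code): a name counts as fully
 * qualified if it contains a dot that is neither its first nor its
 * last character.
 */
static int looks_fully_qualified(const char *name)
{
    const char *dot = strchr(name, '.');

    return dot != NULL && dot != name && dot[1] != '\0';
}

int main(void)
{
    char host[256];
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    assert(looks_fully_qualified("egr214-01.egr.nau.edu"));
    assert(!looks_fully_qualified("egr214-01"));

    if (gethostname(host, sizeof(host)) != 0) {
        perror("gethostname");
        return 0; /* report only; this is a diagnostic, not a check */
    }
    host[sizeof(host) - 1] = '\0';
    printf("hostname:  %s\n", host);

    /* Ask the resolver for the canonical name of our own hostname. */
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_flags = AI_CANONNAME;

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0) {
        printf("getaddrinfo failed: %s\n", gai_strerror(rc));
        return 0;
    }
    if (res->ai_canonname != NULL) {
        printf("canonical: %s (%s)\n", res->ai_canonname,
               looks_fully_qualified(res->ai_canonname)
                   ? "looks fully qualified" : "short name only");
    } else {
        printf("canonical: (none returned)\n");
    }
    freeaddrinfo(res);
    return 0;
}
```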
Comment 32 mchugh19@yahoo.com 2008-03-24 10:22:05 UTC
Is there anything else I can provide to debug this problem? I am still unable to join to active directory with net ads or net rpc. 
Comment 33 Guenther Deschner 2008-03-31 05:46:08 UTC
(In reply to comment #32)
> Is there anything else I can provide to debug this problem? I am still unable
> to join to active directory with net ads or net rpc. 

Yes, please post a full trace (the one you uploaded just contained SMB traffic, the "net ads join" failure results from an LDAP error though).

Thanks
Comment 34 mchugh19@yahoo.com 2008-03-31 18:29:44 UTC
After running a git-fetch for the samba-3.2 checkout, this problem now appears solved. However, we are running two domains, and I am unable to perform lookups on the one I am not joined to. Does this require a new bugzilla entry?
Comment 35 Guenther Deschner 2008-04-01 03:21:18 UTC
(In reply to comment #34)
> After running a git-fetch for the samba-3.2 checkout, this problem now appears
> solved.

Great.

> However, we are running two domains, and I am unable to perform lookups
> on the one I am not joined to. Does this require a new bugzilla entry?

Yes. Please open a new one on that.

Thanks for the report.
Comment 36 David S. Collier-Brown 2008-04-01 12:45:31 UTC
[Try number 2]

I've used %*s for padding and truncation since
SunOS 3, and it works in test as follows:

  char s[] = "text";
  (void) printf("x='%*s'\n", 10, s);
  
which produces

   x='      text'

Are you sure there wasn't a typo in the
original? The following segfaults,
as one would expect (;-))

    char s = "text";
   (void) printf("x='%*s'\n", 10, s);