A while back I set up a Linux box (SUSE 9.2) to authenticate (using kerberos) against a w2k3 AD domain. A nice side effect of this was that I could use "smbclient -k" and save typing in my password again.

The other day, I found that "smbclient -k" no longer worked. Basic kerberos login was still fine (i.e. kinit worked, PAM kerberos integration still good).

Investigating further, I went over to a fresh SuSE 10.1 installation and upgraded it to the latest Samba release (3.0.23d). I then followed the steps in the main HOWTO. Still no dice - this is what happens:

    xx@xxx:~/xxxxx> smbclient -k -d 4 //asl4/xxxxx
    lp_load: refreshing parameters
    Initialising global parameters
    params.c:pm_process() - Processing configuration file "/etc/samba/smb.conf"
    Processing section "[global]"
    doing parameter workgroup = ASL-LAN
    doing parameter printing = cups
    doing parameter printcap name = cups
    doing parameter printcap cache time = 750
    doing parameter cups options = raw
    doing parameter map to guest = Bad User
    doing parameter include = /etc/samba/dhcp.conf
    params.c:pm_process() - Processing configuration file "/etc/samba/dhcp.conf"
    doing parameter wins server = eth0:192.168.102.12 eth0:192.168.202.5
    doing parameter logon path = \\%L\profiles\.msprofile
    doing parameter logon home = \\%L\%U\.9xprofile
    doing parameter logon drive = P:
    doing parameter usershare allow guests = Yes
    doing parameter client use spnego = yes
    doing parameter password server = asl4.asl.lan
    doing parameter realm = ASL.LAN
    doing parameter security = ADS
    pm_process() returned Yes
    added interface ip=192.168.102.91 bcast=192.168.102.255 nmask=255.255.255.0
    Client started (version 3.0.23d-5.1.39-1084-SUSE-CODE10).
    resolve_lmhosts: Attempting lmhosts lookup for name asl4<0x20>
    getlmhostsent: lmhost entry: 127.0.0.1 localhost
    resolve_wins: Attempting wins lookup for name asl4<0x20>
    wins_srv_is_dead: 192.168.102.12 is alive
    wins_srv_is_dead: 192.168.102.12 is alive
    resolve_wins: using WINS server 192.168.102.12 and tag 'eth0'
    nmb packet from 192.168.102.12(137) header: id=18191 opcode=Query(0) response=Yes
    header: flags: bcast=No rec_avail=Yes rec_des=Yes trunc=No auth=Yes
    header: rcode=0 qdcount=0 ancount=1 nscount=0 arcount=0
    answers: nmb_name=ASL4<20> rr_type=32 rr_class=1 ttl=0
    answers 0 char `...f. hex 6000C0A8660C
    Got a positive name query response from 192.168.102.12 ( 192.168.102.12 )
    Connecting to 192.168.102.12 at port 445
    session request ok
    Doing spnego session setup (blob length=101)
    got OID=1 2 840 48018 1 2 2
    got OID=1 2 840 113554 1 2 2
    got OID=1 2 840 113554 1 2 2 3
    got OID=1 3 6 1 4 1 311 2 2 10
    got principal=asl4$@ASL.LAN
    Doing kerberos session setup
    ads_cleanup_expired_creds: Ticket in ccache[FILE:/tmp/krb5cc_1001] expiration Mon, 11 Dec 2006 21:17:50 GMT
    read_socket_with_timeout: timeout read. read error = Connection reset by peer.
    SPNEGO login failed: NT_STATUS_INVALID_NETWORK_RESPONSE
    session setup failed: Read error: Connection reset by peer

In essence, the server "asl4" (which is the w2k3 server) appears to close the connection and kick me off. However, it has granted me a ticket - as shown by klist:

    Ticket cache: FILE:/tmp/krb5cc_1001
    Default principal: xx@ASL.LAN

    Valid starting     Expires            Service principal
    12/11/06 11:19:15  12/11/06 21:17:50  krbtgt/ASL.LAN@ASL.LAN
            renew until 12/12/06 11:19:15
    12/11/06 11:19:08  12/11/06 21:17:50  asl4$@ASL.LAN
            renew until 12/12/06 11:19:15

Using smbclient in the traditional way (supplying a username and password) works perfectly. I assume that some recent win2k3 patch or update has changed things, because I used to have a working system - but I haven't seen anyone else posting a similar problem.
Attempting to add the machine to the domain with "net ads join" also fails with the same symptoms - the server closes the connection just after "Doing kerberos session setup".

I'm very happy to run further tests, gather more information, etc. - I just need a pointer as to where to look next!
Some more information: running smbclient with a higher debug level yields the following:

    Got KRB5 session key of length 16
    Mandatory SMB signing enabled!
    SMB signing enabled!
    cli_simple_set_signing: user_session_key
    [000] C6 33 40 99 6C 5C 58 95 B5 E9 80 F6 27 17 D1 B0 .3@.l\X. ....'...
    cli_simple_set_signing: NULL response_data
    simple_packet_signature: sequence number 0
    client_sign_outgoing_message: sent SMB signature of
    [000] 4C 4C F6 1C 70 FA 84 92 LL..p...
    store_sequence_for_reply: stored seq = 1 mid = 2
    write_socket(6,16958)
    write_socket(6,16958) wrote 16958
    read_socket_with_timeout: timeout read. read error = Connection reset by peer.
    receive_smb_raw: length < 0!
    client_receive_smb failed

It strikes me as unusual that the call to write_socket is writing 16958 bytes of data. In other log files I found by Googling, this write is usually about 1/10th the size. So we have a very large write on the socket, followed by the windows server closing the connection. Perhaps there is a link?
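To compare write sizes across several debug logs without reading them by eye, a small script can pull the byte counts out of the "write_socket(...) wrote N" lines. This is only a sketch: the function name large_writes, the 4096-byte threshold, and the 182-byte sample line are hypothetical and chosen for illustration; only the 16958-byte line comes from the log above.

```python
import re

# Matches completed writes such as "write_socket(6,16958) wrote 16958".
# Lines without the trailing "wrote N" part are deliberately ignored.
WROTE_RE = re.compile(r"write_socket\((\d+),(\d+)\) wrote (\d+)")


def large_writes(log_text, threshold=4096):
    """Return the byte counts of completed writes larger than `threshold`.

    The threshold is an arbitrary illustrative value, not a protocol limit.
    """
    sizes = []
    for match in WROTE_RE.finditer(log_text):
        wrote = int(match.group(3))
        if wrote > threshold:
            sizes.append(wrote)
    return sizes


# The first two lines are from the log above; the last one is made up
# to stand in for a "normal" small write.
sample = """write_socket(6,16958)
write_socket(6,16958) wrote 16958
write_socket(6,182) wrote 182
"""

print(large_writes(sample))
```

Running the same function over a log from a user with few group memberships would show whether the session setup write is consistently smaller there.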
An additional consequence of this situation is that the libsmbclient setting to "fallback to NTLM if kerberos fails" doesn't work, since the failure of the krb authentication causes the connection to fail, and the library code assumes (not unreasonably) that the TCP connection is still up if krb5 authentication hasn't succeeded. This has the knock-on effect of breaking the smb:// kio-slave in recent KDEs.
We have found (I think) the underlying issue.

My user account on the AD server is a member of a large number of groups. This makes the token size very large, and it gets fragmented (I suspected something like this in my comment #1).

We worked this out because a similar issue surfaced server-side, which we were able to fix by setting "max xmit = 65535" in smb.conf. However, libsmbclient does not look at this setting.

Would it be possible to either (a) make libsmbclient honour "max xmit" or (b) create a new "client max xmit" option in smb.conf?
You are talking about using the smbclient tool, but then you reference libsmbclient. The smbclient tool does not use libsmbclient. Please confirm that you are seeing this issue with the smbclient tool and not with an application which links with libsmbclient. If so, please change the "component" in this report to "Client Tools" (above) and "Reassign bug to default assignee" (below) so that this report is redirected to the correct people.
I am seeing the issue both with smbclient *and* libsmbclient (more precisely, the smb:// protocol in Konqueror which uses libsmbclient) and presume that they are both doing the same thing (fragmenting the outgoing packet.) Should I split this into two bugs?
No, don't bother with a separate bug. I have a few libsmbclient issues (now including this one) to address and expect to get to them soon (this week, if I'm lucky). The smbclient tool issue will likely be handled by someone else, so when I'm finished with the libsmbclient changes, I'll pass it off rather than closing the bug.
It looks like libsmbclient is already using a 128K buffer for reading, so setting "max xmit" to 64k, even if it were used by libsmbclient, would not solve the problem. I suspect, however, that a different bug I just fixed may be responsible for this problem. I see that the read_socket_with_timeout is returning a "connection reset by peer" error. It is possible that this occurred due to libsmbclient improperly sending a netbios keepalive packet which causes the server to shut down the connection. We know that Vista shuts down the connection upon receiving this packet. Older versions appear to just ignore it. I don't know what W2k3 does with it. Please test latest svn and let me know if anything is different. Unfortunately, I don't have an environment set up to be able to properly locate this problem. :-( Derrell
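One way to test the keepalive theory against a packet capture is to walk the client-to-server byte stream frame by frame and look for a NetBIOS session keep-alive. The sketch below assumes the RFC 1002 session service framing (1-byte type, 1-byte flags whose low bit extends the length, 2-byte length) and the keep-alive type 0x85 from that RFC. The helper names and the sample bytes are hypothetical, and extracting the TCP payload from the capture is not shown.

```python
import struct

NBSS_KEEPALIVE = 0x85  # session keep-alive packet type in RFC 1002


def iter_frames(stream):
    """Yield (type, payload) for each complete frame in a byte string."""
    offset = 0
    while offset + 4 <= len(stream):
        msg_type, flags, length = struct.unpack_from(">BBH", stream, offset)
        # The low bit of the flags byte is the 17th bit of the length.
        length |= (flags & 0x01) << 16
        end = offset + 4 + length
        if end > len(stream):
            break  # truncated trailing frame, stop here
        yield msg_type, stream[offset + 4:end]
        offset = end


def keepalive_positions(stream):
    """Return the frame indexes at which a keep-alive appears."""
    return [index
            for index, (msg_type, _) in enumerate(iter_frames(stream))
            if msg_type == NBSS_KEEPALIVE]


# Made-up stream: a 3-byte message, a keep-alive, then a 1-byte message.
sample = b"\x00\x00\x00\x03abc" + b"\x85\x00\x00\x00" + b"\x00\x00\x00\x01z"
print(keepalive_positions(sample))
```

If no keep-alive shows up before the reset in the failing capture, that would point away from this explanation.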
(In reply to comment #7)
> It looks like libsmbclient is already using a 128K buffer for reading, so
> setting "max xmit" to 64k, even if it were used by libsmbclient, would not
> solve the problem.

Do you mean "128K buffer for _writing_"? The problem isn't the read buffer. I suspect it's the write buffer.

> I suspect, however, that a different bug I just fixed may be responsible for
> this problem. I see that the read_socket_with_timeout is returning a
> "connection reset by peer" error. It is possible that this occurred due to
> libsmbclient improperly sending a netbios keepalive packet which causes the
> server to shut down the connection.

I doubt it, because other users here who are members of fewer groups (and thus need to send a smaller token) don't experience the problem. Unless you only send the netbios keepalive packet after big writes?
Looking at the 3.0.24 SVN source code I notice that the routine that sends the packet that results in the server disconnection is cli_session_setup_blob_send(). This routine, unlike some of the other cli_*_send routines (e.g. cli_list_new()) , does not check against the cli->max_xmit value that has been previously set up in the session negotiation. In other words, possibly the win2k3 server has already told us "don't send packets bigger than X" and we haven't obeyed the rules in this instance because normally the packet sent by cli_session_setup_blob_send() is nowhere near the typical maximum xmit.
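The idea of respecting the negotiated limit can be illustrated by splitting the security blob into pieces that each fit in one request. This is a sketch of the general technique only, not Samba's code and not necessarily how the issue was eventually fixed: split_blob, the 128-byte header_overhead, the 4356-byte max_xmit, and the 16000-byte blob are all hypothetical values. Whether and how the server accepts a blob spread over several session setup requests is not covered here.

```python
def split_blob(blob, max_xmit, header_overhead=128):
    """Split `blob` into chunks that each fit in a request of `max_xmit` bytes.

    `header_overhead` is a made-up allowance for the SMB header and the
    fixed request fields that share the packet with the blob.
    """
    max_chunk = max_xmit - header_overhead
    if max_chunk <= 0:
        raise ValueError("max_xmit is too small for the header overhead")
    return [blob[i:i + max_chunk] for i in range(0, len(blob), max_chunk)]


# Hypothetical numbers: a large Kerberos blob and a small negotiated limit.
blob = bytes(16000)
chunks = split_blob(blob, max_xmit=4356)
print(len(chunks), [len(chunk) for chunk in chunks])
```

With a check like this in the send path, a client would never put more than the negotiated size on the wire in a single request, regardless of how large the ticket grows.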
Would you please provide a packet capture of the problem with

    tcpdump -s 0 -w capture.pcap

That should help isolate the source of the problem.
Created attachment 2267
tcpdump captured with tcpdump -s 0 -w dump.dmp

Here you are. This was taken while trying an "smbclient -k" (with a valid kerberos ticket).
Jeremy, Jerry: I'm in over my head here. Does the attached packet capture help to discover this problem? If you can figure out what the problem is with smbclient, and it's something that needs to be set by the client software, I can then make a similar change in libsmbclient. Thanks for your help. Derrell
(In reply to comment #3)
> We have found (I think) the underlying issue.
>
> My user account on the AD server is a member of a large number of groups.

I have checked with other people logging into the same system, and this is now confirmed:

If the user is a member of a large number of groups on the AD server, kerberos authentication fails for both smbclient -k and KDE's smb:// KIO slave.

If the user is a member of a "normal" number of groups, then both smbclient -k and smb:// work perfectly.
What does a normal kinit return on your box ? Can this get a tgt from the AD server ?
(In reply to comment #14)
> What does a normal kinit return on your box ? Can this get a tgt from the AD
> server ?

Yes, kinit succeeds without a problem (see original bug description).

Having run kinit, "smbclient -k" *used to work* for me until my AD account gathered more group memberships (and I have now ascertained that this is the only thing that changed).

The closest I've got to probing this myself is in my comment #9.
The cause of this has been identified; the issue will be addressed in #4400.

*** This bug has been marked as a duplicate of 4400 ***