There's a 4 KB limit on security blobs in SessionSetupAndX commands. If you are a member of a large number of groups (e.g. 500), that limit is quickly exceeded. Windows fragments the security blob over several SessionSetupAndX commands. Samba just tries to send it all as one, which causes the Windows server to close the TCP connection. This is only really a problem when Samba is acting as the client in such a transaction, with an MS principal in a large number of groups (e.g. domain join). Two sniffs are attached: one is the MS Windows behavior (winspnego) and the other is Samba (manygroups). I'm working on a fix for this; it should be easy. Once tested, I'll submit it to the group.
Created attachment 2293 [details] Samba Sniff - see frame 740 for single AndX command.
Created attachment 2294 [details] Duplicate Windows sniff - see SMB frames 142 - 153
Thanks a lot for the sniffs. I would be very surprised if Samba3 as a server would survive this correctly, but I might be wrong here. Investigating.... :-) Volker
We also need to be able to cope with this from a server-side of things. Not sure we do that. Jeremy.
*** Bug 4294 has been marked as a duplicate of this bug. ***
Created attachment 2295 [details] Client side patch handling fragmenting SessionSetupAndX
Thanks for the patch! Is there a reason why you use 4000 bytes, while in this capture https://bugzilla.samba.org/attachment.cgi?id=2294 Windows uses 4044 bytes? Is this limit specific to SMB, or is it specific to the SPNEGO or KRB5 SSPI packages, so that RPC and LDAP authentication also need this fix?
4000 is fine - Windows will work with anything from 8 bytes to 4096 bytes. This is SMB specific behavior. Note it is only possible to get this behavior when using SPNEGO (e.g. if you choose Kerberos or NTLM directly w/o SPNEGO, the fragmentation behavior won't be present). With NTLM, that's no big deal - the auth token is not related to group membership. However, if there was a way to use Kerberos directly, you'd find that the 4k fragmentation code wouldn't work (in fact, I'm pretty sure that pure Kerberos GSSAPI calls into SMB aren't supported on Windows).
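For illustration only, the client-side splitting described above can be sketched as follows. This is not the attached patch: `fragment_blob` and `max_fragment` are hypothetical names, and the sketch only shows the chunking step, not how each piece is wrapped in a SessionSetupAndX request.

```python
from typing import List


def fragment_blob(blob: bytes, max_fragment: int = 4000) -> List[bytes]:
    """Split a security blob into pieces of at most max_fragment bytes.

    4000 is the chunk size mentioned in this thread; the function name and
    signature are illustrative assumptions, not Samba's API.
    """
    if max_fragment <= 0:
        raise ValueError("max_fragment must be positive")
    # Each piece would travel in its own SessionSetupAndX request.
    return [blob[i:i + max_fragment] for i in range(0, len(blob), max_fragment)]


# A hypothetical 10000-byte blob splits into three pieces.
pieces = fragment_blob(bytes(10000))
print([len(p) for p in pieces])  # [4000, 4000, 2000]
```

Joining the pieces in order gives back the original blob, which is what the receiving side has to reconstruct.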
Metze - FYI I'm creating a server-side patch for this for SAMBA_3_0....
Created attachment 2297 [details] Server-side patch for SAMBA_3_0 Can someone who can quickly generate such a large PAC test this server-side patch for the SAMBA_3_0 or SAMBA_3_0_25 svn trees ? It should fix the concatenation problem for the server. In addition I think I've also fixed a memory leak in the error code paths for ASN1_DATA handling. Wonder why Coverity didn't find it... I don't have a quick setup for a massive PAC, but can try doing this tues (mon is a holiday here in the USA). Jeremy.
Created attachment 2298 [details] Replacement patch. Once more with just an addition of a little integer-wrap paranoia :-). Jeremy.
Created attachment 2299 [details] 2nd replacement Third time's the charm - one more piece of integer wrap paranoia. Jeremy.
Created attachment 2300 [details] Final version :-) Final server-side patch, fixing all error paths that could leak memory (not introduced by me :-), making the function static etc. etc. This is the version I'm going to test and commit if it passes. Jeremy.
Created attachment 2301 [details] Never say never... In my memleak fix for ASN1_DATA structs I neglected to set data->nested = NULL, meaning the codepaths calling asn1_free() twice on the same struct (and there are some...) would double free. Jeremy.
Created attachment 2302 [details] Next version. Get the logic right (count the number of bytes we still need correctly). Patch should work in a fragmented situation now. I still need to add DOS protection and restrict the max. number of outstanding bytes (probably 1mb). Jeremy.
I'd recommend a maximum of 65k - that's the LARGEST kerberos SPNEGO blob you'll ever see from a MS client. Let me know when you're happy with the patch, and I'll apply it in-house @ isilon, since QA is really hammering on this code in the next couple of weeks with large groups.
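As a rough sketch of the server-side bookkeeping discussed here (count the bytes still needed, and refuse oversized blobs), assuming the total length is already known when the first fragment arrives. `PendingBlob` and `MAX_PENDING_BLOB` are hypothetical names, and reading "65k" as 65535 is an assumption; this is not the code in the attached patches.

```python
MAX_PENDING_BLOB = 65535  # "65k" cap suggested in the thread; exact value assumed


class PendingBlob:
    """Accumulates fragments of one security blob until it is complete."""

    def __init__(self, total_len: int):
        # Reject absurd announced lengths up front to limit memory use.
        if total_len <= 0 or total_len > MAX_PENDING_BLOB:
            raise ValueError("announced blob length out of range")
        self.needed = total_len  # bytes still outstanding
        self.data = bytearray()

    def add(self, fragment: bytes) -> bool:
        """Append a fragment; return True once the whole blob has arrived."""
        if len(fragment) > self.needed:
            raise ValueError("fragment overruns announced length")
        self.data.extend(fragment)
        self.needed -= len(fragment)
        return self.needed == 0


pending = PendingBlob(5000)
print(pending.add(b"a" * 4000))  # False
print(pending.add(b"b" * 1000))  # True
```

Checking each fragment against the remaining count, rather than against the cap alone, keeps a misbehaving client from growing the buffer past what it announced.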
It should be functionally ready to test right now (I'm trying to set up a test env. for it). Let me know if it works - once that's confirmed I'll add in the DOS limits. Jeremy.
How do you create a test for this ? I've created a test user in 1000 groups, and when I try and log on as that user from an XP box so I can watch the sessionsetupX packets I get the error message : "Unable to log you on due to the following error : During a logon attempt, the user's security context accumulated too many security ID's". WTF ? Do I have to *guess* what the group limit is in Windows ? (every day, and in every way, I hate Windows more and more and more and.... :-).
Ok, I managed to reproduce with 500 groups.... continuing to test :-).
Ok, I'm now officially confused. I've reproduced a WinXP client in 500 groups getting the krb5 ticket for a Win2k3 server, and then doing the 4k splintered sessionsetupX security blob against the Win2k3 server. Now when I try to mount a share *as the same user* to a Samba 3.0.25 server without my patch (as I wanted to see it fail first) it happily sends the krb5 ticket in a security blob of size 5246 bytes. I have the sniff to prove it.... Everything "just works" (tm) without need for this patch at all. Is the 4k restriction only for Windows servers ? Or only for Vista clients ? How does the client decide whether to splinter a blob ? Jeremy.
Created attachment 2303 [details] XP to Samba big blob - see frame 236 for the krb5 spnego blob size 5246
I bet it's the "max buffer size" response in the negprot. We (Samba) set this to 16644, Win2K3R2 sets it to 4356. I'll try again setting this to the same value as w2k3R2.
Yep - confirmed it - that causes fragmentation. So it's server configurable.
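The finding can be illustrated with a small sketch that compares the blob size against the "max buffer size" from the negprot response. The `overhead` parameter is a hypothetical stand-in for whatever else has to fit in the request; its default of 0 is a simplifying assumption, not a measured value.

```python
def max_blob_per_request(max_buffer_size: int, overhead: int = 0) -> int:
    """Largest blob that fits in one request for a given server max buffer size.

    overhead is a hypothetical allowance for the rest of the request;
    the thread does not state its real value.
    """
    return max(max_buffer_size - overhead, 0)


def needs_fragmentation(blob_len: int, max_buffer_size: int, overhead: int = 0) -> bool:
    """True if the blob cannot be sent in a single SessionSetupAndX request."""
    return blob_len > max_blob_per_request(max_buffer_size, overhead)


# Values reported in this thread: a 5246-byte blob against both servers.
print(needs_fragmentation(5246, 16644))  # False (Samba's max buffer size)
print(needs_fragmentation(5246, 4356))   # True  (Win2K3R2's max buffer size)
```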
Ah - my patch fails as I'm associating the pending blob with the vuid, which is still zero in subsequent sessionsetupX calls. I need to associate it with the incoming processid, which is constant across subsequent sessionsetupX calls.
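A minimal sketch of that idea, with the pending data keyed by the client's process id instead of the vuid. `PendingStore` and its method names are hypothetical, and a real server would presumably also scope the table to a single connection.

```python
from typing import Dict


class PendingStore:
    """Holds partially received blobs, keyed by the SMB process id."""

    def __init__(self) -> None:
        self._by_pid: Dict[int, bytearray] = {}

    def append(self, smb_pid: int, fragment: bytes) -> int:
        """Add a fragment for this pid and return the bytes buffered so far."""
        # The vuid is still zero at this stage, so it cannot tell
        # two in-progress session setups apart; the pid can.
        buf = self._by_pid.setdefault(smb_pid, bytearray())
        buf.extend(fragment)
        return len(buf)

    def take(self, smb_pid: int) -> bytes:
        """Remove and return the buffered blob for this pid."""
        return bytes(self._by_pid.pop(smb_pid))


store = PendingStore()
store.append(100, b"first-")
store.append(200, b"other")
store.append(100, b"second")
print(store.take(100))  # b'first-second'
```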
Created attachment 2304 [details] Working server-side patch ! W00t! This one works with fragmented SPNEGO ! Now to add in limits and client-side version. Jeremy.
Jeremy, just for the record, as I was testing with the patches a little and (what a surprise) came to the same conclusions as you:
* XP login fails with NT_STATUS_TOO_MANY_CONTEXT_IDS, and with default settings SPNEGO krb5 session setup always works server-side (without the server-side patch)
* Todd's client-side patch does not work for me
I was working against 23c code, so I may have messed up the client patch when building it against 24/5 (e.g. works for me ;). When you say "doesn't work" - is that compile time or run time? Too many context ids means you're building a token w/ > 1000 groups - don't forget that the domain groups + machine local groups + computed groups (e.g. authenticated users) must be less than 1000. So, if you are a member of 1000 domain groups already, then all bets are off...
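The arithmetic behind that limit, as a sketch. The figure of 1000 is the one quoted in this comment rather than a value taken from Windows documentation, and `token_fits` is a hypothetical helper.

```python
SID_LIMIT = 1000  # figure quoted in this thread; treat as approximate


def token_fits(domain_groups: int, local_groups: int, computed_groups: int,
               limit: int = SID_LIMIT) -> bool:
    """True if the combined group count stays under the limit."""
    return domain_groups + local_groups + computed_groups < limit


print(token_fits(500, 5, 3))    # True
print(token_fits(1000, 5, 3))   # False: domain groups alone already hit the limit
```

This is why a test user in exactly 1000 domain groups cannot log on, while one in 500 groups can.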
Todd, the client issue is that we're ignoring the server max xmit value in the negprot reply, so setting a fixed size for the maximum security blob isn't correct. If we did that we'd break against earlier Samba servers that don't have the server patch. I've committed the server-side patch into SAMBA_3_0 and SAMBA_3_0_25 so you should be able to test server code with that (it works for me at home). I'll start to merge your patch (with some changes) into the client code later this week (although I'm off to FOSDEM on Thursday). We'll try and get this in for 3.0.25. Thanks for the info on the SID limit - wish this stuff was written down somewhere (other than our code :-). Jeremy.
Created attachment 2312 [details] Client side patch Todd, can you check this patch ? I think it's a little simpler as it doesn't require changes to the struct cli_state. Once you've confirmed it works I'll check it in. Thanks, Jeremy.
Ok, I've tested the client side patch against W2K3R2 and it seems to work fine so I've checked it into the svn. Todd, please confirm and close the bug if this now fixes this for you. Thanks, Jeremy.
No response, closing bug.