Bug 8736 - Failing user permissions in session (unless first time in a minute)
Status: RESOLVED FIXED
Alias: None
Product: Samba 3.6
Classification: Unclassified
Component: File services
Version: 3.6.1
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Volker Lendecke
QA Contact: Samba QA Contact
Reported: 2012-02-01 16:43 UTC by Wilco Baan Hofman
Modified: 2012-03-01 12:26 UTC

Attachments
log file for good access to directory (335.93 KB, application/octet-stream)
2012-02-06 11:34 UTC, Wilco Baan Hofman
log file for bad access to directory (103.53 KB, application/octet-stream)
2012-02-06 11:35 UTC, Wilco Baan Hofman

Description Wilco Baan Hofman 2012-02-01 16:43:08 UTC
I have a share with hide unreadable = yes.

On this share I have project directories, which are visible or hidden according to the user's group memberships.


~# ./failsmb 
Domain=[ANDOLAN] OS=[Unix] Server=[Samba 3.6.1]
  .                                   D        0  Tue Dec 13 11:58:00 2011
  ..                                  D        0  Mon Jan 16 11:54:54 2012
  Design Studies                      D        0  Mon Oct 17 22:48:32 2011
    >-snip-<
  scans                               D        0  Tue Jan 31 14:33:08 2012

		51637 blocks of size 33553920. 23513 blocks available
~# ./failsmb 
Domain=[ANDOLAN] OS=[Unix] Server=[Samba 3.6.1]
NT_STATUS_ACCESS_DENIED listing \SB IT\*

~# cat failsmb
(printf 'cd \"SB IT\"\n';printf 'ls\n')|smbclient //server/overzicht -U 'ecopy%password'

Seems like the group memberships are not cached properly, because I can still see the initial directory list in the root directory.


I'll do some more debugging tonight.
Comment 1 Wilco Baan Hofman 2012-02-01 21:32:30 UTC
Okay, apparently the critical time is exactly 120 seconds. So I need to look for something that times out after exactly 120 seconds.
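
One way to pin down such a window is to rerun the same check at a fixed interval and log each result with a timestamp. This is a minimal sketch, not something that was run for this bug: the probe function and its arguments are made up for illustration, and it assumes the command under test behaves like ./failsmb from the description, i.e. prints an NT_STATUS_ error on failure.

```shell
#!/bin/sh
# probe CMD INTERVAL COUNT
# Run CMD COUNT times, sleeping INTERVAL seconds between runs, and print
# one line per run: seconds elapsed since the first run, then OK or FAIL.
# A run counts as FAIL when CMD exits non-zero or its output contains an
# NT_STATUS_ error (the exit status alone may not reflect a failed "ls").
probe() {
    cmd=$1
    interval=$2
    count=$3
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$count" ]; do
        now=$(date +%s)
        # $cmd is left unquoted on purpose so "cmd arg" splits into words.
        if out=$($cmd 2>&1) && ! printf '%s\n' "$out" | grep -q 'NT_STATUS_'; then
            result=OK
        else
            result=FAIL
        fi
        printf '%s %s\n' "$((now - start))" "$result"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Something like "probe ./failsmb 30 10" would then show at which offsets the listing succeeds and at which it is denied; the interval and count are arbitrary.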
Comment 2 Wilco Baan Hofman 2012-02-02 00:42:39 UTC
Damn, the problem is not as easily reproducible anymore after a Samba restart. At least I now have debugging symbols on this server.

It seems that the UNIX user token gets corrupted.

When good:
[2012/02/02 01:12:39.038017, 10] auth/token_util.c:527(debug_unix_user_token)
  UNIX token of user 10202
  Primary group is 10023 and contains 11 supplementary groups
  Group[  0]: 4294967295
  Group[  1]: 10003
  Group[  2]: 10064
  Group[  3]: 10065
  Group[  4]: 10067
  Group[  5]: 10068
  Group[  6]: 10070
  Group[  7]: 10071
  Group[  8]: 10072
  Group[  9]: 10073
  Group[ 10]: 10111

When bad:
[2012/02/02 01:07:54.515313, 10] auth/token_util.c:527(debug_unix_user_token)
  UNIX token of user 10202
  Primary group is 10023 and contains 2 supplementary groups
  Group[  0]: 4294967295
  Group[  1]: 10111

I'm just not sure what triggers it. I only know that it resets every 120 seconds and that it seems to jump between various users (although it stays consistent as long as Samba is not restarted).
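
To see exactly which groups go missing, the two token dumps can be compared mechanically. A rough sketch, with the dumps above pasted in verbatim; token_gids and missing_gids are hypothetical helper names, not part of Samba.

```shell
#!/bin/sh
# token_gids: read debug_unix_user_token output on stdin and print the
# group IDs from the "Group[ N]: gid" lines, one per line, in log order.
token_gids() {
    awk '/Group\[/ { print $NF }'
}

# missing_gids GOOD BAD: print every gid that appears in the GOOD dump
# but not in the BAD dump.
missing_gids() {
    bad_list=$(printf '%s\n' "$2" | token_gids)
    printf '%s\n' "$1" | token_gids | while read -r gid; do
        # -x matches whole lines only, so 1011 would not match 10111.
        printf '%s\n' "$bad_list" | grep -qx "$gid" || printf '%s\n' "$gid"
    done
}

# The two dumps from the log excerpts above.
good='  Group[  0]: 4294967295
  Group[  1]: 10003
  Group[  2]: 10064
  Group[  3]: 10065
  Group[  4]: 10067
  Group[  5]: 10068
  Group[  6]: 10070
  Group[  7]: 10071
  Group[  8]: 10072
  Group[  9]: 10073
  Group[ 10]: 10111'

bad='  Group[  0]: 4294967295
  Group[  1]: 10111'

missing_gids "$good" "$bad"
```

For these two dumps it lists the nine GIDs from 10003 through 10073; only 4294967295 and 10111 survive in the bad token.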
Comment 3 Wilco Baan Hofman 2012-02-02 00:58:29 UTC
I've also noticed that if Samba 3.6.1 runs for a while, it starts rejecting authentications from workstation accounts. A restart fixes that as well.

It looks like these problems may well be related. The server has debugging symbols now, and as soon as the problems come back and I can reproduce them again, I will attach gdb.
Comment 4 Wilco Baan Hofman 2012-02-06 11:34:44 UTC
Created attachment 7295
log file for good access to directory
Comment 5 Wilco Baan Hofman 2012-02-06 11:35:03 UTC
Created attachment 7296
log file for bad access to directory
Comment 6 Wilco Baan Hofman 2012-02-06 11:36:28 UTC
I still can't reproduce this for now; these accounts seem to be working at the moment. Can anyone look at the log files to see what I've missed?

I already tried disabling the stat cache, but that did not help.
Comment 7 Wilco Baan Hofman 2012-02-06 13:21:10 UTC
  Group[  0]: 4294967295
This should be group 10000, "Domain users" -> rid 513.

Wrong in both cases apparently.
Comment 8 Wilco Baan Hofman 2012-02-06 16:08:04 UTC
Looks like the 2^32-1 group issue is not Samba-related, but caused by corrupt indexes in OpenLDAP. That doesn't fully explain the 120 seconds though. I'll keep this bug updated.
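
For reference, 4294967295 is 2^32 - 1, which is what -1 looks like when printed as an unsigned 32-bit gid, so it plausibly stands for a group lookup that returned no valid GID. A small sketch for spotting it in a GID list; flag_invalid_gids is a made-up helper, and the 32-bit gid width is an assumption about the platform.

```shell
#!/bin/sh
# (gid_t)-1 printed as unsigned, assuming a 32-bit gid_t: 2^32 - 1.
INVALID_GID=$(( (1 << 32) - 1 ))

# flag_invalid_gids: read whitespace-separated GIDs on stdin and print
# each one followed by "invalid" if it equals (gid_t)-1, otherwise "ok".
flag_invalid_gids() {
    tr -s ' \t' '\n' | while read -r gid; do
        [ -n "$gid" ] || continue
        if [ "$gid" = "$INVALID_GID" ]; then
            printf '%s invalid\n' "$gid"
        else
            printf '%s ok\n' "$gid"
        fi
    done
}
```

Feeding it the output of "id -G ecopy" on the server would show whether the system's own group lookup already returns the bogus GID, independent of Samba.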
Comment 9 Wilco Baan Hofman 2012-03-01 12:26:35 UTC
Apparently, this was an LDAP index issue. It has not occurred again.