Created attachment 11907 [details] Original proposed patch Date: Wed, 9 Mar 2016 16:11:05 +1300 From: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> To: Andrew Bartlett <abartlet@samba.org>, Jeremy Allison <jra@samba.org> Cc: samba-technical@lists.samba.org Subject: Re: mkdir-dup test flapping User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 [-- Attachment #1 [details] --] [-- Type: text/plain, Encoding: 7bit, Size: 3.3K --] We looked at this some more, and Andrew seemed to understand and wrote the attached patch. > > We got the logs by forcing smbd to run with -d10 by patching > file_server/fileserver.c. The issue appears to be in this call: 3638 /* Ensure there was no race condition. */ 3639 if (!check_same_stat(&smb_dname->st, &fsp->fsp_name->st)) { 3640 samba_start_debugger(); 3641 DEBUG(5,("open_directory: stat struct differs for " 3642 "directory %s.\n", 3643 smb_fname_str_dbg(smb_dname))); 3644 fd_close(fsp); (gdb) p smb_dname->st $1 = {st_ex_dev = 64512, st_ex_ino = 44701075, st_ex_mode = 16877, st_ex_nlink = 2, st_ex_uid = 50133, st_ex_gid = 50133, st_ex_rdev = 0, st_ex_size = 0, st_ex_atime = {tv_sec = 1457488009, tv_nsec = 820161188}, st_ex_mtime = {tv_sec = 1457488009, tv_nsec = 820161188}, st_ex_ctime = { tv_sec = 1457488009, tv_nsec = 820161188}, st_ex_btime = { tv_sec = 1457488009, tv_nsec = 820161188}, st_ex_calculated_birthtime = true, st_ex_blksize = 4096, st_ex_blocks = 8, st_ex_flags = 0, st_ex_mask = 0} (gdb) p fsp->fsp_name->st $2 = {st_ex_dev = 64512, st_ex_ino = 44701075, st_ex_mode = 16877, st_ex_nlink = 2, st_ex_uid = 3000000, st_ex_gid = 65531, st_ex_rdev = 0, st_ex_size = 0, st_ex_atime = {tv_sec = 1457488009, tv_nsec = 820161188}, st_ex_mtime = {tv_sec = 1457488009, tv_nsec = 820161188}, st_ex_ctime = { tv_sec = 1457488009, tv_nsec = 820161188}, st_ex_btime = { tv_sec = 1457488009, tv_nsec = 820161188}, st_ex_calculated_birthtime = true, st_ex_blksize = 4096, st_ex_blocks = 8, st_ex_flags = 0, st_ex_mask = 0} (gdb) where you can see the st_ex_uid and st_ex_gid fields differ: douglasb@douglasb-desktop:~/src/samba (flap *)$ bin/wbinfo --uid-info=50133 ADDOMAIN/administrator:*:50133:65531::/home/ADDOMAIN/administrator:/bin/false douglasb@douglasb-desktop:~/src/samba (flap *)$ bin/wbinfo --uid-info=3000000 BUILTIN/administrators:*:3000000:3000000::/home/BUILTIN/administrators:/bin/false The issue is that after the mkdir, we run: } else if (lp_inherit_acls(SNUM(conn))) { /* Inherit from parent. Errors here are not fatal. */ status = inherit_new_acl(fsp); if (!NT_STATUS_IS_OK(status)) { while in the other process that is doing the open of the existing directory, we do: if (SMB_VFS_LSTAT(conn, smb_dname) == -1) { DEBUG(2, ("Could not stat " "directory '%s' just " "opened: %s\n", smb_fname_str_dbg( smb_dname), strerror(errno))); return map_nt_error_from_unix( errno); } And then only a few lines later we do: !check_same_stat(&smb_dname->st, &fsp->fsp_name->st)) { The inherit_new_acl() can change the owner of the directory in the time it takes between the first stat() and the second one. This is seen only in the AD DC because of special logic that makes BUILTIN\Administrators the owner of files created by any administrative user. We need to decide if the change in owner is enough reason to refuse the open(), or if we should just continue. On the basis that the owner change is not significant, we reworked the code to do the lstat() after the fstat(), and not compare the owner. Another way to avoid the flapping would be to run this test as someone other than Administrator. Douglas and Andrew
Created attachment 11908 [details] git-am fix for master.
Created attachment 11910 [details] git-am fix for 4.4.0, 4.3.next, 4.2.next. Cherry-pick from master for 4.4.0, 4.3.next, 4.2.next.
Comment on attachment 11910 [details] git-am fix for 4.4.0, 4.3.next, 4.2.next. LGTM
Reassigning to Karolin for inclusion in 4.2.next, 4.3.next, 4.4.0.
(In reply to Uri Simchoni from comment #4) Pushed to v4-4-test (incl. in rc5) and autobuild-v4-[2|3]-test.
(In reply to Karolin Seeger from comment #5) Landed in v4-2-test, but not in v4-3-test yet, re-trying.
Pushed to both branches. Closing out bug report. Thanks!