Bug 3912 - Kernel Oplocks problem
Status: RESOLVED FIXED
Product: Samba 3.0
Classification: Unclassified
Component: File Services
Version: 3.0.23a
Hardware: Other Linux
Importance: P3 normal
Target Milestone: none
Assigned To: Jeremy Allison
QA Contact: Samba QA Contact
Duplicates: 3970

Reported: 2006-07-06 06:44 UTC by Daniel Beschorner
Modified: 2006-08-09 03:16 UTC

Attachments
Level 10 log (355.53 KB, application/octet-stream), 2006-07-06 07:02 UTC, Daniel Beschorner
tcpdump (130.61 KB, application/octet-stream), 2006-07-06 07:02 UTC, Daniel Beschorner
Level 10 log with kernel oplocks off (182.60 KB, application/x-gzip), 2006-07-14 11:32 UTC, Daniel Beschorner
tcpdump no kernel oplocks (484.93 KB, application/x-gzip), 2006-07-24 11:19 UTC, Daniel Beschorner
level 10 log w/o oplocks (220.40 KB, application/x-gzip), 2006-07-24 11:20 UTC, Daniel Beschorner
tcpdump kernel oplocks (277.08 KB, application/x-gzip), 2006-07-24 11:20 UTC, Daniel Beschorner
level 10 log w oplocks (92.24 KB, application/x-gzip), 2006-07-24 11:21 UTC, Daniel Beschorner
strace log (227.61 KB, text/plain), 2006-07-25 06:50 UTC, Daniel Beschorner

Description Daniel Beschorner 2006-07-06 06:44:36 UTC
We have a problem with an application that clears the archive bit before
writing and sets it again after writing.
The latter fails if the writer is not the file owner.
dos filemode is enabled, and setting the bit manually works fine.
Samba is 3.0.23RC3.
Client is XP/SP2.
The client seems to lose the connection before setting the bit, and a new
smbd is spawned.
Comment 1 Daniel Beschorner 2006-07-06 07:02:34 UTC
Created attachment 2017
Level 10 log
Comment 2 Daniel Beschorner 2006-07-06 07:02:52 UTC
Created attachment 2018
tcpdump
Comment 3 Daniel Beschorner 2006-07-14 09:53:26 UTC
With kernel oplocks = no the bug doesn't show up at all!
Could something be seriously broken in Samba or Linux?
Comment 4 Daniel Beschorner 2006-07-14 11:32:46 UTC
Created attachment 2037
Level 10 log with kernel oplocks off

Now it works.
Comment 5 Gerald (Jerry) Carter 2006-07-20 13:26:18 UTC
Jeremy, please close if this is fixed in 3.0.23a
Comment 6 Jeremy Allison 2006-07-21 11:18:41 UTC
I'm closing this as I fixed a race condition bug in 3.0.23a. Please retest with that release and reopen if you can still reproduce it.
Jeremy.
Comment 7 Daniel Beschorner 2006-07-24 03:33:19 UTC
Unfortunately not fixed with 3.0.23a.
Comment 8 Daniel Beschorner 2006-07-24 06:41:05 UTC
BTW, as soon as I activated kernel oplocks for testing with 3.0.23a, I again saw "lost delayed write data" when saving.
The kernel is Linux 2.6.17 on x86.
Comment 9 Volker Lendecke 2006-07-24 08:07:49 UTC
I looked at your log files/sniff a couple of days ago, but I could not detect any failure. In particular, in the sniff there is only one SET_FILE_INFO call that appears to reset the archive bit (frame 320), none that sets it afterwards. Can you point me at the specific line/frame number that fails?

Alternatively, please upload a full smbd debug level 10 log and a full sniff covering everything from the start of the smbd connection to the failure, in both cases with and without kernel oplocks.

Volker
Comment 10 Daniel Beschorner 2006-07-24 11:17:52 UTC
Ok, here are 2 complete tcpdumps, with and without kernel oplocks, from loading to saving (it's an InstallShield project).
There are also 2 corresponding level 10 logs. The no-kernel-oplock log is missing its first part due to log rotation; hopefully that's not important.

With kernel oplocks, the client complains "no connection to share" on save; the last line in the log shows a new connection.

User is cad, client is DELLDEV4, file is Facton5.x.ism.

Daniel
Comment 11 Daniel Beschorner 2006-07-24 11:19:25 UTC
Created attachment 2056
tcpdump no kernel oplocks
Comment 12 Daniel Beschorner 2006-07-24 11:20:07 UTC
Created attachment 2057
level 10 log w/o oplocks
Comment 13 Daniel Beschorner 2006-07-24 11:20:47 UTC
Created attachment 2058
tcpdump kernel oplocks
Comment 14 Daniel Beschorner 2006-07-24 11:21:17 UTC
Created attachment 2059
level 10 log w oplocks
Comment 15 Volker Lendecke 2006-07-25 02:36:03 UTC
Ok, thanks. These logs are more helpful, although now I see something very strange: in the "oplock" case, at the very end of the sniff after the write call in packet 4206, you can see the server end the connection and the client restart it. In the corresponding logfile you can see the write attempt but *nothing* after that. If we had received a signal that gives us a chance to panic in an orderly way, I would expect a panic message in the log, but there is really only the reconnect by the client.

This makes me assume that we get a KILL signal from the kernel for some reason.

To verify this, the next step is an strace of smbd. Can you start your smbd with

strace -f -ttT -o /tmp/strace.smbd /usr/sbin/smbd -D

and re-run the test with kernel oplocks on? Can you also upload the logfile and sniff for that?

BTW, what exact system and kernel version are you using?

Thanks,

Volker
Comment 16 Daniel Beschorner 2006-07-25 06:50:48 UTC
Created attachment 2060
strace log
Comment 17 Daniel Beschorner 2006-07-25 06:52:01 UTC
I quickly took an strace log of only the affected smbd.
Kernel is 2.6.17, SuSE 9.1
Comment 18 Volker Lendecke 2006-07-25 07:08:50 UTC
Quick shot: Is it possible that you have async I/O enabled? Can you try disabling it?

Volker
Comment 19 Daniel Beschorner 2006-07-25 07:18:43 UTC
I don't think so.

Build environment:
   Built by:    root@server
   Built on:    Sun Jul 23 22:50:05 CEST 2006
   Built using: gcc
   Build host:  Linux server 2.6.17 #27 Mon Jun 19 17:49:15 CEST 2006 i686 i686 i386 GNU/Linux
   SRCDIR:      /server/build/samba-3.0.23a/source
   BUILDDIR:    /server/build/samba-3.0.23a/source

Paths:
   SBINDIR: /server/samba/sbin
   BINDIR: /server/samba/bin
   SWATDIR: /server/samba/swat
   CONFIGFILE: /server/samba/lib/smb.conf
   LOGFILEBASE: /server/samba/var
   LMHOSTSFILE: /server/samba/lib/lmhosts
   LIBDIR: /server/samba/lib
   SHLIBEXT: so
   LOCKDIR: /server/samba/var/locks
   PIDDIR: /server/samba/var/locks
   SMB_PASSWD_FILE: /server/samba/private/smbpasswd
   PRIVATE_DIR: /server/samba/private

 System Headers:
   HAVE_SYS_ACL_H
   HAVE_SYS_CDEFS_H
   HAVE_SYS_FCNTL_H
   HAVE_SYS_IOCTL_H
   HAVE_SYS_IPC_H
   HAVE_SYS_MMAN_H
   HAVE_SYS_MOUNT_H
   HAVE_SYS_PARAM_H
   HAVE_SYS_PRCTL_H
   HAVE_SYS_QUOTA_H
   HAVE_SYS_RESOURCE_H
   HAVE_SYS_SELECT_H
   HAVE_SYS_SHM_H
   HAVE_SYS_SOCKET_H
   HAVE_SYS_STATFS_H
   HAVE_SYS_STATVFS_H
   HAVE_SYS_STAT_H
   HAVE_SYS_SYSCALL_H
   HAVE_SYS_SYSLOG_H
   HAVE_SYS_SYSMACROS_H
   HAVE_SYS_TIME_H
   HAVE_SYS_TYPES_H
   HAVE_SYS_UIO_H
   HAVE_SYS_UNISTD_H
   HAVE_SYS_UN_H
   HAVE_SYS_VFS_H
   HAVE_SYS_WAIT_H
   HAVE_SYS_XATTR_H

 Headers:
   HAVE_AIO_H
   HAVE_ALLOCA_H
   HAVE_ARPA_INET_H
   HAVE_ASM_TYPES_H
   HAVE_ATTR_XATTR_H
   HAVE_CTYPE_H
   HAVE_DIRENT_H
   HAVE_DLFCN_H
   HAVE_EXECINFO_H
   HAVE_FCNTL_H
   HAVE_FLOAT_H
   HAVE_GLOB_H
   HAVE_GRP_H
   HAVE_INTTYPES_H
   HAVE_LANGINFO_H
   HAVE_LASTLOG_H
   HAVE_LBER_H
   HAVE_LDAP_H
   HAVE_LIMITS_H
   HAVE_LOCALE_H
   HAVE_MEMORY_H
   HAVE_MNTENT_H
   HAVE_NETINET_IN_SYSTM_H
   HAVE_NETINET_IP_H
   HAVE_NETINET_TCP_H
   HAVE_NET_IF_H
   HAVE_NSS_H
   HAVE_POLL_H
   HAVE_RPCSVC_NIS_H
   HAVE_RPCSVC_YPCLNT_H
   HAVE_RPCSVC_YP_PROT_H
   HAVE_RPC_RPC_H
   HAVE_SHADOW_H
   HAVE_STDARG_H
   HAVE_STDINT_H
   HAVE_STDLIB_H
   HAVE_STRINGS_H
   HAVE_STRING_H
   HAVE_STROPTS_H
   HAVE_SYSCALL_H
   HAVE_SYSLOG_H
   HAVE_TERMIOS_H
   HAVE_TERMIO_H
   HAVE_UNISTD_H
   HAVE_UTIME_H

 UTMP Options:
   HAVE_GETUTMPX
   HAVE_UTMPX_H
   HAVE_UTMP_H
   HAVE_UT_UT_ADDR
   HAVE_UT_UT_EXIT
   HAVE_UT_UT_HOST
   HAVE_UT_UT_ID
   HAVE_UT_UT_NAME
   HAVE_UT_UT_PID
   HAVE_UT_UT_TIME
   HAVE_UT_UT_TV
   HAVE_UT_UT_TYPE
   HAVE_UT_UT_USER
   PUTUTLINE_RETURNS_UTMP
   WITH_UTMP

 HAVE_* Defines:
   HAVE_ASPRINTF
   HAVE_ASPRINTF_DECL
   HAVE_ATEXIT
   HAVE_BACKTRACE_SYMBOLS
   HAVE_BER_SCANF
   HAVE_C99_VSNPRINTF
   HAVE_CHMOD
   HAVE_CHOWN
   HAVE_CHROOT
   HAVE_COMPILER_WILL_OPTIMIZE_OUT_FNS
   HAVE_CONNECT
   HAVE_CREAT64
   HAVE_CRYPT
   HAVE_DEVICE_MAJOR_FN
   HAVE_DEVICE_MINOR_FN
   HAVE_DIRENT_D_OFF
   HAVE_DLCLOSE
   HAVE_DLERROR
   HAVE_DLOPEN
   HAVE_DLSYM
   HAVE_DUP2
   HAVE_ENDMNTENT
   HAVE_ENDNETGRENT
   HAVE_ERRNO_DECL
   HAVE_EXECL
   HAVE_EXPLICIT_LARGEFILE_SUPPORT
   HAVE_FCHMOD
   HAVE_FCHOWN
   HAVE_FCNTL_LOCK
   HAVE_FCVT
   HAVE_FGETXATTR
   HAVE_FLISTXATTR
   HAVE_FOPEN64
   HAVE_FREMOVEXATTR
   HAVE_FSEEKO64
   HAVE_FSETXATTR
   HAVE_FSTAT
   HAVE_FSTAT64
   HAVE_FSYNC
   HAVE_FTELLO64
   HAVE_FTRUNCATE
   HAVE_FTRUNCATE64
   HAVE_FTRUNCATE_EXTEND
   HAVE_FUNCTION_MACRO
   HAVE_GETCWD
   HAVE_GETDIRENTRIES
   HAVE_GETGRENT
   HAVE_GETGRNAM
   HAVE_GETMNTENT
   HAVE_GETNETGRENT
   HAVE_GETRLIMIT
   HAVE_GETSPNAM
   HAVE_GETTIMEOFDAY_TZ
   HAVE_GETXATTR
   HAVE_GLOB
   HAVE_GRANTPT
   HAVE_ICONV
   HAVE_IFACE_IFCONF
   HAVE_IMMEDIATE_STRUCTURES
   HAVE_INITGROUPS
   HAVE_INNETGR
   HAVE_KERNEL_CHANGE_NOTIFY
   HAVE_KERNEL_OPLOCKS_LINUX
   HAVE_KERNEL_SHARE_MODES
   HAVE_LDAP
   HAVE_LDAP_ADD_RESULT_ENTRY
   HAVE_LDAP_DN2AD_CANONICAL
   HAVE_LDAP_INIT
   HAVE_LDAP_INITIALIZE
   HAVE_LDAP_SET_REBIND_PROC
   HAVE_LGETXATTR
   HAVE_LIBLBER
   HAVE_LIBLDAP
   HAVE_LIBRESOLV
   HAVE_LINK
   HAVE_LINUX_XFS_QUOTAS
   HAVE_LISTXATTR
   HAVE_LLISTXATTR
   HAVE_LLSEEK
   HAVE_LONGLONG
   HAVE_LREMOVEXATTR
   HAVE_LSEEK64
   HAVE_LSETXATTR
   HAVE_LSTAT64
   HAVE_MAKEDEV
   HAVE_MEMMOVE
   HAVE_MEMSET
   HAVE_MKNOD
   HAVE_MKTIME
   HAVE_MLOCK
   HAVE_MLOCKALL
   HAVE_MMAP
   HAVE_MUNLOCK
   HAVE_MUNLOCKALL
   HAVE_NANOSLEEP
   HAVE_NATIVE_ICONV
   HAVE_NL_LANGINFO
   HAVE_NO_AIO
   HAVE_OPEN64
   HAVE_PATHCONF
   HAVE_PIPE
   HAVE_POLL
   HAVE_POSIX_ACLS
   HAVE_PRCTL
   HAVE_PREAD
   HAVE_PREAD64
   HAVE_PUTUTLINE
   HAVE_PUTUTXLINE
   HAVE_PWRITE
   HAVE_PWRITE64
   HAVE_QUOTACTL_LINUX
   HAVE_RAND
   HAVE_RANDOM
   HAVE_READDIR64
   HAVE_READLINK
   HAVE_REALPATH
   HAVE_REMOVEXATTR
   HAVE_RENAME
   HAVE_ROOT
   HAVE_SECURE_MKSTEMP
   HAVE_SELECT
   HAVE_SENDFILE64
   HAVE_SETBUFFER
   HAVE_SETENV
   HAVE_SETGROUPS
   HAVE_SETLINEBUF
   HAVE_SETLOCALE
   HAVE_SETMNTENT
   HAVE_SETNETGRENT
   HAVE_SETPGID
   HAVE_SETRESGID
   HAVE_SETRESGID_DECL
   HAVE_SETRESUID
   HAVE_SETRESUID_DECL
   HAVE_SETSID
   HAVE_SETXATTR
   HAVE_SHMGET
   HAVE_SIGACTION
   HAVE_SIGBLOCK
   HAVE_SIGPROCMASK
   HAVE_SIGSET
   HAVE_SIG_ATOMIC_T_TYPE
   HAVE_SNPRINTF
   HAVE_SNPRINTF_DECL
   HAVE_SOCKLEN_T_TYPE
   HAVE_SRAND
   HAVE_SRANDOM
   HAVE_STAT64
   HAVE_STAT_HIRES_TIMESTAMPS
   HAVE_STAT_ST_ATIM
   HAVE_STAT_ST_BLKSIZE
   HAVE_STAT_ST_BLOCKS
   HAVE_STAT_ST_CTIM
   HAVE_STAT_ST_MTIM
   HAVE_STRCASECMP
   HAVE_STRCHR
   HAVE_STRDUP
   HAVE_STRERROR
   HAVE_STRFTIME
   HAVE_STRNDUP
   HAVE_STRNLEN
   HAVE_STRPBRK
   HAVE_STRSIGNAL
   HAVE_STRTOUL
   HAVE_STRUCT_DIRENT64
   HAVE_STRUCT_FLOCK64
   HAVE_STRUCT_STAT_ST_RDEV
   HAVE_STRUCT_TIMESPEC
   HAVE_ST_RDEV
   HAVE_SYMLINK
   HAVE_SYSCALL
   HAVE_SYSCONF
   HAVE_SYSLOG
   HAVE_SYS_QUOTAS
   HAVE_TIMEGM
   HAVE_UNIXSOCKET
   HAVE_UPDWTMP
   HAVE_UPDWTMPX
   HAVE_USLEEP
   HAVE_UTIMBUF
   HAVE_UTIME
   HAVE_UTIMES
   HAVE_VASPRINTF
   HAVE_VASPRINTF_DECL
   HAVE_VA_COPY
   HAVE_VOLATILE
   HAVE_VSNPRINTF
   HAVE_VSNPRINTF_DECL
   HAVE_VSYSLOG
   HAVE_WAITPID
   HAVE_WORKING_AF_LOCAL
   HAVE_XFS_QUOTAS
   HAVE_YP_GET_DEFAULT_DOMAIN
   HAVE___CLOSE
   HAVE___DUP2
   HAVE___FCNTL
   HAVE___FORK
   HAVE___FSTAT
   HAVE___FXSTAT
   HAVE___LSEEK
   HAVE___LSTAT
   HAVE___LXSTAT
   HAVE___OPEN
   HAVE___OPEN64
   HAVE___PREAD64
   HAVE___PWRITE64
   HAVE___READ
   HAVE___STAT
   HAVE___WRITE
   HAVE___XSTAT

 --with Options:
   WITH_CIFSMOUNT
   WITH_QUOTAS
   WITH_SENDFILE
   WITH_UTMP

 Build Options:
   COMPILER_SUPPORTS_LL
   DEFAULT_DISPLAY_CHARSET
   DEFAULT_DOS_CHARSET
   DEFAULT_UNIX_CHARSET
   LDAP_SET_REBIND_PROC_ARGS
   LINUX
   LINUX_SENDFILE_API
   PACKAGE_BUGREPORT
   PACKAGE_NAME
   PACKAGE_STRING
   PACKAGE_TARNAME
   PACKAGE_VERSION
   REALPATH_TAKES_NULL
   REPLACE_GETPASS
   RETSIGTYPE
   SEEKDIR_RETURNS_VOID
   SIZEOF_DEV_T
   SIZEOF_INO_T
   SIZEOF_INT
   SIZEOF_LONG
   SIZEOF_LONG_LONG
   SIZEOF_OFF_T
   SIZEOF_SHORT
   STAT_STATVFS64
   STAT_ST_BLOCKSIZE
   STDC_HEADERS
   STRING_STATIC_MODULES
   SYSCONF_SC_NGROUPS_MAX
   SYSCONF_SC_NPROCESSORS_ONLN
   SYSCONF_SC_PAGESIZE
   TIME_WITH_SYS_TIME
   USE_SETRESUID
   WITH_CIFSMOUNT
   WITH_QUOTAS
   WITH_SENDFILE
   _FILE_OFFSET_BITS
   _GNU_SOURCE
   _LARGEFILE64_SOURCE
   _POSIX_C_SOURCE
   _POSIX_SOURCE
   auth_script_init
   charset_CP437_init
   charset_CP850_init
   inline
   offset_t
   static_decl_auth
   static_decl_charset
   static_decl_idmap
   static_decl_pdb
   static_decl_rpc
   static_decl_vfs
   static_init_auth
   static_init_charset
   static_init_idmap
   static_init_pdb
   static_init_rpc
   static_init_vfs
   vfs_audit_init
   vfs_cap_init
   vfs_default_quota_init
   vfs_expand_msdfs_init
   vfs_extd_audit_init
   vfs_fake_perms_init
   vfs_full_audit_init
   vfs_netatalk_init
   vfs_readonly_init
   vfs_recycle_init
   vfs_shadow_copy_init

Type sizes:
   sizeof(char):         1
   sizeof(int):          4
   sizeof(long):         4
   sizeof(long long):    8
   sizeof(uint8):        1
   sizeof(uint16):       2
   sizeof(uint32):       4
   sizeof(short):        2
   sizeof(void*):        4
   sizeof(size_t):       4
   sizeof(off_t):        8
   sizeof(ino_t):        8
   sizeof(dev_t):        8

Builtin modules:
    pdb_ldap pdb_smbpasswd pdb_tdbsam rpc_lsa rpc_reg rpc_lsa_ds rpc_wks rpc_svcctl rpc_ntsvcs rpc_net rpc_netdfs rpc_srv rpc_spoolss rpc_eventlog rpc_samr idmap_ldap idmap_tdb auth_sam auth_unix auth_winbind auth_server auth_domain auth_builtin
Comment 20 Daniel Beschorner 2006-07-25 13:58:15 UTC
I checked it against another server (kernel 2.6.17/x64, SuSE 9.3); the problem is reproducible there too.

Daniel
Comment 21 Jeremy Allison 2006-07-25 14:21:03 UTC
Can you check this on an older kernel? Is it possible this is an issue with 2.6.17? Running here on 2.6.11.4-21.12-smp on SuSE 9.3, I don't see this problem.
Jeremy.
Comment 22 Daniel Beschorner 2006-07-25 15:18:19 UTC
You've got it :-)
2.6.16 works like a charm.
Now I finally know why the save problems started at the end of June without any Samba change.

Kernel oplocks combined with 2.6.17 may be dangerous.

Thanks Volker & Jeremy!

Daniel
Comment 23 Jeremy Allison 2006-07-25 15:41:26 UTC
Oh that's bad :-(. Now we need to get a kernel bug fixed, and that's much harder than fixing Samba....

Jeremy.
Comment 24 Orion Poplawski 2006-07-27 16:36:31 UTC
Has the kernel bug (fcntl(F_SETSIG) no longer working) been reported?  I've got a simple test program that demonstrates the problem.
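
A probe along these lines might look like the following. This is a minimal sketch, not Orion's actual test program (which is not included in this report). It assumes the symptom is visible as F_GETSIG no longer returning the signal chosen with F_SETSIG once a lease has been taken with F_SETLEASE. The function name lease_keeps_signal and the path passed to it are made up for illustration, and the numeric fallbacks are the x86/ARM Linux values, included only in case the headers were already pulled in without _GNU_SOURCE.

```c
#define _GNU_SOURCE             /* needed for F_SETSIG, F_GETSIG, F_SETLEASE */
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Fallbacks (assumption: generic Linux values, as on x86 and ARM). */
#ifndef F_SETSIG
#define F_SETSIG 10
#define F_GETSIG 11
#endif
#ifndef F_SETLEASE
#define F_SETLEASE 1024
#endif

/*
 * Returns 1 if the signal set with F_SETSIG is still reported after a
 * write lease has been taken, 0 if it has been lost, and -1 if the
 * probe could not be set up (open failed, leases unsupported, ...).
 * (Hypothetical helper, for illustration only.)
 */
int lease_keeps_signal(const char *path, int signo)
{
    int result = -1;
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* Choose the notification signal first, then take the lease. */
    if (fcntl(fd, F_SETSIG, signo) == 0 &&
        fcntl(fd, F_SETLEASE, F_WRLCK) == 0) {
        int now = fcntl(fd, F_GETSIG);
        if (now >= 0)
            result = (now == signo);
        fcntl(fd, F_SETLEASE, F_UNLCK);     /* drop the lease again */
    }

    close(fd);
    return result;
}
```

A caller would pass a file it owns and a real-time signal, for example lease_keeps_signal("/tmp/lease_probe", SIGRTMIN + 1). A result of 0 would point at the regression discussed here; -1 only means the probe could not run on that filesystem.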
Comment 25 Orion Poplawski 2006-07-27 16:37:29 UTC
*** Bug 3970 has been marked as a duplicate of this bug. ***
Comment 26 Jeremy Allison 2006-07-27 17:00:45 UTC
No, it hasn't been reported. If you post your test case here I'll attach it to the SuSE bugzilla database. I don't know where to post this for the kernel.org kernels (the kernel mailing list?).
Jeremy.
Comment 27 Orion Poplawski 2006-07-27 17:04:21 UTC
I've reported it to the Fedora bugzilla.  I'll post to the linux kernel list as well.

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=200471
Comment 28 Daniel Beschorner 2006-08-09 03:16:25 UTC
I've checked it again with this patch from the kernel mailing list and the problems are gone.
Thanks to Orion for bringing this to the list!

Daniel
------

fcntl(F_SETSIG) no longer works on leases because
lease_release_private_callback() gets called as the lease is copied in
order to initialise it.
The problem is that lease_alloc() performs an unnecessary initialisation,
which sets the lease_manager_ops. Avoid the problem by allocating the
target lease structure using locks_alloc_lock().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/locks.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index b0b41a6..d7c5339 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1421,8 +1421,9 @@ static int __setlease(struct file *filp,
 	if (!leases_enable)
 		goto out;
 
-	error = lease_alloc(filp, arg, &fl);
-	if (error)
+	error = -ENOMEM;
+	fl = locks_alloc_lock();
+	if (fl == NULL)
 		goto out;
 
 	locks_copy_lock(fl, lease);
@@ -1430,6 +1431,7 @@ static int __setlease(struct file *filp,
 	locks_insert_lock(before, fl);
 
 	*flp = fl;
+	error = 0;
 out:
 	return error;
 }
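
The patch description can be pictured with a small userspace model. This is a toy sketch, not kernel code: every toy_* name is invented here, and it assumes (as the description implies) that the release callback clears the signal stored on the file and that copying a lock first releases the private state of the destination.

```c
#include <stddef.h>

/* Toy stand-in for the file; signum models the value set by F_SETSIG. */
struct toy_file { int signum; };

struct toy_lock;
struct toy_lmops { void (*release_private)(struct toy_lock *fl); };

struct toy_lock {
    struct toy_file *file;
    const struct toy_lmops *lmops;
};

/* Models lease_release_private_callback(): forget the chosen signal. */
static void toy_lease_release(struct toy_lock *fl)
{
    if (fl->file)
        fl->file->signum = 0;
}

static const struct toy_lmops toy_lease_ops = { toy_lease_release };

/* Models the copy step: release the destination's private state, then copy. */
static void toy_copy_lock(struct toy_lock *dst, const struct toy_lock *src)
{
    if (dst->lmops && dst->lmops->release_private)
        dst->lmops->release_private(dst);
    *dst = *src;
}

/*
 * Install a lease and return the signal number left on the file.
 * preinit_target != 0 models the old lease_alloc() path, where the target
 * already carries the lease ops; 0 models the patched path, where the
 * target is a blank lock.
 */
int toy_setlease(int preinit_target, int requested_signum)
{
    struct toy_file file = { requested_signum };
    struct toy_lock lease = { &file, &toy_lease_ops };
    struct toy_lock target = { NULL, NULL };

    if (preinit_target) {
        target.file = &file;
        target.lmops = &toy_lease_ops;
    }

    toy_copy_lock(&target, &lease);
    return file.signum;
}
```

In this model toy_setlease(1, 35) returns 0, because the callback fires on the pre-initialised target and wipes the signal, while toy_setlease(0, 35) returns 35, which mirrors the effect of allocating the target with locks_alloc_lock() instead.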