I have a NetApp filer, which I use via CIFS from Linux. The client OS is Debian Etch with quite a few backports, in particular a custom kernel, KDE and OOo. The smbfs package is the one delivered with Etch: 3.0.24-6etch10. I'm using OOo 2.4.1.

With my previous kernel 2.6.24.1, locking wasn't working perfectly, but OOo showed an I/O error message if a second user opened the file - see examples below. Sometimes OOo even showed the document as read-only, which is the correct handling.

I want / need to switch to 2.6.26.3, but now opening the file from OOo on the CIFS share is very slow. The first testers thought that OOo had hung completely, but if one waits long enough (20 minutes, I guess), OOo starts up correctly and shows the document. The test document is only 6 kB - a single line of text.

As a workaround I tested:

1. a forward port from 2.6.24 to 2.6.26 (CIFS v1.52) - I just had to apply 4 patches:
   - cifs-no-iget-ce634ab28e7dbcc13ebe6e7bc5bc7de4f8def4c8.patch
   - cifs-pagecache-zeroing-eebd2aa355692afaf9906f62118620f1a1c19dbb.patch
   - cifs-remove-proc-root-fs-36a5aeb8787fbf92510ed20d806e229c55726f93.patch
   - cifs-sane-umount-begin-42faad99658eed7ca8bd328ffa4bcb7d78c9bcca.patch
   This compiles, but has the same problem.

2. a backport from 2.6.27-rc5-git6 (CIFS v1.54):
   - cifs-DFS-connects-inode-with-dfs-handling.patch
   - cifs-pass-path-to-do-add-mount.diff
   - cifs-drop-kmem-cache-argument-from-constructor.patch
   - cifs-sanitize-permission-prototype-patch
   - cifs-generic_llseek.diff
   - cifs-tryloc-page-rename.diff
   Same problem.

But it seems that the locking is working "correctly". If I open the document with the 2.6.26 kernel and switch to the other computer running 2.6.24, I get the OOo I/O error message. So my guess would be that some kernel semantics have changed, as the old module has the same problem with the new kernel.

There is also a strange OOo phenomenon:

P1: Open document - ok
P2: Open document - kind of fails - you get the OOo filter selection dialog. I guess OOo opens a zero size document - on selection you get the I/O error message.
P2: Open document again - see previous.

On the other hand:

P1: Open document - ok
P2: Open document - fails with filter dialog
P1: Close document
P2: Open document - ok
P1: Open document - ok, opened in read-only mode

When mounted I get this debug info:

    # cat /proc/fs/cifs/DebugData
    Display Internal CIFS Data Structures for Debugging
    ---------------------------------------------------
    CIFS Version 1.52
    Active VFS Requests: 0
    Servers:
    1) Name: 172.16.2.221 Domain: TVC.MUENCHEN.DE Mounts: 1 OS: Windows 5.0
       NOS: Windows 2000 LAN Manager Capability: 0xd3fd
       SMB session status: 1 TCP status: 1
       Local Users To Server: 1 SecMode: 0x3 Req On Wire: 0
       MIDs:
    Shares:
    1) \\netapp-fas250/\cifs Uses: 1 Type: NTFS DevInfo: 0x20 Attributes: 0x4000f
       PathComponentMax: 255 Status: 1 type: DISK

The module is compiled with the kernel config (make -C /usr/src/linux-headers-2.6.26-1-686 M=`pwd`).

All tested kernels have the following CIFS config:

    CONFIG_CIFS=m
    # CONFIG_CIFS_STATS is not set
    CONFIG_CIFS_WEAK_PW_HASH=y
    CONFIG_CIFS_XATTR=y
    CONFIG_CIFS_POSIX=y
    # CONFIG_CIFS_DEBUG2 is not set
    # CONFIG_CIFS_EXPERIMENTAL is not set

I also tried enabling CONFIG_CIFS_EXPERIMENTAL, but this didn't help or change anything.

strace'ing OOo didn't help me understand what's going on. soffice calls sched_yield a lot while blocked, and I can see some polls on the unix socket to X time out every few thousand yields. The log is more than 100 MB - I can send a compressed file or put it on some webspace, if needed.
From investigating the OOo strace, it seems that the process is actually waiting on a futex, and that wait always returns ETIMEDOUT.
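For a log of that size, a small script can make the pattern easier to see than reading it by hand. The following is only a rough sketch: it counts calls per syscall name and errno in strace text output. The sample lines are made up for illustration and are not taken from the actual log, and the regular expression assumes plain strace output with at most a PID prefix (no timestamps); split "unfinished"/"resumed" lines are simply skipped.

```python
import re
from collections import Counter

# Matches complete strace lines such as
#   1234 futex(0x8064c40, FUTEX_WAIT, 2, {0, 49999000}) = -1 ETIMEDOUT (Connection timed out)
# The PID prefix (as written by "strace -f") is optional.
LINE_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"   # optional "[pid N] " or "N " prefix
    r"(?P<name>[a-z_0-9]+)\("          # syscall name
    r".*\)\s+=\s+(?P<ret>\S+)"         # arguments and return value
    r"(?:\s+(?P<errno>E[A-Z0-9]+)\b)?"  # errno name, only present on failure
)


def summarize(lines):
    """Count (syscall, errno) pairs; "-" stands for "no errno reported"."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # unfinished/resumed lines, signals, exit messages
        counts[(m.group("name"), m.group("errno") or "-")] += 1
    return counts


def summarize_file(path):
    """Same as summarize(), reading the log line by line to keep memory use low."""
    with open(path, errors="replace") as fh:
        return summarize(fh)


# Illustrative sample only, not lines from the real log.
SAMPLE = [
    "1234 sched_yield()                      = 0",
    "1234 sched_yield()                      = 0",
    "1234 futex(0x8064c40, FUTEX_WAIT, 2, {0, 49999000}) = -1 ETIMEDOUT (Connection timed out)",
    "sched_yield()                           = 0",
    "poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)",
    "futex(0x8064c40, FUTEX_WAIT, 2, {0, 49999000}) = -1 ETIMEDOUT (Connection timed out)",
    "1234 futex(0x8064c40, FUTEX_WAIT, 2, NULL <unfinished ...>",
]

for (name, err), n in summarize(SAMPLE).most_common():
    print(f"{n:8d}  {name}  {err}")
```

Calling summarize_file() on the real log file would show whether futex/ETIMEDOUT and sched_yield really dominate the output.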
This is actually a problem with the combination of:

1. the on-access virus scanner via dazuko
2. OOo locking

If I disable dazuko via the kernel command line, locking is fast again. The strange OOo lock handling still doesn't change, but compared to the FS stalls it's a minor problem.
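To compare the two configurations without involving OOo, one could time a plain open-and-lock on a file on the share, once with dazuko enabled and once with it disabled. This is a minimal sketch under assumptions: it uses POSIX byte-range locks via fcntl, which may or may not be what OOo does internally, and any path on the CIFS mount passed to it (for example /mnt/cifs/test.odt) is hypothetical. The demo at the end only exercises a local temporary file.

```python
import fcntl
import os
import tempfile
import time


def time_open_and_lock(path):
    """Open path read-write and try a non-blocking exclusive lock.

    Returns (elapsed_seconds, got_lock). The lock is released again
    before returning, so the file is left untouched.
    """
    start = time.monotonic()
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            # Assumption: a whole-file POSIX lock, not necessarily OOo's scheme.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            got_lock = False  # someone else holds a conflicting lock
        else:
            got_lock = True
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
    return time.monotonic() - start, got_lock


if __name__ == "__main__":
    # Local demo; on the real setup, pass a file on the CIFS mount instead.
    with tempfile.NamedTemporaryFile() as tmp:
        elapsed, got_lock = time_open_and_lock(tmp.name)
        print(f"open+lock took {elapsed:.3f}s, lock acquired: {got_lock}")
```

Running it from a second client while the first one holds the document open should also show whether the lock conflict is reported quickly or only after a long stall.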