Bug 6150 - corrupted locking of MS Office documents
Status: RESOLVED WORKSFORME
Product: Samba 3.2
Classification: Unclassified
Component: File services
Version: 3.2.5
Hardware: x86 Linux
Importance: P3 normal
Target Milestone: ---
Assigned To: Volker Lendecke
QA Contact: Samba QA Contact
Depends on:
Blocks:
Reported: 2009-03-02 06:18 UTC by Martin Povolny
Modified: 2009-08-24 01:27 UTC (History)

Description Martin Povolny 2009-03-02 06:18:09 UTC
Hello,

After upgrading Samba from 3.0.10 on Debian sarge to 3.0.26 on Debian etch, we have been experiencing problems that are probably related to locking.

Several times a day, Samba reports to a user that a file is locked.

The problem occurs with DOC and XLS files on one share, most of the time with a single XLS file (the one that is used most frequently). The users are running Excel 2002, 2003 and 2007. The user reporting the most problems has Excel 2003, but he is also the one who works with the share the most.

The file is reported as open even though the user closed it some time ago. Or, in the less serious variant, the file and the shares take ages to open (1-5 minutes).

The smbstatus output shows garbage characters and data:

6625         32048884   DENY_NONE  0x100001    RDONLY     NONE
   ��      Tue Jan 13 04:16:18 1970
and these lines:
PANIC: assert failed at locking/locking.c(996): num_props <= 1
0            648371     unknown-please report ! e->share_access = 0x5d5,
e->private_options = 0x0
0x6c        RDONLY     EXCLUSIVE+BATCH        Thu Jan  1 01:00:02 1970

The problem can be temporarily solved by stopping Samba and removing locking.tdb.

This is what we see in the log file during the long wait:

[2009/02/23 16:54:39, 5, effective(10020, 4), real(10020, 0)]
locking/posix.c:release_posix_lock_windows_flavour(1200)
  release_posix_lock_windows_flavour: Real unlock: offset = 2147483579,
count = 20
[2009/02/23 16:54:39, 3, effective(10020, 4), real(10020, 0)]
smbd/reply.c:reply_lockingX(5704)
  lockingX fnum=6979 type=16 num_locks=0 num_ulocks=1
[2009/02/23 16:54:39, 5, effective(10020, 4), real(10020, 0)]
locking/posix.c:set_posix_lock_windows_flavour(984)
  set_posix_lock_windows_flavour: File Evidence vydane dokumentace.xls,
offset = 2147483539, count = 1, type = WRITE
[2009/02/23 16:54:39, 5, effective(10020, 4), real(10020, 0)]
locking/posix.c:set_posix_lock_windows_flavour(1062)
  set_posix_lock_windows_flavour: Real lock: Type = READ: offset =
2147483539, count = 1
[2009/02/23 16:54:39, 3, effective(10020, 4), real(10020, 0)]
lib/util.c:fcntl_lock(2004)
  fcntl_lock: lock failed at offset 2147483539 count 1 op 13 type 0
(Resource temporarily unavailable)
[2009/02/23 16:54:39, 5, effective(10020, 4), real(10020, 0)]
locking/posix.c:set_posix_lock_windows_flavour(1067)
  set_posix_lock_windows_flavour: Lock fail !: Type = READ: offset =
2147483539, count = 1. Errno = Resource temporarily unavailable
[2009/02/23 16:54:39, 3, effective(10020, 4), real(10020, 0)]
smbd/blocking.c:push_blocking_lock_request(172)
  push_blocking_lock_request: lock request length=75 blocked with expiry
time (1235404479 sec. 818392 usec) (+200 msec) for fnum = 6979,
name = Evidence vydane dokumentace.xls

We tried turning off oplocks for XLS and then for all files on the share. It did not help.
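For reference, the two oplock experiments described above would typically look something like the following in smb.conf. This is only a sketch: the share name and path are hypothetical, and the reporters' actual configuration is not shown in this report.

```
[documents]
    path = /srv/samba/documents
    ; first attempt: never grant oplocks on Excel files
    veto oplock files = /*.xls/
    ; second attempt: disable oplocks for every file on the share
    oplocks = no
    level2 oplocks = no
```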

After these problems we upgraded to 3.2.5, but the problems are the same. Unfortunately, downgrading to 3.0.10, where we did not experience these problems, is not possible due to other software on the server.

All of this happens with only one share, so as a temporary workaround we moved that share to a Windows machine, leaving the rest of the shares untouched.

As a result, the machine now operates normally without any further problems, and the smbstatus output looks OK as well.

Any help is highly welcome.
Comment 1 Volker Lendecke 2009-03-12 11:10:37 UTC
Can you do a tdbbackup or directly copy away such a locking.tdb?

Just to change something in tdb access, can you try "use mmap = no"? I doubt this will help, but right now I have no clue what this might be, so anything that narrows down the error space might give hints.
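As a minimal sketch of that setting: "use mmap" is a global smb.conf parameter, so it would go in the [global] section, and smbd would presumably need to be restarted for it to take effect. The rest of the reporters' smb.conf is not shown in this report.

```
[global]
    ; access tdb files with read/write calls instead of mmap
    use mmap = no
```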

Volker
Comment 2 Petr Jurasek 2009-03-16 07:33:17 UTC
Hi,

A corrupted locking.tdb from Samba 3.2.5 is at:
http://www.solnet.cz/locking.tdb.3.2.5
The file is too large to attach to Bugzilla.

I'll try "use mmap = no", thanks for the idea.
Comment 3 Petr Jurasek 2009-05-13 04:55:21 UTC
(In reply to comment #2)
> corrupted locking.tdb from samba 3.2.5 is at:
> http://www.solnet.cz/locking.tdb.3.2.5
> file is to large to attach to bugzilla.
> 
> I'll try "use mmap = no", thx for idea.

Hi,

With this option set, we have not seen this bug again. The bug occurred randomly, so I waited a while before reporting back.

Thanks for your time and solution.
Comment 4 Volker Lendecke 2009-05-24 14:41:06 UTC
Did it happen again, or did "use mmap = no" really help? If it did help, what is your exact hardware/kernel/software stack? Plain debian etch on a standard X86 uniprocessor box? Or anything fancy?

Volker
Comment 5 Martin Povolny 2009-07-29 09:01:15 UTC
Sorry for the delay.

Yes, "use mmap = no" really did help.

We are running Debian lenny with kernel 2.6.26-bpo.1-686 from backports.org, i386; the CPU is an Intel Core 2 Duo.
Comment 6 Martin Povolny 2009-07-29 09:02:11 UTC
(In reply to comment #5)
> sorry for the delay
> 
> yes, "use mmap = no" did really help.
> 
> we are running debian lenny
                        ^^^^^^

etch, sorry

Comment 7 Volker Lendecke 2009-08-24 01:27:27 UTC
Closing this bug as WORKSFORME. I don't really see a workable way to diagnose this remotely.

Sorry,

Volker