Bug 3587 - Locking problem
Status: CLOSED FIXED
Product: Samba 3.0
Classification: Unclassified
Component: File Services
Version: 3.0.21c
Hardware: x64 Linux
Importance: P1 critical
Target Milestone: 3.0.23
Assigned To: Jeremy Allison
QA Contact: Samba QA Contact
Reported: 2006-03-06 10:03 UTC by Daniel Beschorner
Modified: 2006-06-29 04:23 UTC

Attachments
log when opening failed (47.37 KB, text/plain)
2006-03-06 10:04 UTC, Daniel Beschorner
Patch for 3.0.23. (4.02 KB, patch)
2006-05-02 21:10 UTC, Jeremy Allison
log file 3.022 and 3.023rc2 (241.39 KB, text/plain)
2006-06-14 05:18 UTC, Johan Meiring
Proposed patch... (782 bytes, patch)
2006-06-14 05:44 UTC, Volker Lendecke
Patch for 3.0.22 (1.83 KB, patch)
2006-06-27 04:45 UTC, Daniel Beschorner

Description Daniel Beschorner 2006-03-06 10:03:34 UTC
I don't know whether this is a Samba bug, or whether the log can actually shed light on it.
I couldn't copy my Outlook PST file under XP (the error was "other process has opened it"), but there were no Linux locks (lsof) and apparently no Samba locks (smbstatus).
After I copied the file under Linux, the copy worked fine.
Comment 1 Daniel Beschorner 2006-03-06 10:04:36 UTC
Created attachment 1777
log when opening failed
Comment 2 Daniel Beschorner 2006-05-02 05:25:39 UTC
We were able to reproduce it once more. The output of smbstatus -b is interesting:

Byte range locks:
   Pid     dev:inode  R/W      start        size
------------------------------------------------
...
   25240   00904:592404f    W        480           1
   25240   00904:592404f    W        481           1
...

The Pid doesn't exist anymore in the normal smbstatus output, but the byte range locks are still there.
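
The check described above (lock entries whose owning PID no longer exists) can be sketched in Python. This is only an illustration and not part of Samba: it assumes the `smbstatus -B` column layout shown in this comment, and `parse_byte_range_locks`, `pid_alive` and `stale_locks` are hypothetical helper names.

```python
import os
import re

# Matches one lock row in the layout shown above:
#   Pid   dev:inode   R/W   start   size
# Assumption: dev and inode are printed in hex, the other numbers in decimal.
LOCK_ROW = re.compile(
    r"^\s*(\d+)\s+([0-9a-fA-F]+):([0-9a-fA-F]+)\s+([RW])\s+(\d+)\s+(\d+)\s*$"
)


def parse_byte_range_locks(text):
    """Return one dict per lock row; header, separator and '...' lines are skipped."""
    locks = []
    for line in text.splitlines():
        m = LOCK_ROW.match(line)
        if m is None:
            continue
        pid, dev, inode, mode, start, size = m.groups()
        locks.append({
            "pid": int(pid),
            "dev": dev.lower(),
            "inode": inode.lower(),
            "mode": mode,
            "start": int(start),
            "size": int(size),
        })
    return locks


def pid_alive(pid):
    """True if a process with this PID exists (POSIX only; signal 0 just probes)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stale_locks(text):
    """Lock rows whose owning process is gone, i.e. the symptom reported here."""
    return [lock for lock in parse_byte_range_locks(text) if not pid_alive(lock["pid"])]
```

The `text` argument would be the captured output of `smbstatus -B` on the server; any row returned by `stale_locks` corresponds to a byte range lock left behind by a process that has already exited.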
Comment 3 Daniel Beschorner 2006-05-02 05:33:25 UTC
The command is "smbstatus -B".
Comment 4 Jeremy Allison 2006-05-02 08:45:53 UTC
Ok, I see the problem. I will have a fix checked into 3.0.23 for this later today. Do you need a fix back-ported to 3.0.22? Jerry, this is a show-stopper for 3.0.23.
Jeremy.
Comment 5 Gerald (Jerry) Carter 2006-05-02 08:49:25 UTC
setting milestone
Comment 6 Daniel Beschorner 2006-05-02 09:29:42 UTC
Speaking for myself, I can live with a fix in 3.0.23, but others might not be able to.
I don't know how long the bug has been hidden in 3.0.x or how critical it is, but we hit it only occasionally.
Comment 7 Jeremy Allison 2006-05-02 21:10:39 UTC
Created attachment 1886
Patch for 3.0.23.

Should fix both smbstatus -b and smbd running.
Jeremy.
Comment 8 Gerald (Jerry) Carter 2006-05-19 09:45:40 UTC
Fixed by jra.
Comment 9 Johan Meiring 2006-06-13 14:43:51 UTC
(In reply to comment #7)
> Created an attachment (id=1886)
> Patch for 3.0.23.
> 
> Should fix both smbstatus -b and smbd running.
> Jeremy.
> 

I have also run into this bug, and it is affecting my clients pretty severely.  (If there is a power failure while Outlook is open, you cannot access your PST file again until you make a copy of it under Unix, and we are having severe power issues at the moment.)

I tried to apply the 3.0.23 patch to 3.0.22.  It does not apply cleanly.  I saw that 3.0.23rc2 is out, but I am unsure whether I should run it (3-4 production servers).

If at all possible, I would appreciate a backport to 3.0.22.
Comment 10 Volker Lendecke 2006-06-13 14:49:07 UTC
> I have also run into this bug, and it is affecting my clients pretty severely.
> (If there is a power failure while Outlook is open, you cannot access your PST
> file again until you make a copy of it under Unix, and we are having severe
> power issues at the moment.)
> 
> I tried to apply the 3.0.23 patch to 3.0.22.  It does not apply cleanly.  I saw
> that 3.0.23rc2 is out, but I am unsure whether I should run it (3-4 production
> servers).
> 
> If at all possible, I would appreciate a backport to 3.0.22.
> 

The problem you describe might actually be different. Could you try

reset on zero vc = yes

in the [global] section?

Thanks,

Volker
Comment 11 Johan Meiring 2006-06-13 14:57:07 UTC
> 
> The problem you describe might actually be different. Could you try
> 
> reset on zero vc = yes
> 
> in the [global] section?
> 

I am prepared to try it at one of the clients tomorrow, but I don't quite agree with the assessment.

After the client came back and Outlook could not open its PST file,
I confirmed the following:
- smbstatus did not show the old samba process or locked pst file.
- checked the inode of the file using ls -i
- smbstatus -B listed the inode (converted to hex of course)
- ps -ef confirms that the pid listed above (with smbstatus -B) does not exist.

Johan
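
The inode comparison in the list above (decimal from `ls -i`, hex in `smbstatus -B`) can be sketched in Python. This is only an illustration under the assumption, taken from this comment, that the inode part of the `dev:inode` field is printed in hex; `inode_hex` and `inode_matches` are hypothetical helper names. `os.stat(path).st_ino` is the same number that `ls -i` prints.

```python
import os


def inode_hex(path):
    """Inode number of `path` as lowercase hex without a '0x' prefix."""
    return format(os.stat(path).st_ino, "x")


def inode_matches(path, lock_inode_hex):
    """Compare a file's inode with the inode part of a 'dev:inode' lock field."""
    # Compare as integers so that leading zeros and letter case do not matter.
    return os.stat(path).st_ino == int(lock_inode_hex, 16)
```

For a lock row such as `00904:592404f`, the value to pass as `lock_inode_hex` would be `592404f`, together with the path of the PST file on the server.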
Comment 12 Volker Lendecke 2006-06-13 15:02:03 UTC
> I am prepared to try it at one of the clients tomorrow, but I don't quite agree
> with the assessment.
> 
> After the client came back and Outlook could not open its PST file,
> I confirmed the following:
> - smbstatus did not show the old samba process or locked pst file.
> - checked the inode of the file using ls -i
> - smbstatus -B listed the inode (converted to hex of course)
> - ps -ef confirms that the pid listed above (with smbstatus -B) does not exist.

Ok, if that smbd was really gone, then you're right. It might be worth a try nevertheless, because with this option smbd might exercise a different exit path.

Volker
Comment 13 Gerald (Jerry) Carter 2006-06-13 15:10:49 UTC
> I have also run into this bug, and it is affecting my clients 
> pretty severely.  (If there is a power failure while Outlook is 
> open, you cannot access your PST file again until you make a copy 
> of it under Unix, and we are having severe power issues at the moment.)

Are you sure this is not just a corrupt PST file?  If you have 
a power outage, it's likely that things will be in an odd state
anyway.  I'm pretty sure the original bug has been fixed.  
Comment 14 Johan Meiring 2006-06-13 16:30:35 UTC
(In reply to comment #13)
> Are you sure this is not just a corrupt PST file?  If you have 
> a power outage, it's likely that things will be in an odd state
> anyway.  I'm pretty sure the original bug has been fixed.  
> 

I think I am being misunderstood here.  I am experiencing the same problem as described by the original bug poster.

The problem and its symptoms were confirmed on more than one workstation, at more than one site, with more than one application (Outlook as well as QuickBooks).  In ALL cases smbstatus -B listed byte range locks for smb processes that did not exist anymore.  Because of the many power failures I started seeing a pattern and am now able to reproduce the symptoms.

I am simply requesting a patch for Samba 3.0.22 as "offered" by Jeremy in comment #4.  (The patch uploaded by Jeremy is for 3.0.23, which is not in production yet, and it does not apply cleanly to 3.0.22.)

Johan



Comment 15 Jeremy Allison 2006-06-13 16:48:52 UTC
I'd appreciate you trying 3.0.23RC2 first. We're trying to stabilize this code and get it released, and more testing would definitely help.
Jeremy.
Comment 16 Johan Meiring 2006-06-14 01:43:08 UTC
(In reply to comment #15)
> I'd appreciate you trying 3.0.23RC2 first. We're trying to stabilize this code
> and get it released, and more testing would definitely help.
> Jeremy.
> 

Will do today and provide some feedback.
Comment 17 Johan Meiring 2006-06-14 05:18:13 UTC
Created attachment 1962
log file 3.022 and 3.023rc2

I installed 3.0.23rc2 as requested.  I will only be able to tell whether the bug is fixed after some time has passed.

The new release has started "core dumping" on me though.  Log file attached.  
The upgrade was done at "2006/06/14 11:41:01".
Since then there has been a core dump approximately every minute.
Apart from that it seems to work.
Comment 18 Volker Lendecke 2006-06-14 05:44:48 UTC
Created attachment 1963
Proposed patch...

Just yesterday I fixed a bug found by Klocwork that might help with our segfaults.

Can you try this patch?

Volker
Comment 19 Johan Meiring 2006-06-14 06:48:39 UTC
(In reply to comment #18)
> 
> Just yesterday I fixed a bug found by Klocwork that might help with our
> segfaults.
> 
> Can you try this patch?
> 
> Volker
> 
I installed 3.0.23rc2 with the patch at a second client.  No PANICs.
I cannot restart the first client's Samba (the one with the PANICs) again today, as it is a production server.  I will restart Samba tonight and monitor it tomorrow.  (It is about 1:45pm here.)

It seems that the PANIC problem is resolved, but I will report back tomorrow (on both the PANIC and the byte range lock).

Johan
Comment 20 Johan Meiring 2006-06-21 05:14:54 UTC
(In reply to comment #19)
> (In reply to comment #18)
> > 
> > Just yesterday I fixed a bug found by Klocwork that might help with our
> > segfaults.
> > 
> Seems that problem (PANIC) is resolved, but will report back tomorrow (on both
> PANIC and byte range lock).
> 

I am happy to report that 3.0.23rc fixes the byte range locking issue.  (Outlook can now re-open its PST after a crash).
The patch supplied above also fixes the segfault in 3.0.23rc2.

Thanks all!!!
(I dare M$ to provide such efficient and fast support)
Comment 21 Volker Lendecke 2006-06-21 05:20:27 UTC
Okay -- now we have a problem..... ;-)

The error I fixed for Klocwork was in an error path that was only exercised when secrets.tdb contained invalid data. How did the data get there in the first place???

But this is nothing that should bother you (the bug reporter...) :-)

Volker
Comment 22 Daniel Beschorner 2006-06-23 09:48:01 UTC
Jeremy,

if I strip the first and last parts of the 3.0.23 patch, is it usable for 21c/22?
I need it because we are hitting the bug more often and 3.0.23 isn't yet ready for our production environment.

Daniel
Comment 23 Daniel Beschorner 2006-06-27 04:45:58 UTC
Created attachment 1990
Patch for 3.0.22

If I use the attached patch in 3.0.22, I get segmentation faults or "free(): invalid pointer: 0x08166d80" on some boxes while running smbstatus -B.

Is it wrong or insufficient to adapt it this way for 3.0.22?
Comment 24 Jeremy Allison 2006-06-27 10:36:41 UTC
Please try out 3.0.23RC3. We're trying to get this ready for release and it'll be more stable than 3.0.22 with a hand-ported patch. I don't have time to prepare a custom version of this patch for you, sorry.

Jeremy.
Comment 25 Daniel Beschorner 2006-06-27 11:00:46 UTC
I should have done so, but it's something of a major release, without clear guidelines (yet) on how to handle the SID mapping change etc.
But you're right, an RC3 is worth production testing, and I can see how our other issues (kernel oplocks) behave under RC3.

Daniel
Comment 26 Daniel Beschorner 2006-06-29 04:23:25 UTC
Looks good for 3.0.23RC3 at first glance.

Daniel