Bug 11761 - Lock order violation, smbd dumps core
Summary: Lock order violation, smbd dumps core
Status: NEW
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.1.20
Hardware: All FreeBSD
: P5 normal (vote)
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-02-29 16:22 UTC by Jakub Klama (mail address dead)
Modified: 2018-12-18 22:43 UTC (History)
4 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Jakub Klama (mail address dead) 2016-02-29 16:22:28 UTC
Feb 28 22:22:02 freenas smbd[92728]: [2016/02/28 22:22:02.001781,  0] ../lib/dbwrap/dbwrap.c:193(dbwrap_check_lock_order)
Feb 28 22:22:02 freenas smbd[92728]:   Lock order violation: Trying /var/db/samba4/file_ntacls.tdb at 1 while /var/db/samba4/locking.tdb at 1 is locked
Feb 28 22:22:02 freenas smbd[92728]: [2016/02/28 22:22:02.001927,  0] ../lib/dbwrap/dbwrap.c:133(debug_lock_order)
Feb 28 22:22:02 freenas smbd[92728]:   lock order:  1:/var/db/samba4/locking.tdb 2:<none> 3:<none>
Feb 28 22:22:02 freenas smbd[92728]: [2016/02/28 22:22:02.001982,  0] ../source3/lib/util.c:785(smb_panic_s3)
Feb 28 22:22:02 freenas smbd[92728]:   PANIC (pid 92728): invalid lock_order
Feb 28 22:22:02 freenas smbd[92728]: [2016/02/28 22:22:02.002163,  0] ../lib/util/fault.c:72(fault_report)
Feb 28 22:22:02 freenas smbd[92728]:   ===============================================================
Feb 28 22:22:02 freenas smbd[92728]: [2016/02/28 22:22:02.002228,  0] ../lib/util/fault.c:73(fault_report)
Feb 28 22:22:02 freenas smbd[92728]:   INTERNAL ERROR: Signal 10 in pid 92728 (4.1.21)
Feb 28 22:22:02 freenas smbd[92728]:   Please read the Trouble-Shooting section of the Samba HOWTO
Feb 28 22:22:02 freenas smbd[92728]: [2016/02/28 22:22:02.002299,  0] ../lib/util/fault.c:75(fault_report)
Feb 28 22:22:02 freenas smbd[92728]:   ===============================================================
Feb 28 22:22:02 freenas smbd[92728]: [2016/02/28 22:22:02.002356,  0] ../source3/lib/util.c:785(smb_panic_s3)
Feb 28 22:22:02 freenas smbd[92728]:   PANIC (pid 92728): internal error
Feb 28 22:22:02 freenas smbd[92728]: [2016/02/28 22:22:02.002616,  0] ../source3/lib/util.c:896(log_stack_trace)
Feb 28 22:22:02 freenas smbd[92728]:   BACKTRACE: 5 stack frames:
Feb 28 22:22:02 freenas smbd[92728]:    #0 0x80207920c <smb_panic_s3+111> at /usr/local/lib/libsmbconf.so.0
Feb 28 22:22:02 freenas smbd[92728]:    #1 0x80084716f <smb_panic+40> at /usr/local/lib/libsamba-util.so.0
Feb 28 22:22:02 freenas smbd[92728]:    #2 0x800846e7c <fault_setup_disable+410> at /usr/local/lib/libsamba-util.so.0
Feb 28 22:22:02 freenas smbd[92728]:    #3 0x800846e91 <fault_setup_disable+431> at /usr/local/lib/libsamba-util.so.0
Feb 28 22:22:02 freenas smbd[92728]:    #4 0x803884236 <pthread_sigmask+918> at /lib/libthr.so.3
Feb 28 22:22:02 freenas smbd[92728]: [2016/02/28 22:22:02.002789,  0] ../source3/lib/util.c:797(smb_panic_s3)
Feb 28 22:22:02 freenas smbd[92728]:   smb_panic(): calling panic action [/usr/local/libexec/samba/samba-backtrace]
Feb 28 22:22:02 freenas smbd[92728]: [2016/02/28 22:22:02.062068,  0] ../source3/lib/util.c:805(smb_panic_s3)
Feb 28 22:22:02 freenas smbd[92728]:   smb_panic(): action returned status 2
Feb 28 22:22:02 freenas smbd[92728]: [2016/02/28 22:22:02.062623,  0] ../source3/lib/dumpcore.c:317(dump_core)
Feb 28 22:22:02 freenas smbd[92728]:   dumping core in /var/db/system/cores
Feb 28 22:22:02 freenas smbd[92728]: 
Feb 28 22:22:02 freenas kernel: pid 92728 (smbd), uid 0: exited on signal 6 (core dumped)
Comment 1 Volker Lendecke 2016-02-29 20:19:47 UTC
Can you try setting DBWRAP_LOCK_ORDER_1 to DBWRAP_LOCK_ORDER_2 in source3/modules/vfs_acl_tdb.c line 63?
Comment 2 Volker Lendecke 2016-02-29 20:20:23 UTC
BTW, you have ZFS with NFSv4 ACLs, why do you use this module?
Comment 3 Jakub Klama (mail address dead) 2016-04-02 13:56:10 UTC
Which module?
Comment 4 Volker Lendecke 2016-04-04 08:47:13 UTC
(In reply to Jakub Klama from comment #3)
> Which module?

acl_xattr
Comment 5 Michael Adam 2016-06-19 22:26:21 UTC
(In reply to Volker Lendecke from comment #4)
> (In reply to Jakub Klama from comment #3)
> > Which module?
> 
> acl_xattr

rather acl_tdb ?

It would be good to see the full smb.conf...
Comment 6 Christophuzzy R. Hertel 2017-03-29 01:15:45 UTC
I have reproduced this issue in the following Samba releases: 4.5.5, 4.5.7, 4.6.1.

The problem is decidedly connected to the acl_tdb module.

I can share my config, if you like, but I think that all that is needed is to load the acl_tdb module.  I'm running on Linux, compiling and installing from source.

(In reply to Volker Lendecke from comment #1)
I tested 4.6.1 as-is and, as expected, generated the crash by deleting a file.  With your proposed change (passing DBWRAP_LOCK_ORDER_2), my simple formula *no longer causes a process crash*.  So, it looks as though Volker's proposed fix could be the solution.

(In reply to Volker Lendecke from comment #2)
In my case, I am testing against a new file system that does not support EAs, ACLs, inotify, fcntl() locking, flock() locking, or kernel OpLocks.  (It does support a dollop of whipped cream with a maraschino cherry on top...so, we're good there.)

In my testing, I needed to know whether the bug was related to the new filesystem or not, so I ran the same test (same configuration) on an EXT4 filesystem and generated the same crash.  Using the file system's EA and ACL support, as was suggested in comment #2, would simply hide the bug.

...but who doesn't love a good game of hide the bug?
Comment 7 Max Vlasov 2018-11-21 10:26:52 UTC
I reported an issue related to samba/acl_tdb at FreeNas tracker (https://redmine.ixsystems.com/issues/58128), the developer there pointed possibly to this report when I asked about reporting to samba tracker directly. 

I suspect that my findings might be related to this samba issue since the similarities are
- "Lock order Violation", 
- file delete involved 
- acl_tdb module involved 

I can quote some reproducible actions. 

In my case the actions on windows client side are
- Open File1 for reading
- Delete File2
- Open(Create)File2
- Close File2
- Try To Read File1. This operation will fail. At this moment the samba lock contains lines starting with a "Lock order violation"

The code in the client that reproduces it (pascal but minor changes are required for example for c)

  AReadHandle := CreateFile(File1Path, GENERIC_READ,0, nil, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  if AReadHandle > 0 then
  begin
    DeleteFile(File2Path);

    AWriteHandle := CreateFile(File2Path, GENERIC_READ or GENERIC_WRITE,0, nil, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, 0);
    if AWriteHandle > 0 then CloseHandle(AWriteHandle);

    if ReadFile(AReadHandle, Buf, SizeOf(Buf), ARead, nil) then
    begin
      // Ok on windows network shares, samba shares with acl_xattr
    end
    else
    begin
      // Failed on samba shares with acl_tdb VFS module
    end;
  end;
  if AReadHandle> 0 then CloseHandle(AReadHandle);