Bug 9876 - crash in file_find_dif
Summary: crash in file_find_dif
Status: NEW
Alias: None
Product: Samba 4.0
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.0.3
Hardware: x64 Linux
: P5 major (vote)
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-09 14:02 UTC by Dustin Ward
Modified: 2016-02-12 09:19 UTC (History)
4 users (show)

See Also:


Attachments
Config, Log, and Backtrace (12.58 KB, text/plain)
2013-05-09 14:02 UTC, Dustin Ward
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Dustin Ward 2013-05-09 14:02:20 UTC
Created attachment 8875 [details]
Config, Log, and Backtrace

Periodically we've seen the following crash when adobe indesign is accessing files over a samba share. 

[2013/05/07 16:33:23.991511,  0] smbd/files.c:330(file_find_dif)
  file_find_dif: file resources/foo.png file_id = 17:15089d:0, gen = 137814652 oplock_type = 2 is a stat open with oplock type !
[2013/05/07 16:33:23.991746,  0] lib/util.c:810(smb_panic_s3)
  PANIC (pid 20692): file_find_dif
[2013/05/07 16:33:23.997257,  0] lib/util.c:921(log_stack_trace)
  BACKTRACE: 20 stack frames:
   #0 smbd(log_stack_trace+0x1c) [0x7fd19e07a2dc]
   #1 smbd(smb_panic_s3+0x2b) [0x7fd19e07a3ab]
   #2 smbd(smb_panic+0x3f) [0x7fd19e06c42f]
   #3 smbd(file_find_dif+0x141) [0x7fd19dd3bc61]
   #4 smbd [0x7fd19e0c3bf6]
   #5 smbd [0x7fd19e0c4433]
   #6 smbd(messaging_dispatch_rec+0x5d) [0x7fd19e0592dd]
   #7 smbd [0x7fd19e05ada1]
   #8 smbd(tevent_common_check_signal+0x264) [0x7fd19e08ad54]
   #9 smbd(run_events_poll+0x26) [0x7fd19e088936]
   #10 smbd [0x7fd19e089022]
   #11 smbd(_tevent_loop_once+0x90) [0x7fd19e089400]
   #12 smbd(smbd_process+0xada) [0x7fd19ddad16a]
   #13 smbd [0x7fd19e31b35b]
   #14 smbd(run_events_poll+0x338) [0x7fd19e088c48]
   #15 smbd [0x7fd19e08909a]
   #16 smbd(_tevent_loop_once+0x90) [0x7fd19e089400]
   #17 smbd(main+0x176e) [0x7fd19e31cf1e]
   #18 /lib64/libc.so.6(__libc_start_main+0xf4) [0x7fd19ac29994]
   #19 smbd [0x7fd19dd1bbd9]

No error is shown on the client as it also crashes indesign.
I've attached a backtrace and a copy of our config. 

This problem started occurring once we upgraded from Samba 3.0.33 to Samba 4.0.x. 

Please let me know if there is any more information I can provide.
Comment 1 Jeremy Allison 2013-05-09 17:45:30 UTC
Can you get a backtrace with symbols please ?

Do at "bt full" from gdb.

Also do you have a good way of reproducing ? A network capture trace and complete debug level 10 log would also help.

Thanks !

Jeremy.
Comment 2 Dustin Ward 2013-05-09 20:37:47 UTC
A bt full with symbols is in the attachment. I'm not able to reproduce it yet. It doesn't happen very often. I'll also see if I can get debug output.
Comment 3 Hemanth 2014-11-04 01:20:14 UTC
Found same issue at one of customers with samba 3.6.12+ stack. Here is the bt.

(gdb) bt
#0  0x0000000802cfaffc in thr_kill () from /lib/libc.so.7
#1  0x0000000802d9658b in abort () from /lib/libc.so.7
#2  0x0000000000793b51 in dump_core () at lib/fault.c:414
#3  0x00000000007a39ef in smb_panic (why=<optimized out>) at lib/util.c:1133
#4  0x00000000004b3cf1 in file_find_dif (sconn=0x805d26110, id=..., gen_id=2448996612) at smbd/files.c:323
#5  0x00000000007e3ff4 in initial_break_processing (sconn=0x803711350, id=..., file_id=2448996612) at smbd/oplock.c:284
#6  0x00000000007e49bd in process_oplock_break_message (msg_ctx=0x8037380d0, private_data=<optimized out>, msg_type=<optimized out>, src=..., data=0x805f5eef8) at smbd/oplock.c:516
#7  0x00000000007856b9 in messaging_dispatch_rec (msg_ctx=0x18b26, rec=0x805f5eed0) at lib/messages.c:376
#8  0x00000000007874e1 in message_dispatch (msg_ctx=0x8037380d0) at lib/messages_local.c:516
#9  messaging_tdb_signal_handler (ev_ctx=<optimized out>, se=<optimized out>, signum=<optimized out>, count=<optimized out>, _info=<optimized out>, private_data=<optimized out>)
    at lib/messages_local.c:77
#10 0x00000000007b3c84 in tevent_common_check_signal (ev=0x80370e110) at ../lib/tevent/tevent_signal.c:395
#11 0x00000000007b1819 in run_events_poll (ev=0x18b26, pollrtn=0, pfds=0x0, num_pfds=0) at lib/events.c:192
#12 0x0000000000521dfd in smbd_server_connection_loop_once (conn=<optimized out>) at smbd/process.c:999
#13 smbd_process (sconn=0x803711350) at smbd/process.c:3172
#14 0x0000000000a01a26 in smbd_accept_connection (ev=<optimized out>, fde=<optimized out>, flags=<optimized out>, private_data=<optimized out>) at smbd/server.c:671
#15 0x00000000007b1b51 in run_events_poll (ev=0x80370e110, pollrtn=<optimized out>, pfds=0x803710090, num_pfds=7) at lib/events.c:286
#16 0x00000000007b1fbf in s3_event_loop_once (ev=0x80370e110, location=<optimized out>) at lib/events.c:349
#17 0x00000000007b2371 in _tevent_loop_once (ev=0x80370e110, location=0xc0c429 "smbd/server.c:977") at ../lib/tevent/tevent.c:494
#18 0x0000000000a0374e in smbd_parent_loop (parent=<optimized out>) at smbd/server.c:977
#19 main (argc=<optimized out>, argv=<optimized out>) at smbd/server.c:1471

Line numbers may not be matching to 3.6.12 sources as we have taken few critical/security patches in between.

(gdb) f 4
#4  0x00000000004b3cf1 in file_find_dif (sconn=0x805d26110, id=..., gen_id=2448996612) at smbd/files.c:323
323     smbd/files.c: No such file or directory.
(gdb) info locals
count = 2
fsp = 0x805e3b950
__FUNCTION__ = "file_find_dif"
(gdb) p *fsp
$1 = {next = 0x805e3b410, prev = 0x806091c50, fnum = 35956, conn = 0x80375ec50, fh = 0x8037705d0, num_smb_operations = 0, file_id = {devid = 3010720835, inode = 2403227739, extid = 0},
  initial_allocation_size = 0, mode = 33279, file_pid = 65279, vuid = 40169, wcp = 0x0, open_time = {tv_sec = 1412709054, tv_usec = 298804}, access_mask = 1179785, share_access = 3,
  requested_disposition = 1, update_write_time_triggered = false, update_write_time_event = 0x0, update_write_time_on_close = false, close_write_time = {tv_sec = 0, tv_nsec = 0},
  write_time_forced = false, oplock_type = 2, sent_oplock_break = 0, oplock_timeout = 0x0, last_lock_failure = {context = {smblctx = 0, tid = 0, pid = {pid = 0, vnn = 0, unique_id = 0}},
    start = 0, size = 0, fnum = 0, lock_type = READ_LOCK, lock_flav = WINDOWS_LOCK}, current_lock_count = 0, pending_break_messages = 0x0, num_pending_break_messages = 0, open_action = 1,
  can_lock = true, can_read = true, can_write = false, modified = false, is_directory = false, aio_write_behind = false, lockdb_clean = true, initial_delete_on_close = false,
  delete_on_close = false, posix_open = false, is_sparse = false, is_rofs = false, fsp_name = 0x805ec3150, name_hash = 1983912272, vfs_extension = 0x0, fake_file_handle = 0x0,
  notify = 0x0, base_fsp = 0x0, brlock_seqnum = 23245, brlock_rec = 0x805d1d1d0, dptr = 0x0, print_file = 0x0}
(gdb) p fsp->fh
$2 = (struct fd_handle *) 0x8037705d0
(gdb) p *fsp->fh
$3 = {ref_count = 1, fd = -1, position_information = 75, pos = 75, private_options = 0, gen_id = 2448996612}
(gdb) info args
sconn = 0x805d26110
id = {devid = 3010720835, inode = 2403227739, extid = 0}
gen_id = 2448996612

fsp dump shows that oplock as BATCH_OPLOCK. But not sure how fd become -1 and oplock as BATCH.