Bug 13396 - notifyd cored during unregistering notify
Summary: notifyd cored during unregistering notify
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
Version: 4.5.1
Hardware: x64 Linux
Importance: P5 major
Target Milestone: ---
Assignee: Volker Lendecke
QA Contact: Samba QA Contact
Depends on:
Reported: 2018-04-20 15:16 UTC by li-yangzhao
Modified: 2018-05-08 02:10 UTC
CC List: 1 user

See Also:

callback full (11.58 KB, text/plain)
2018-04-20 15:16 UTC, li-yangzhao

Description li-yangzhao 2018-04-20 15:16:09 UTC
Created attachment 14161 [details]
callback full

Dear Samba Team,

My Samba server, which has two nodes, runs in cluster mode. During a long-running file copy, I found that one of the nodes had an smbd core dump. I dug into the backtrace and found that while notifyd (one of the smbd subprocesses) was applying an unregister message from its peer, an invalid watch context stored in the database caused a segmentation fault. I'm not sure whether the database was corrupted beforehand or only that one watch context was invalid. Any help?

The backtrace is listed below; please find the full backtrace in the attachment.

(gdb) bt
#0  0x00007f6c950595f7 in raise () from /lib64/libc.so.6
#1  0x00007f6c9505ace8 in abort () from /lib64/libc.so.6
#2  0x00007f6c96a276db in dump_core () at ../source3/lib/dumpcore.c:322
#3  0x00007f6c96a18f27 in smb_panic_s3 (why=<optimized out>) at ../source3/lib/util.c:814
#4  0x00007f6c99768cdf in smb_panic (why=why@entry=0x7f6c997b608a "internal error") at ../lib/util/fault.c:166
#5  0x00007f6c99768ef6 in fault_report (sig=<optimized out>) at ../lib/util/fault.c:83
#6  sig_fault (sig=<optimized out>) at ../lib/util/fault.c:94
#7  <signal handler called>
#8  talloc_chunk_from_ptr (ptr=<optimized out>) at ../lib/talloc/talloc.c:429
#9  _talloc_free (ptr=0x2b3392fda0, location=0x7f6c9902f238 "../source3/smbd/notifyd/notifyd.c:479") at ../lib/talloc/talloc.c:1671
#10 0x00007f6c98e7b205 in notifyd_apply_rec_change (client=client@entry=0x5f222c65c0, path=0x5f22306a80 "/fs/cifs_system1_1/cifs_11", 
    pathlen=<optimized out>, chg=0x5f22306a60, entries=0x5f222cc220, sys_notify_watch=0x7f6c98f5e610 <inotify_watch>, sys_notify_ctx=0x5f2232af40, 
    msg_ctx=0x5f222c61e0) at ../source3/smbd/notifyd/notifyd.c:479
#11 0x00007f6c98e7b9f8 in notifyd_apply_reclog (msglen=<optimized out>, msg=<optimized out>, peer=0x5f22306920) at ../source3/smbd/notifyd/notifyd.c:1349
#12 notifyd_snoop_broadcast (src_vnn=<optimized out>, dst_vnn=<optimized out>, dst_srvid=<optimized out>, msg=<optimized out>, msglen=<optimized out>, 
    private_data=<optimized out>) at ../source3/smbd/notifyd/notifyd.c:1431
#13 0x00007f6c967efe6a in ctdbd_msg_call_back (msg=msg@entry=0x5f222ee060, conn=<optimized out>) at ../source3/lib/ctdbd_conn.c:168
#14 0x00007f6c967f081e in ctdb_handle_message (hdr=0x5f222ee060, conn=<optimized out>) at ../source3/lib/ctdbd_conn.c:529
#15 ctdbd_socket_readable (conn=<optimized out>) at ../source3/lib/ctdbd_conn.c:545
#16 0x00007f6c9847cc6b in epoll_event_loop (tvalp=0x7ffdec98aae0, epoll_ev=0x5f222c2180) at ../lib/tevent/tevent_epoll.c:728
#17 epoll_event_loop_once (ev=<optimized out>, location=<optimized out>) at ../lib/tevent/tevent_epoll.c:926
#18 0x00007f6c9847b137 in std_event_loop_once (ev=0x5f222de140, location=0x7f6c9847d3c8 "../lib/tevent/tevent_req.c:264")
    at ../lib/tevent/tevent_standard.c:114
#19 0x00007f6c9847738d in _tevent_loop_once (ev=ev@entry=0x5f222de140, location=location@entry=0x7f6c9847d3c8 "../lib/tevent/tevent_req.c:264")
    at ../lib/tevent/tevent.c:533
#20 0x00007f6c9847855f in tevent_req_poll (req=0x5f22350080, ev=0x5f222de140) at ../lib/tevent/tevent_req.c:264
#21 0x0000005f20ff6d5a in smbd_notifyd_init (msg=msg@entry=0x5f222c61e0, interactive=interactive@entry=false, ppid=ppid@entry=0x5f22302248)
    at ../source3/smbd/server.c:431
#22 0x0000005f20fec4a5 in main (argc=<optimized out>, argv=<optimized out>) at ../source3/smbd/server.c:1861
Comment 1 Volker Lendecke 2018-05-04 05:21:37 UTC
Assuming you are running 4.5.1 as indicated, I claim this is fixed by commit 4e9a55536f95, which went into 4.6.0. If you can't upgrade to a supported version, please apply that patch. If you can still reproduce the crash with the patch applied, please reopen this bug.
Comment 2 li-yangzhao 2018-05-08 02:10:56 UTC
(In reply to Volker Lendecke from comment #1)
Thank you, I will try the patch.