Created attachment 9374 [details]
core dump file
I added a Samba 4.1 member server to our Samba AD a few days ago. Since then, smbd processes have been crashing every night (Acronis stores images from a number of workstations on a share during the night).
This is the (level 1) log output:
[2013/11/05 04:14:43.532768, 0, pid=91892] ../source3/lib/popt_common.c:67(popt_s3_talloc_log_fn)
talloc: access after free error - first free may be at ../source3/smbd/open.c:1529
[2013/11/05 04:14:43.532916, 0, pid=91892] ../source3/lib/popt_common.c:67(popt_s3_talloc_log_fn)
Bad talloc magic value - access after free
[2013/11/05 04:14:43.532976, 0, pid=91892] ../source3/lib/util.c:785(smb_panic_s3)
PANIC (pid 91892): Bad talloc magic value - access after free
[2013/11/05 04:14:43.533704, 0, pid=91892] ../source3/lib/util.c:896(log_stack_trace)
BACKTRACE: 22 stack frames:
#0 /usr/local/samba/lib/libsmbconf.so.0(log_stack_trace+0x1f) [0x7f2db6366c06]
#1 /usr/local/samba/lib/libsmbconf.so.0(smb_panic_s3+0x6d) [0x7f2db6366a75]
#2 /usr/local/samba/lib/libsamba-util.so.0(smb_panic+0x28) [0x7f2db81d3cfb]
#3 /usr/local/samba/lib/samba/libtalloc.so.2(+0x20a9) [0x7f2db760d0a9]
#4 /usr/local/samba/lib/samba/libtalloc.so.2(+0x2125) [0x7f2db760d125]
#5 /usr/local/samba/lib/samba/libtalloc.so.2(+0x21a3) [0x7f2db760d1a3]
#6 /usr/local/samba/lib/samba/libtalloc.so.2(talloc_get_name+0x18) [0x7f2db760ec83]
#7 /usr/local/samba/lib/samba/libtalloc.so.2(_talloc_get_type_abort+0x4c) [0x7f2db760ee03]
#8 /usr/local/samba/lib/libsmbconf.so.0(+0x31767) [0x7f2db6372767]
#9 /usr/local/samba/lib/samba/libtevent.so.0(tevent_common_loop_immediate+0x1f9) [0x7f2db73feee4]
#10 /usr/local/samba/lib/libsmbconf.so.0(run_events_poll+0x57) [0x7f2db638319b]
#11 /usr/local/samba/lib/libsmbconf.so.0(+0x42848) [0x7f2db6383848]
#12 /usr/local/samba/lib/samba/libtevent.so.0(_tevent_loop_once+0xfc) [0x7f2db73fdfa9]
#13 /usr/local/samba/lib/samba/libsmbd_base.so(smbd_process+0x1321) [0x7f2db7977a1e]
#14 /usr/sbin/smbd(+0x9c38) [0x7f2db883cc38]
#15 /usr/local/samba/lib/libsmbconf.so.0(run_events_poll+0x544) [0x7f2db6383688]
#16 /usr/local/samba/lib/libsmbconf.so.0(+0x4295e) [0x7f2db638395e]
#17 /usr/local/samba/lib/samba/libtevent.so.0(_tevent_loop_once+0xfc) [0x7f2db73fdfa9]
#18 /usr/sbin/smbd(+0xa8d7) [0x7f2db883d8d7]
#19 /usr/sbin/smbd(main+0x15d1) [0x7f2db883eff9]
#20 /lib64/libc.so.6(__libc_start_main+0xfd) [0x7f2db4bfbcdd]
#21 /usr/sbin/smbd(+0x5809) [0x7f2db8838809]
[2013/11/05 04:14:43.534362, 0, pid=91892] ../source3/lib/util.c:797(smb_panic_s3)
smb_panic(): calling panic action [/usr/local/bin/panic-action 91892]
[2013/11/05 04:14:44.128587, 0, pid=91892] ../source3/lib/util.c:805(smb_panic_s3)
smb_panic(): action returned status 0
[2013/11/05 04:14:44.128711, 0, pid=91892] ../source3/lib/dumpcore.c:317(dump_core)
dumping core in /var/log/samba//cores/smbd
- Core dump file
- gdb backtrace
- Level 1 Logfile
Created attachment 9375 [details]
Created attachment 9376 [details]
Level 1 logfile
Created attachment 9377 [details]
Perhaps related to bug #9903?
Similar backtrace, though I haven't seen that crash in a long time (months).
Created attachment 9421 [details]
I don't have a reproducer yet, but I bet this is it.
(In reply to comment #6)
> Created attachment 9421 [details]
> I don't have a reproducer yet, but I bet this is it.
I applied the patch to my 4.1.0 and recompiled. I'll report back over the next few days.
Volker, the patch works great. The segfaults are gone.
Ok, thanks for the feedback. I have no clue how to reproduce this, which worries me a bit. Do you have a client that is hit in particular by this?
(In reply to comment #9)
> Ok, thanks for the feedback. I have no clue how to reproduce this, which
> worries me a bit. Do you have a client that is hit in particular by this?
Without your patch, this error occurred 1-4 times every night while a number of clients running Acronis True Image Advanced Workstation stored their backups on a Samba share. The clients are running Win2k and WinXP. As there is no other share and no user working on this machine, I can't say whether there are other situations in which this will occur.
Currently this is my first Samba 4.x member server in production, so I can't say whether it would happen on others, too.
I don't think I can provide a level 10 debug log, though, because the Acronis backup takes 0.5 to 2.5 hours depending on the machine. I think this would generate an enormous log.
But if you have any idea how I can provide further help, just let me know.
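If a full level 10 log is too large to attach, one option might be to keep only the entries written by the smbd process that panicked. The following is a minimal sketch, not a Samba tool: it assumes the log header format seen in the level 1 excerpt in this report (timestamp, level, `pid=`), and the helper names `find_panics` and `lines_for_pid` are made up for illustration.

```python
import re
from datetime import datetime

# Header format as it appears in the log excerpt in this report, e.g.
# [2013/11/05 04:14:43.532976, 0, pid=91892] ../source3/lib/util.c:785(smb_panic_s3)
HEADER = re.compile(
    r"^\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+),\s*\d+,\s*pid=(?P<pid>\d+)\]"
)


def find_panics(lines):
    """Return (timestamp, pid, message) for every PANIC message line."""
    panics = []
    current = None  # (timestamp, pid) of the most recent header line
    for line in lines:
        m = HEADER.match(line)
        if m:
            ts = datetime.strptime(m["ts"], "%Y/%m/%d %H:%M:%S.%f")
            current = (ts, int(m["pid"]))
        elif current and line.lstrip().startswith("PANIC"):
            panics.append((current[0], current[1], line.strip()))
    return panics


def lines_for_pid(lines, pid):
    """Keep header lines for `pid` together with their continuation lines."""
    kept = []
    keep = False
    for line in lines:
        m = HEADER.match(line)
        if m:
            keep = int(m["pid"]) == pid
        if keep:
            kept.append(line)
    return kept


# Lines copied from the log output in this report.
SAMPLE = """\
[2013/11/05 04:14:43.532768, 0, pid=91892] ../source3/lib/popt_common.c:67(popt_s3_talloc_log_fn)
talloc: access after free error - first free may be at ../source3/smbd/open.c:1529
[2013/11/05 04:14:43.532976, 0, pid=91892] ../source3/lib/util.c:785(smb_panic_s3)
PANIC (pid 91892): Bad talloc magic value - access after free
"""

if __name__ == "__main__":
    log = SAMPLE.splitlines()
    for ts, pid, msg in find_panics(log):
        print(ts, pid, msg)
        print(len(lines_for_pid(log, pid)), "lines belong to that process")
```

On a real log, the list returned by `lines_for_pid` could be written to a separate file and attached in place of the full log.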
Created attachment 9449 [details]
Patch for 4.1
Created attachment 9450 [details]
Patch for 4.0
Comment on attachment 9449 [details]
Patch for 4.1
Comment on attachment 9450 [details]
Patch for 4.0
Re-assigning to Karolin for inclusion in 4.1.next and 4.0.next.
Pushed to autobuild-v4-1-test and autobuild-v4-0-test.
Just to avoid confusion in the future: The patch currently attached to 10284 conflicts with the just-pushed patch from this bug.
(In reply to comment #17)
> Just to avoid confusion in the future: The patch currently attached to 10284
> conflicts with the just-pushed patch from this bug.
I reverted the patch from this bug report and tried the one from Bug #10284. That one also fixes this bug.
Pushed to v4-1-test and v4-0-test.
Closing out bug report.