Bug 10250 - Segfaults: PANIC: Bad talloc magic value - access after free
Summary: Segfaults: PANIC: Bad talloc magic value - access after free
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
Version: 4.1.0
Hardware: x64 Linux
Importance: P5 major
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-11-05 06:17 UTC by Marc Muehlfeld
Modified: 2013-11-26 19:26 UTC


Attachments
core dump file (758.65 KB, application/x-gzip)
2013-11-05 06:17 UTC, Marc Muehlfeld
gdb backtrace (8.88 KB, text/plain)
2013-11-05 06:17 UTC, Marc Muehlfeld
Level 1 logfile (6.28 KB, text/plain)
2013-11-05 06:17 UTC, Marc Muehlfeld
smb.conf (2.50 KB, application/octet-stream)
2013-11-05 06:18 UTC, Marc Muehlfeld
Patch (1.07 KB, patch)
2013-11-14 20:37 UTC, Volker Lendecke
Patch for 4.1 (1.30 KB, patch)
2013-11-20 08:18 UTC, Volker Lendecke
jra: review+
Patch for 4.0 (1.30 KB, patch)
2013-11-20 08:19 UTC, Volker Lendecke
jra: review+

Description Marc Muehlfeld 2013-11-05 06:17:07 UTC
Created attachment 9374 [details]
core dump file

I added a Samba 4.1 member server to our Samba AD a few days ago. Since then, smbd processes have been crashing every night (Acronis stores images from a number of workstations on a share during the night).


This is the (level 1) log output:

[2013/11/05 04:14:43.532768,  0, pid=91892] ../source3/lib/popt_common.c:67(popt_s3_talloc_log_fn)
  talloc: access after free error - first free may be at ../source3/smbd/open.c:1529
[2013/11/05 04:14:43.532916,  0, pid=91892] ../source3/lib/popt_common.c:67(popt_s3_talloc_log_fn)
  Bad talloc magic value - access after free
[2013/11/05 04:14:43.532976,  0, pid=91892] ../source3/lib/util.c:785(smb_panic_s3)
  PANIC (pid 91892): Bad talloc magic value - access after free
[2013/11/05 04:14:43.533704,  0, pid=91892] ../source3/lib/util.c:896(log_stack_trace)
  BACKTRACE: 22 stack frames:
   #0 /usr/local/samba/lib/libsmbconf.so.0(log_stack_trace+0x1f) [0x7f2db6366c06]
   #1 /usr/local/samba/lib/libsmbconf.so.0(smb_panic_s3+0x6d) [0x7f2db6366a75]
   #2 /usr/local/samba/lib/libsamba-util.so.0(smb_panic+0x28) [0x7f2db81d3cfb]
   #3 /usr/local/samba/lib/samba/libtalloc.so.2(+0x20a9) [0x7f2db760d0a9]
   #4 /usr/local/samba/lib/samba/libtalloc.so.2(+0x2125) [0x7f2db760d125]
   #5 /usr/local/samba/lib/samba/libtalloc.so.2(+0x21a3) [0x7f2db760d1a3]
   #6 /usr/local/samba/lib/samba/libtalloc.so.2(talloc_get_name+0x18) [0x7f2db760ec83]
   #7 /usr/local/samba/lib/samba/libtalloc.so.2(_talloc_get_type_abort+0x4c) [0x7f2db760ee03]
   #8 /usr/local/samba/lib/libsmbconf.so.0(+0x31767) [0x7f2db6372767]
   #9 /usr/local/samba/lib/samba/libtevent.so.0(tevent_common_loop_immediate+0x1f9) [0x7f2db73feee4]
   #10 /usr/local/samba/lib/libsmbconf.so.0(run_events_poll+0x57) [0x7f2db638319b]
   #11 /usr/local/samba/lib/libsmbconf.so.0(+0x42848) [0x7f2db6383848]
   #12 /usr/local/samba/lib/samba/libtevent.so.0(_tevent_loop_once+0xfc) [0x7f2db73fdfa9]
   #13 /usr/local/samba/lib/samba/libsmbd_base.so(smbd_process+0x1321) [0x7f2db7977a1e]
   #14 /usr/sbin/smbd(+0x9c38) [0x7f2db883cc38]
   #15 /usr/local/samba/lib/libsmbconf.so.0(run_events_poll+0x544) [0x7f2db6383688]
   #16 /usr/local/samba/lib/libsmbconf.so.0(+0x4295e) [0x7f2db638395e]
   #17 /usr/local/samba/lib/samba/libtevent.so.0(_tevent_loop_once+0xfc) [0x7f2db73fdfa9]
   #18 /usr/sbin/smbd(+0xa8d7) [0x7f2db883d8d7]
   #19 /usr/sbin/smbd(main+0x15d1) [0x7f2db883eff9]
   #20 /lib64/libc.so.6(__libc_start_main+0xfd) [0x7f2db4bfbcdd]
   #21 /usr/sbin/smbd(+0x5809) [0x7f2db8838809]
[2013/11/05 04:14:43.534362,  0, pid=91892] ../source3/lib/util.c:797(smb_panic_s3)
  smb_panic(): calling panic action [/usr/local/bin/panic-action 91892]
[2013/11/05 04:14:44.128587,  0, pid=91892] ../source3/lib/util.c:805(smb_panic_s3)
  smb_panic(): action returned status 0
[2013/11/05 04:14:44.128711,  0, pid=91892] ../source3/lib/dumpcore.c:317(dump_core)
  dumping core in /var/log/samba//cores/smbd
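
As a rough illustration of what the "Bad talloc magic value - access after free" message in the log above means: the backtrace shows the panic being raised from a type check inside a callback run by tevent_common_loop_immediate, which suggests a deferred callback was handed a pointer to an object that had already been freed. The sketch below is not talloc or tevent code. Every name in it (struct chunk, chunk_check, run_demo and so on) is hypothetical, and it only models the general idea of stamping a magic value on allocation, overwriting it on free, and checking it before use.

```c
#include <stdint.h>
#include <stdio.h>

/* All names below are hypothetical; this is not talloc's or tevent's code. */
#define CHUNK_MAGIC_LIVE 0xC0FFEE01u
#define CHUNK_MAGIC_FREE 0xDEADBEEFu

struct chunk {
    uint32_t magic;   /* stamped on allocation, overwritten on free */
    int payload;
};

/* A static pool keeps "freed" chunks readable, so the check itself
 * never touches memory that was really returned to the system. */
static struct chunk pool[4];
static size_t pool_used;

static struct chunk *chunk_alloc(int payload)
{
    if (pool_used >= sizeof(pool) / sizeof(pool[0])) {
        return NULL;
    }
    struct chunk *c = &pool[pool_used++];
    c->magic = CHUNK_MAGIC_LIVE;
    c->payload = payload;
    return c;
}

static void chunk_free(struct chunk *c)
{
    c->magic = CHUNK_MAGIC_FREE;   /* mark the chunk as freed */
}

/* Returns 0 for a live chunk, -1 for a freed (or corrupted) one. */
static int chunk_check(const struct chunk *c)
{
    if (c->magic == CHUNK_MAGIC_LIVE) {
        return 0;
    }
    fprintf(stderr, "bad magic value - access after free\n");
    return -1;
}

/* A deferred callback with private data, loosely modelled on an
 * immediate event: it runs later than the code that scheduled it. */
struct deferred {
    int (*fn)(void *private_data);
    void *private_data;
};

static int on_deferred(void *private_data)
{
    struct chunk *c = private_data;
    if (chunk_check(c) != 0) {
        return -1;   /* a real server would panic here */
    }
    return c->payload;
}

/* Schedules a callback, optionally frees its data before it runs,
 * then runs it. Returns the payload, -1 if the stale pointer was
 * detected, or -2 if the pool is exhausted. */
int run_demo(int free_before_callback)
{
    struct chunk *c = chunk_alloc(42);
    if (c == NULL) {
        return -2;
    }
    struct deferred ev = { on_deferred, c };
    if (free_before_callback) {
        chunk_free(c);
    }
    return ev.fn(ev.private_data);
}
```

Calling run_demo(0) from a main function returns the payload, while run_demo(1) takes the error path because the callback's private data was freed while the callback was still pending. The check only works here because the pool memory stays valid. With a real allocator the freed memory may be reused, which is why such checks are best-effort.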




Find attached:
- Core dump file
- gdb backtrace
- Level 1 Logfile
- smb.conf
Comment 1 Marc Muehlfeld 2013-11-05 06:17:28 UTC
Created attachment 9375 [details]
gdb backtrace
Comment 2 Marc Muehlfeld 2013-11-05 06:17:56 UTC
Created attachment 9376 [details]
Level 1 logfile
Comment 3 Marc Muehlfeld 2013-11-05 06:18:11 UTC
Created attachment 9377 [details]
smb.conf
Comment 4 Nick Semenkovich 2013-11-14 19:07:55 UTC
Perhaps related to bug #9903?

Similar backtrace, though I haven't seen that crash in a long time (months).
Comment 5 Volker Lendecke 2013-11-14 19:14:13 UTC
Looking
Comment 6 Volker Lendecke 2013-11-14 20:37:46 UTC
Created attachment 9421 [details]
Patch

I don't have a reproducer yet, but I bet this is it.
Comment 7 Marc Muehlfeld 2013-11-14 21:11:10 UTC
(In reply to comment #6)
> Created attachment 9421 [details]
> Patch
> 
> I don't have a reproducer yet, but I bet this is it.

Thanks, Volker.
I applied the patch to my 4.1.0 installation and recompiled. I'll report back over the next few days.
Comment 8 Marc Muehlfeld 2013-11-18 19:05:09 UTC
Volker, the patch works great. The segfaults are gone.
Thank you.
Comment 9 Volker Lendecke 2013-11-19 12:16:47 UTC
Ok, thanks for the feedback. I have no clue how to reproduce this, which worries me a bit. Do you have a client that is hit in particular by this?
Comment 10 Marc Muehlfeld 2013-11-19 13:16:43 UTC
(In reply to comment #9)
> Ok, thanks for the feedback. I have no clue how to reproduce this, which
> worries me a bit. Do you have a client that is hit in particular by this?

Without your patch, this error occurred 1-4 times every night, while a number of clients running Acronis True Image Advanced Workstation were storing their backups on a Samba share. The clients are running Win2k and WinXP. As there is no other share and no user working on this machine, I can't say whether there are other situations in which this occurs.

Currently this is my first Samba 4.x member server in production, so I can't say whether it would happen on others, too.

I don't think I can provide a level 10 debug log, because the Acronis backup takes 0.5 to 2.5 hours, depending on the machine. I think this would generate an enormous log.

But if you have any idea how I can provide further help, just let me know.
Comment 11 Volker Lendecke 2013-11-20 08:18:43 UTC
Created attachment 9449 [details]
Patch for 4.1
Comment 12 Volker Lendecke 2013-11-20 08:19:39 UTC
Created attachment 9450 [details]
Patch for 4.0
Comment 13 Jeremy Allison 2013-11-20 23:32:08 UTC
Comment on attachment 9449 [details]
Patch for 4.1

LGTM.
Comment 14 Jeremy Allison 2013-11-20 23:32:57 UTC
Comment on attachment 9450 [details]
Patch for 4.0

LGTM.
Comment 15 Jeremy Allison 2013-11-20 23:33:31 UTC
Re-assigning to Karolin for inclusion in 4.1.next and 4.0.next.

Jeremy.
Comment 16 Karolin Seeger 2013-11-22 10:28:23 UTC
Pushed to autobuild-v4-1-test and autobuild-v4-0-test.
Comment 17 Volker Lendecke 2013-11-22 11:12:09 UTC
Just to avoid confusion in the future: The patch currently attached to 10284 conflicts with the just-pushed patch from this bug.
Comment 18 Marc Muehlfeld 2013-11-26 07:35:26 UTC
(In reply to comment #17)
> Just to avoid confusion in the future: The patch currently attached to 10284
> conflicts with the just-pushed patch from this bug.

I reverted the patch from this bug report and tried the one from Bug #10284. That one also fixes this bug.
Comment 19 Karolin Seeger 2013-11-26 19:26:53 UTC
Pushed to v4-1-test and v4-0-test.
Closing out bug report.

Thanks!