Bug 11218 - smbd core while closing own connection from MMC
Summary: smbd core while closing own connection from MMC
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.2.0
Hardware: All All
: P5 normal (vote)
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on: 11394
Blocks:
  Show dependency treegraph
 
Reported: 2015-04-14 20:59 UTC by Christof Schmitt
Modified: 2015-11-07 08:16 UTC (History)
2 users (show)

See Also:


Attachments
Patch for v4-2-test (1.25 KB, patch)
2015-06-08 10:53 UTC, Stefan Metzmacher
obnox: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Christof Schmitt 2015-04-14 20:59:11 UTC
Using the MMC to close the connection that the MMC is currently using
results in this smbd core:

(gdb) bt
#0  0x00007f73d77cc5d7 in raise () from /lib64/libc.so.6
#1  0x00007f73d77cdcc8 in abort () from /lib64/libc.so.6
#2  0x00007f73d8d0e68b in dump_core () at ../source3/lib/dumpcore.c:337
#3  0x00007f73d8d01577 in smb_panic_s3 (why=<optimized out>) at ../source3/lib/util.c:811
#4  0x00007f73daf668bf in smb_panic (why=0x7f73da60bc48 "Bad talloc magic value - access after free") at ../lib/util/fault.c:166
#5  0x00007f73da6020af in talloc_abort_access_after_free () at ../lib/talloc/talloc.c:359
#6  talloc_chunk_from_ptr (ptr=<optimized out>) at ../lib/talloc/talloc.c:380
#7  0x00007f73da602b7c in talloc_chunk_from_ptr (ptr=0x7f73dd06cde0) at ../lib/talloc/talloc.c:378
#8  __talloc_get_name (ptr=0x7f73dd06cde0) at ../lib/talloc/talloc.c:1360
#9  talloc_check_name (ptr=0x7f73dd06cde0, name=name@entry=0x7f73d88c45c7 "struct tsocket_address_bsd") at ../lib/talloc/talloc.c:1383
#10 0x00007f73d88bf6f0 in tsocket_address_bsd_string (addr=0x7f73dd088a70, mem_ctx=0x7f73dd0aa600) at ../lib/tsocket/tsocket_bsd.c:593
#11 0x00007f73dab431fb in close_cnum (conn=0x7f73dd086800, vuid=0) at ../source3/smbd/service.c:1130
#12 0x00007f73dab6cb54 in smbXsrv_tcon_disconnect (tcon=tcon@entry=0x7f73dd0c5d70, vuid=vuid@entry=0) at ../source3/smbd/smbXsrv_tcon.c:979
#13 0x00007f73dab6ce30 in smbXsrv_tcon_destructor (tcon=tcon@entry=0x7f73dd0c5d70) at ../source3/smbd/smbXsrv_tcon.c:688
#14 0x00007f73da608283 in _talloc_free_internal (ptr=0x7f73dd0c5d70, location=0x7f73dac92078 "../source3/smbd/server_exit.c:232") at ../lib/talloc/talloc.c:993
#15 0x00007f73da607f0b in _talloc_free_children_internal (tc=<optimized out>, location=0x7f73dac92078 "../source3/smbd/server_exit.c:232", ptr=0x7f73dd055d80) at ../lib/talloc/talloc.c:1466
#16 _talloc_free_internal (ptr=0x7f73dd055d80, location=0x7f73dac92078 "../source3/smbd/server_exit.c:232") at ../lib/talloc/talloc.c:1013
#17 0x00007f73da607f0b in _talloc_free_children_internal (tc=<optimized out>, location=0x7f73dac92078 "../source3/smbd/server_exit.c:232", ptr=0x7f73dd085420) at ../lib/talloc/talloc.c:1466
#18 _talloc_free_internal (ptr=0x7f73dd085420, location=0x7f73dac92078 "../source3/smbd/server_exit.c:232") at ../lib/talloc/talloc.c:1013
#19 0x00007f73da607f0b in _talloc_free_children_internal (tc=<optimized out>, location=0x7f73dac92078 "../source3/smbd/server_exit.c:232", ptr=0x7f73dd0585e0) at ../lib/talloc/talloc.c:1466
#20 _talloc_free_internal (ptr=0x7f73dd0585e0, location=0x7f73dac92078 "../source3/smbd/server_exit.c:232") at ../lib/talloc/talloc.c:1013
#21 0x00007f73da6023f3 in _talloc_free_children_internal (tc=<optimized out>, location=0x7f73dac92078 "../source3/smbd/server_exit.c:232", ptr=0x7f73dd057820) at ../lib/talloc/talloc.c:1466
#22 _talloc_free_internal (location=<optimized out>, ptr=<optimized out>) at ../lib/talloc/talloc.c:1013
#23 _talloc_free (ptr=0x7f73dd057820, location=0x7f73dac92078 "../source3/smbd/server_exit.c:232") at ../lib/talloc/talloc.c:1581
#24 0x00007f73dab6fcbd in exit_server_common (how=how@entry=SERVER_EXIT_NORMAL, reason=0x0) at ../source3/smbd/server_exit.c:232
#25 0x00007f73dab700ce in smbd_exit_server_cleanly (explanation=<optimized out>) at ../source3/smbd/server_exit.c:266
#26 0x00007f73d86afc02 in exit_server_cleanly (reason=reason@entry=0x0) at ../source3/lib/smbd_shim.c:131
#27 0x00007f73db5ca30a in msg_exit_server (msg=<optimized out>, private_data=<optimized out>, msg_type=<optimized out>, server_id=..., data=<optimized out>) at ../source3/smbd/server.c:149
#28 0x00007f73d8d098cd in messaging_defer_callback_trigger (ev=<optimized out>, im=<optimized out>, private_data=<optimized out>) at ../source3/lib/messages.c:865
#29 0x00007f73da1f4824 in tevent_common_loop_immediate (ev=ev@entry=0x7f73dd054ea0) at ../lib/tevent/tevent_immediate.c:135
#30 0x00007f73d8d16b1c in run_events_poll (ev=0x7f73dd054ea0, pollrtn=0, pfds=0x0, num_pfds=0) at ../source3/lib/events.c:192
#31 0x00007f73d8d16e04 in s3_event_loop_once (ev=0x7f73dd054ea0, location=<optimized out>) at ../source3/lib/events.c:303
#32 0x00007f73da1f3ffd in _tevent_loop_once (ev=ev@entry=0x7f73dd054ea0, location=location@entry=0x7f73dac7a2a0 "../source3/smbd/process.c:4106") at ../lib/tevent/tevent.c:533
#33 0x00007f73da1f419b in tevent_common_loop_wait (ev=0x7f73dd054ea0, location=0x7f73dac7a2a0 "../source3/smbd/process.c:4106") at ../lib/tevent/tevent.c:637
#34 0x00007f73dab40641 in smbd_process (ev_ctx=ev_ctx@entry=0x7f73dd054ea0, msg_ctx=msg_ctx@entry=0x7f73dd054f90, sock_fd=sock_fd@entry=43, interactive=interactive@entry=false) at ../source3/smbd/process.c:4106
#35 0x00007f73db5cb27c in smbd_accept_connection (ev=0x7f73dd054ea0, fde=<optimized out>, flags=<optimized out>, private_data=<optimized out>) at ../source3/smbd/server.c:750
#36 0x00007f73d8d16c4c in run_events_poll (ev=0x7f73dd054ea0, pollrtn=<optimized out>, pfds=0x7f73dd063d80, num_pfds=8) at ../source3/lib/events.c:257
#37 0x00007f73d8d16ea0 in s3_event_loop_once (ev=0x7f73dd054ea0, location=<optimized out>) at ../source3/lib/events.c:326
#38 0x00007f73da1f3ffd in _tevent_loop_once (ev=ev@entry=0x7f73dd054ea0, location=location@entry=0x7f73db5cdea4 "../source3/smbd/server.c:1115") at ../lib/tevent/tevent.c:533
#39 0x00007f73da1f419b in tevent_common_loop_wait (ev=0x7f73dd054ea0, location=0x7f73db5cdea4 "../source3/smbd/server.c:1115") at ../lib/tevent/tevent.c:637
#40 0x00007f73db5c76b0 in smbd_parent_loop (parent=<optimized out>, ev_ctx=0x7f73dd054ea0) at ../source3/smbd/server.c:1115
#41 main (argc=<optimized out>, argv=<optimized out>) at ../source3/smbd/server.c:1757
(gdb) list source3/smbd/service.c:1130
1125                    dptr_closecnum(conn);
1126            }
1127
1128            change_to_root_user();
1129
1130            DEBUG(IS_IPC(conn)?3:2, ("%s (%s) closed connection to service %s\n",
1131                                     get_remote_machine_name(),
1132                                     tsocket_address_string(conn->sconn->remote_address,
1133                                                            talloc_tos()),
1134                                     lp_servicename(talloc_tos(), SNUM(conn))));
Comment 1 Jeremy Allison 2015-04-14 21:14:09 UTC
Can you reproduce this under valgrind and find out where we're first freeing this memory ?
Comment 2 Christof Schmitt 2015-04-14 22:12:01 UTC
What do we usually need? Something like this?
valgrind --log-file=/tmp/valgrind.log --trace-children=yes /usr/local/sbin/smbd
Comment 3 Christof Schmitt 2015-04-14 22:13:01 UTC
One approach could be not printing the IP address in the debug message:
--- a/source3/smbd/service.c
+++ b/source3/smbd/service.c
@@ -1127,10 +1127,8 @@ void close_cnum(connection_struct *conn, uint64_t vuid)
 
 	change_to_root_user();
 
-	DEBUG(IS_IPC(conn)?3:2, ("%s (%s) closed connection to service %s\n",
+	DEBUG(IS_IPC(conn)?3:2, ("%s closed connection to service %s\n",
 				 get_remote_machine_name(),
-				 tsocket_address_string(conn->sconn->remote_address,
-							talloc_tos()),
 				 lp_servicename(talloc_tos(), SNUM(conn))));
 
 	/* make sure we leave the directory available for unmount */

But we can try finding out first where the memory is freed.
Comment 4 Christof Schmitt 2015-04-14 23:13:08 UTC
==15505== 1 errors in context 1 of 1:
==15505== Invalid read of size 8
==15505==    at 0x53DEBE0: exit_server_common (server_exit.c:222)
==15505==    by 0x53DF0CD: smbd_exit_server_cleanly (server_exit.c:266)
==15505==    by 0x791FC01: exit_server_cleanly (smbd_shim.c:131)
==15505==    by 0x112309: msg_exit_server (server.c:149)
==15505==    by 0x72A98CC: messaging_defer_callback_trigger (messages.c:865)
==15505==    by 0x5DD7823: tevent_common_loop_immediate (tevent_immediate.c:135)
==15505==    by 0x72B6B1B: run_events_poll (events.c:192)
==15505==    by 0x72B6E03: s3_event_loop_once (events.c:303)
==15505==    by 0x5DD6FFC: _tevent_loop_once (tevent.c:533)
==15505==    by 0x5DD719A: tevent_common_loop_wait (tevent.c:637)
==15505==    by 0x53AF640: smbd_process (process.c:4106)
==15505==    by 0x11327B: smbd_accept_connection (server.c:750)
==15505==  Address 0x1484ac18 is 104 bytes inside a block of size 504 free'd
==15505==    at 0x4C2AD17: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==15505==    by 0x59C44B2: _talloc_free_internal (talloc.c:1057)
==15505==    by 0x59C44B2: _talloc_free (talloc.c:1581)
==15505==    by 0x53DEC4D: exit_server_common (server_exit.c:224)
==15505==    by 0x53DF0CD: smbd_exit_server_cleanly (server_exit.c:266)
==15505==    by 0x791FC01: exit_server_cleanly (smbd_shim.c:131)
==15505==    by 0x112309: msg_exit_server (server.c:149)
==15505==    by 0x72A98CC: messaging_defer_callback_trigger (messages.c:865)
==15505==    by 0x5DD7823: tevent_common_loop_immediate (tevent_immediate.c:135)
==15505==    by 0x72B6B1B: run_events_poll (events.c:192)
==15505==    by 0x72B6E03: s3_event_loop_once (events.c:303)
==15505==    by 0x5DD6FFC: _tevent_loop_once (tevent.c:533)
==15505==    by 0x5DD719A: tevent_common_loop_wait (tevent.c:637)
==15505== 
--15505--
Comment 5 Christof Schmitt 2015-04-14 23:17:34 UTC
 215
216         /*
217          * we need to force the order of freeing the following,
218          * because smbd_msg_ctx is not a talloc child of smbd_server_conn.
219          */
220         if (client != NULL) {
221                 for (; xconn != NULL; xconn = xconn->next) {
222                         DLIST_REMOVE(client->connections, xconn);
223                         talloc_free(xconn);
224                 }
225                 TALLOC_FREE(client->sconn);
226         }
227         sconn = NULL;
228         xconn = NULL;
229         client = NULL;
230         TALLOC_FREE(global_smbXsrv_client);
231         server_messaging_context_free();
232         server_event_context_free();
233         TALLOC_FREE(smbd_memcache_ctx);
Comment 6 Stefan Metzmacher 2015-06-08 10:53:48 UTC
Created attachment 11128 [details]
Patch for v4-2-test
Comment 7 Stefan Metzmacher 2015-06-08 10:59:09 UTC
(In reply to Christof Schmitt from comment #4)

A patch for the problem reported by valgrind is attached.

But the original problem may still exit.

Can you retest with the patch applied?
Comment 8 Karolin Seeger 2015-06-09 19:35:49 UTC
Patch does not apply on current v4-2-test:

user@host:/data/git/samba/v4-2-test$ git am tmp42.diff
Wende an: Update the tevent_data.dox tutrial stuff to fix some errors, including white space problems.
error: Anwendung des Patches fehlgeschlagen: lib/tevent/doc/tevent_data.dox:46
error: lib/tevent/doc/tevent_data.dox: Patch konnte nicht angewendet werden
Anwendung des Patches fehlgeschlagen bei 0001 Update the tevent_data.dox tutrial stuff to fix some errors, including white space problems.
Comment 9 Stefan Metzmacher 2015-06-09 23:51:34 UTC
(In reply to Karolin Seeger from comment #8)

This error message doesn't belong to the attachment.
Could it be a tmp42.diff vs. tmp42.diff.txt problem?

The following file would be correct:

md5sum tmp42.diff.txt 
fe44c62855c9dbc3d1ca9b1989541017  tmp42.diff.txt
Comment 10 Karolin Seeger 2015-06-15 19:53:19 UTC
(In reply to Stefan (metze) Metzmacher from comment #9)
However, it applies now, sorry!
Pushed to autobuild-v4-2-test.
Comment 11 Karolin Seeger 2015-06-17 18:02:54 UTC
(In reply to Karolin Seeger from comment #10)
Pushed to v4-2-test.
Closing out bug report.

Thanks!
Comment 12 Stefan Metzmacher 2015-11-04 13:54:25 UTC
(In reply to Christof Schmitt from comment #0)

Christof: Are you able to reproduce this problem reliable?
I have a report that a backtrace like yours also happens
with 4.2.4 (which includes my fix)?

Any hint would be highly welcome!

Thanks!
Comment 13 Christof Schmitt 2015-11-04 23:12:18 UTC
(In reply to Stefan Metzmacher from comment #12)
As far as i remember, the recreate was to first switch to at least log level 3 for smbd, then access the MMC from a Windows client and "close" the session that the MMC is using. I don't have a test system where this recreates; i can no longer recreate it with Samba 4.3 (which has the fix posted earlier).
Comment 14 Stefan Metzmacher 2015-11-07 08:16:39 UTC
(In reply to Christof Schmitt from comment #13)

I tried that several times, but could only reproduce the valgrind backtrace
but not the original one.