Bug 12852 - SIGSEGV in cm_connect_lsa_tcp dereferencing conn->lsa_tcp_pipe->transport after error
SIGSEGV in cm_connect_lsa_tcp dereferencing conn->lsa_tcp_pipe->transport aft...
Status: RESOLVED FIXED
Product: Samba 4.1 and newer
Classification: Unclassified
Component: DCE-RPCs and pipes
4.5.9
All All
: P5 normal
: ---
Assigned To: Karolin Seeger
Samba QA Contact
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2017-06-20 12:37 UTC by Richard Sharpe
Modified: 2017-06-27 10:49 UTC (History)
2 users (show)

See Also:


Attachments
Patch for master (1.19 KB, patch)
2017-06-20 19:48 UTC, Richard Sharpe
no flags Details
git-am fix for 4.6.next, 4.5.next. (1.48 KB, patch)
2017-06-22 21:34 UTC, Jeremy Allison
rsharpe: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Richard Sharpe 2017-06-20 12:37:41 UTC
We have gotten several crashes in winbind like the following:

#0  0x00007fee31f411d7 in raise () from /lib64/libc.so.6
#1  0x00007fee31f428c8 in abort () from /lib64/libc.so.6
#2  0x00007fee3518463b in dump_core () at ../source3/lib/dumpcore.c:322
#3  0x00007fee35175cf7 in smb_panic_s3 (why=<optimized out>)
    at ../source3/lib/util.c:814
#4  0x00007fee388a512f in smb_panic (why=why@entry=0x7fee388f26ab "internal error")
    at ../lib/util/fault.c:166
#5  0x00007fee388a5346 in fault_report (sig=<optimized out>) at ../lib/util/fault.c:83
#6  sig_fault (sig=<optimized out>) at ../lib/util/fault.c:94
#7  <signal handler called>
#8  cm_connect_lsa_tcp (domain=domain@entry=0x7fee3adffe60,
    cli=cli@entry=0x7ffde8610e88, mem_ctx=0x7fee3ae11090)
    at ../source3/winbindd/winbindd_cm.c:2982
#9  0x00007fee3a783724 in cm_connect_lsat (domain=domain@entry=0x7fee3adffe60,
    mem_ctx=mem_ctx@entry=0x7fee3ae11090, cli=cli@entry=0x7ffde8610e88,
    lsa_policy=lsa_policy@entry=0x7ffde8610e90)
    at ../source3/winbindd/winbindd_cm.c:3232
#10 0x00007fee3a7862b7 in winbindd_lookup_sids (mem_ctx=mem_ctx@entry=0x7fee3ae11090,
    domain=domain@entry=0x7fee3adffe60, num_sids=num_sids@entry=1,
    sids=0x7fee3ae1b340, domains=domains@entry=0x7ffde8610f10,
    names=names@entry=0x7ffde8610f18, types=types@entry=0x7ffde8610f20)
    at ../source3/winbindd/winbindd_msrpc.c:1103
#11 0x00007fee3a78665c in msrpc_sid_to_name (domain=0x7fee3adffe60,
    mem_ctx=0x7fee3ae11090, sid=<optimized out>, domain_name=0x7ffde8611040,
    name=0x7ffde8611048, type=0x7ffde861103c)
    at ../source3/winbindd/winbindd_msrpc.c:305
#12 0x00007fee3a78949e in sid_to_name (domain=<optimized out>,
    mem_ctx=<optimized out>, sid=<optimized out>, domain_name=<optimized out>,
    name=<optimized out>, type=<optimized out>)
    at ../source3/winbindd/winbindd_ads.c:586
#13 0x00007fee3a78925f in sid_to_name (domain=0x7fee3adffe60, mem_ctx=0x7fee3ae11090,
    sid=0x7fee3ae1b340, domain_name=0x7ffde8611040, name=0x7ffde8611048,
    type=0x7ffde861103c) at ../source3/winbindd/winbindd_reconnect_ads.c:126
#14 0x00007fee3a7721de in sid_to_name (domain=0x7fee3adffe60, mem_ctx=0x7fee3ae11090,
    sid=0x7fee3ae1b340, domain_name=0x7ffde8611040, name=0x7ffde8611048,
    type=0x7ffde861103c) at ../source3/winbindd/winbindd_cache.c:1989
#15 0x00007fee3a793507 in _wbint_LookupSid (p=p@entry=0x7ffde86110b0,
    r=r@entry=0x7fee3ae13050) at ../source3/winbindd/winbindd_dual_srv.c:65
#16 0x00007fee3a7dde13 in api_wbint_LookupSid (p=0x7ffde86110b0)
    at default/librpc/gen_ndr/srv_winbind.c:144
#17 0x00007fee3a793394 in winbindd_dual_ndrcmd (domain=<optimized out>,
    state=0x7ffde8611348) at ../source3/winbindd/winbindd_dual_ndr.c:322
#18 0x00007fee3a79012c in child_process_request (child=<optimized out>,
    child=<optimized out>, state=0x7ffde8611348)
    at ../source3/winbindd/winbindd_dual.c:513
#19 child_handler (ev=<optimized out>, fde=<optimized out>, flags=<optimized out>,
    private_data=0x7ffde8611340) at ../source3/winbindd/winbindd_dual.c:1393
#20 0x00007fee37e0fc6b in epoll_event_loop (tvalp=0x7ffde8611260,
    epoll_ev=0x7fee3adffa50) at ../lib/tevent/tevent_epoll.c:728
#21 epoll_event_loop_once (ev=<optimized out>, location=<optimized out>)
    at ../lib/tevent/tevent_epoll.c:926
#22 0x00007fee37e0e137 in std_event_loop_once (ev=0x7fee3adf47f0,
    location=0x7fee3a7fec70 "../source3/winbindd/winbindd_dual.c:1592")
    at ../lib/tevent/tevent_standard.c:114
#23 0x00007fee37e0a38d in _tevent_loop_once (ev=0x7fee3adf47f0,
    location=location@entry=0x7fee3a7fec70 "../source3/winbindd/winbindd_dual.c:1592")
    at ../lib/tevent/tevent.c:533
#24 0x00007fee3a7924f8 in fork_domain_child (child=0x7fee3ae000b0)
    at ../source3/winbindd/winbindd_dual.c:1592
#25 0x00007fee3a792bc5 in wb_child_request_trigger (req=0x7fee3ae08650,
    private_data=<optimized out>) at ../source3/winbindd/winbindd_dual.c:173
#26 0x00007fee37e0abb4 in tevent_common_loop_immediate (ev=ev@entry=0x7fee3adf47f0)
    at ../lib/tevent/tevent_immediate.c:135
#27 0x00007fee37e0fa2e in epoll_event_loop_once (ev=0x7fee3adf47f0,
    location=<optimized out>) at ../lib/tevent/tevent_epoll.c:907
#28 0x00007fee37e0e137 in std_event_loop_once (ev=0x7fee3adf47f0,
    location=0x7fee3a7e8c48 "../source3/winbindd/winbindd.c:1809")
    at ../lib/tevent/tevent_standard.c:114
#29 0x00007fee37e0a38d in _tevent_loop_once (ev=0x7fee3adf47f0,
    location=location@entry=0x7fee3a7e8c48 "../source3/winbindd/winbindd.c:1809")
    at ../lib/tevent/tevent.c:533
#30 0x00007fee3a758cfc in main (argc=<optimized out>, argv=<optimized out>)
    at ../source3/winbindd/winbindd.c:1809
Comment 1 Richard Sharpe 2017-06-20 12:40:02 UTC
Here is the code, it is the same in mainline:

      if (conn->lsa_pipe_tcp &&
          conn->lsa_pipe_tcp->transport->transport == NCACN_IP_TCP &&
          conn->lsa_pipe_tcp->auth->auth_level >= DCERPC_AUTH_LEVEL_INTEGRITY &&
          rpccli_is_connected(conn->lsa_pipe_tcp)) {
            goto done;
      }

and here is what conn->lsa_pipe_tcp looks like:

(gdb) p *conn->lsa_pipe_tcp
$4 = {prev = 0x0, next = 0x0, transport = 0x0, binding_handle = 0x7f15e8bf2220,
  abstract_syntax = {uuid = {time_low = 305420152, time_mid = 4660,
      time_hi_and_version = 43981, clock_seq = <incomplete sequence \357>,
      node = "\001#Eg\211\253"}, if_version = 0}, transfer_syntax = {uuid = {
      time_low = 2324192516, time_mid = 7403, time_hi_and_version = 4553,
      clock_seq = "\237", <incomplete sequence \350>, node = "\b\000+\020H`"},
Comment 2 Richard Sharpe 2017-06-20 12:53:41 UTC
In debugging this with Jeremy, I pointed out that we also see this in the log before the crash but I did not record how long before.

[2017/06/13 17:48:16.110444,  0] ../source3/rpc_client/cli_pipe.c:3326(cli_rpc_pipe_open_schannel_with_creds)
  netlogon_creds_cli_check failed with NT_STATUS_RPC_SEC_PKG_ERROR

We (Jeremy and I) noticed that cli_pipe.c:rpc_api_pipe_got_pdu has code like:

        } else if (NT_STATUS_EQUAL(status, NT_STATUS_RPC_SEC_PKG_ERROR)) {
                /*
                 * TODO: do a real async disconnect ...
                 *
                 * For now do it sync...
                 */
                TALLOC_FREE(state->cli->transport);
        }

Thus there seems like a code path into cm_connect_lsa_tcp where conn->lsa_pipe_tcp->transport is NULL and the fix is to add another term to the check.

I have rebuilt our version of Samba and we are running a test now to see if we can still hit the issue.
Comment 3 Stefan Metzmacher 2017-06-20 14:04:14 UTC
(In reply to Richard Sharpe from comment #2)

I guess we need to change the order and
call rpccli_is_connected() first.


      if (rpccli_is_connected(conn->lsa_pipe_tcp) &&
          conn->lsa_pipe_tcp->transport->transport == NCACN_IP_TCP &&
          conn->lsa_pipe_tcp->auth->auth_level >= DCERPC_AUTH_LEVEL_INTEGRITY)
      {
            goto done;
      }
Comment 4 Richard Sharpe 2017-06-20 16:21:47 UTC
Good point. I will try that.
Comment 5 Richard Sharpe 2017-06-20 19:48:06 UTC
Created attachment 13297 [details]
Patch for master

This is a patch based on Metze's suggestion.
Comment 6 Jeremy Allison 2017-06-22 21:34:42 UTC
Created attachment 13304 [details]
git-am fix for 4.6.next, 4.5.next.

Cherry-picked from master.
Comment 7 Jeremy Allison 2017-06-22 21:45:40 UTC
Re-assigning to Karolin for inclusion in 4.6.next, 4.5.next.
Comment 8 Karolin Seeger 2017-06-23 11:28:57 UTC
(In reply to Jeremy Allison from comment #7)
Pushed to autobuild-v4-{6,5}-test.
Comment 9 Karolin Seeger 2017-06-27 10:49:17 UTC
(In reply to Karolin Seeger from comment #8)
Pushed to both branches.
Closing out bug report.

Thanks!