Bug 11720 - NT_STATUS_INTERNAL_ERROR and coredump at server exit when path on NFS
NT_STATUS_INTERNAL_ERROR and coredump at server exit when path on NFS
Status: ASSIGNED
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
4.2.3
x64 Linux
: P5 normal
: ---
Assigned To: Jeremy Allison
Samba QA Contact
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2016-02-08 16:04 UTC by Heinrich Mislik
Modified: 2016-06-16 10:01 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Heinrich Mislik 2016-02-08 16:04:31 UTC
When the path of a service points to a directory mounted via NFS the following error occurs at disconnect:

[2016/02/01 04:20:37.135774,  0, pid=14590] ../source3/smbd/smbXsrv_tcon.c:974(smbXsrv_tcon_disconne
ct)
  smbXsrv_tcon_disconnect(0x1d2dc409, 'someuser'): set_current_service() failed: NT_STATUS_INTERNAL_ERROR

followed by a core dump. Reason is failing to chdir to the path due to root_squash on NFS. This does not happen every time. It depends on the state of the process. One sure way to produce this error is to immediatly disconnect after connect (something like smbclient -c 'exit').
Comment 1 Jeremy Allison 2016-02-09 01:13:01 UTC
Can you get a stack backtrace and post here for the coredump please.

Even though this is an error condition we shouldn't core.
Comment 2 Heinrich Mislik 2016-02-09 10:06:16 UTC
[2016/02/09 11:01:07.965081,  0, pid=11729] ../source3/smbd/smbXsrv_tcon.c:974(smbXsrv_tcon_disconne
ct)
  smbXsrv_tcon_disconnect(0x9f9fe1f1, 'someuser'): set_current_service() failed: NT_STATUS_INTERNAL_ERROR
[2016/02/09 11:01:07.965277,  0, pid=11729] ../source3/smbd/smbXsrv_tcon.c:1023(smbXsrv_tcon_disconn
ect_all)
  smbXsrv_tcon_disconnect_all: count[1] errors[1] first[NT_STATUS_INTERNAL_ERROR]
[2016/02/09 11:01:07.965315,  0, pid=11729] ../source3/smbd/server_exit.c:147(exit_server_common)
  Server exit (normal exit)
[2016/02/09 11:01:07.965366,  0, pid=11729] ../source3/smbd/server_exit.c:150(exit_server_common)
  exit_server_common: smb1srv_tcon_disconnect_all() failed (NT_STATUS_INTERNAL_ERROR) - triggering cleanup
[2016/02/09 11:01:07.965814,  0, pid=11729] ../source3/lib/util.c:788(smb_panic_s3)
  PANIC (pid 11729): smb1srv_tcon_disconnect_all failed
[2016/02/09 11:01:07.966465,  0, pid=11729] ../source3/lib/util.c:899(log_stack_trace)
  BACKTRACE: 22 stack frames:
   #0 /lib64/libsmbconf.so.0(log_stack_trace+0x1a) [0x7f30cd7bdcea]
   #1 /lib64/libsmbconf.so.0(smb_panic_s3+0x20) [0x7f30cd7bddc0]
   #2 /lib64/libsamba-util.so.0(smb_panic+0x2f) [0x7f30cf60d87f]
   #3 /usr/lib64/samba/libsmbd-base-samba4.so(+0x15f755) [0x7f30cf214755]
   #4 /usr/lib64/samba/libsmbd-base-samba4.so(+0x15fa4e) [0x7f30cf214a4e]
   #5 /usr/lib64/samba/libsmbd-shim-samba4.so(exit_server_cleanly+0x12) [0x7f30cd16cba2]
   #6 /usr/sbin/smbd(+0xa00a) [0x7f30cfc7100a]
   #7 /lib64/libsmbconf.so.0(+0x295dd) [0x7f30cd7c65dd]
   #8 /lib64/libtevent.so.0(tevent_common_loop_immediate+0xd4) [0x7f30cc1fc924]
   #9 /lib64/libsmbconf.so.0(run_events_poll+0x3c) [0x7f30cd7d37ec]
   #10 /lib64/libsmbconf.so.0(+0x36ad4) [0x7f30cd7d3ad4]
   #11 /lib64/libtevent.so.0(_tevent_loop_once+0x8d) [0x7f30cc1fc0fd]
   #12 /lib64/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7f30cc1fc29b]
   #13 /usr/lib64/samba/libsmbd-base-samba4.so(smbd_process+0x6d9) [0x7f30cf1e5cb9]
   #14 /usr/sbin/smbd(+0xae44) [0x7f30cfc71e44]
   #15 /lib64/libsmbconf.so.0(run_events_poll+0x16c) [0x7f30cd7d391c]
   #16 /lib64/libsmbconf.so.0(+0x36b70) [0x7f30cd7d3b70]
   #17 /lib64/libtevent.so.0(_tevent_loop_once+0x8d) [0x7f30cc1fc0fd]
   #18 /lib64/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7f30cc1fc29b]
   #19 /usr/sbin/smbd(main+0x1501) [0x7f30cfc6e361]
   #20 /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f30cbe58b15]
   #21 /usr/sbin/smbd(+0x7765) [0x7f30cfc6e765]
Comment 3 Jeremy Allison 2016-02-09 19:29:24 UTC
Ok, this isn't the trace I needed, but thank for trying.

To get the trace, add the line:

panic action = /bin/sleep 999999

into the [global] section of your smb.conf.

Reproduce the problem. Use ps to look for the sleep process, then find the pid
of its parent.

Then type:

gdb -p <pid-of-crashed-process> /usr/sbin/smbd

Then type "bt" to get the backtrace. Ensure you have symbols loaded.

Thanks !
Comment 4 Heinrich Mislik 2016-02-10 15:38:11 UTC
#0  0x00007f36a773003c in waitpid () from /lib64/libc.so.6
#1  0x00007f36a76b5092 in do_system () from /lib64/libc.so.6
#2  0x00007f36a8ff9e11 in smb_panic_s3 (why=<optimized out>) at ../source3/lib/util.c:801
#3  0x00007f36aae498cf in smb_panic (
    why=why@entry=0x7f36aab727f0 "smb1srv_tcon_disconnect_all failed") at ../lib/util/fault.c:166
#4  0x00007f36aaa50785 in exit_server_common (how=SERVER_EXIT_ABNORMAL, 
    how@entry=SERVER_EXIT_NORMAL, reason=0x7f36aab727f0 "smb1srv_tcon_disconnect_all failed")
    at ../source3/smbd/server_exit.c:243
#5  0x00007f36aaa50a7e in smbd_exit_server_cleanly (explanation=<optimized out>)
    at ../source3/smbd/server_exit.c:266
#6  0x00007f36a89a8ba2 in exit_server_cleanly (
    reason=reason@entry=0x7f36aab5814a "failed to receive smb request")
    at ../source3/lib/smbd_shim.c:131
#7  0x00007f36aaa20ac4 in smbd_server_connection_read_handler (xconn=0x7f36abd5db90, fd=39)
    at ../source3/smbd/process.c:2522
#8  0x00007f36a900f91c in run_events_poll (ev=0x7f36abd44c10, pollrtn=<optimized out>, 
    pfds=0x7f36abd4d070, num_pfds=4) at ../source3/lib/events.c:257
#9  0x00007f36a900fb70 in s3_event_loop_once (ev=0x7f36abd44c10, location=<optimized out>)
    at ../source3/lib/events.c:326
#10 0x00007f36a7a380fd in _tevent_loop_once (ev=ev@entry=0x7f36abd44c10, 
    location=location@entry=0x7f36aab5ab70 "../source3/smbd/process.c:3997") at ../tevent.c:533
#11 0x00007f36a7a3829b in tevent_common_loop_wait (ev=0x7f36abd44c10, 
    location=0x7f36aab5ab70 "../source3/smbd/process.c:3997") at ../tevent.c:637
#12 0x00007f36aaa21ce9 in smbd_process (ev_ctx=ev_ctx@entry=0x7f36abd44c10, 
    msg_ctx=msg_ctx@entry=0x7f36abd44d00, sock_fd=sock_fd@entry=39, 
    interactive=interactive@entry=false) at ../source3/smbd/process.c:3997
#13 0x00007f36ab4ade44 in smbd_accept_connection (ev=0x7f36abd44c10, fde=<optimized out>, 
    flags=<optimized out>, private_data=<optimized out>) at ../source3/smbd/server.c:627
#14 0x00007f36a900f91c in run_events_poll (ev=0x7f36abd44c10, pollrtn=<optimized out>, 
    pfds=0x7f36abd4d070, num_pfds=6) at ../source3/lib/events.c:257
#15 0x00007f36a900fb70 in s3_event_loop_once (ev=0x7f36abd44c10, location=<optimized out>)
    at ../source3/lib/events.c:326
#16 0x00007f36a7a380fd in _tevent_loop_once (ev=ev@entry=0x7f36abd44c10, 
    location=location@entry=0x7f36ab4b0847 "../source3/smbd/server.c:985") at ../tevent.c:533
#17 0x00007f36a7a3829b in tevent_common_loop_wait (ev=0x7f36abd44c10, 
    location=0x7f36ab4b0847 "../source3/smbd/server.c:985") at ../tevent.c:637
#18 0x00007f36ab4aa361 in smbd_parent_loop (parent=<optimized out>, ev_ctx=0x7f36abd44c10)
    at ../source3/smbd/server.c:985
#19 main (argc=<optimized out>, argv=<optimized out>) at ../source3/smbd/server.c:1626
Comment 5 Michael Geiger 2016-06-16 10:01:53 UTC
Hi everybody, 

I can confirm this issue. 

Two machines with RHEL 7.2 and Samba Version 4.2.10 has the same problem. 

If the chdir to a folder is failing because of an permission denied error (User tries to access with root, but NFS is exported with root_squash); smbd throw an NT_STATUS_INTERNAL_ERROR then NT_STATUS_IO_TIMEOUT and do an core dump. 


Here is our log with log level = 10:

[2016/06/15 17:01:30.017192,  5, pid=50150, effective(0, 0), real(0, 0)] ../libcli/security/security_token.c:53(security_token_debug)
  Security token: (NULL)
[2016/06/15 17:01:30.017198,  5, pid=50150, effective(0, 0), real(0, 0)] ../source3/auth/token_util.c:639(debug_unix_user_token)
  UNIX token of user 0
  Primary group is 0 and contains 0 supplementary groups
[2016/06/15 17:01:30.017210,  5, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/uid.c:425(smbd_change_to_root_user)
  change_to_root_user: now uid=(0,0) gid=(0,0)
[2016/06/15 17:01:30.017217,  2, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/service.c:1138(close_cnum)
  <ip> (ipv4:<ip>:57755) closed connection to service copra
[2016/06/15 17:01:30.017230,  4, pid=50150, effective(0, 0), real(0, 0), class=vfs] ../source3/smbd/vfs.c:858(vfs_ChDir)
  vfs_ChDir to /
[2016/06/15 17:01:30.017241,  4, pid=50150, effective(0, 0), real(0, 0), class=vfs] ../source3/smbd/vfs.c:869(vfs_ChDir)
  vfs_ChDir got /
[2016/06/15 17:01:30.017248,  4, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/sec_ctx.c:316(set_sec_ctx)
  setting sec ctx (0, 0) - sec_ctx_stack_ndx = 0
[2016/06/15 17:01:30.017255,  5, pid=50150, effective(0, 0), real(0, 0)] ../libcli/security/security_token.c:53(security_token_debug)
  Security token: (NULL)
[2016/06/15 17:01:30.017260,  5, pid=50150, effective(0, 0), real(0, 0)] ../source3/auth/token_util.c:639(debug_unix_user_token)
  UNIX token of user 0
  Primary group is 0 and contains 0 supplementary groups
[2016/06/15 17:01:30.017270,  5, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/uid.c:425(smbd_change_to_root_user)
  change_to_root_user: now uid=(0,0) gid=(0,0)
[2016/06/15 17:01:30.017292,  5, pid=50150, effective(0, 0), real(0, 0)] ../lib/dbwrap/dbwrap.c:178(dbwrap_check_lock_order)
  check lock order 1 for /var/lib/samba/smbXsrv_tcon_global.tdb
[2016/06/15 17:01:30.017301, 10, pid=50150, effective(0, 0), real(0, 0)] ../lib/dbwrap/dbwrap.c:133(debug_lock_order)
  lock order:  1:/var/lib/samba/smbXsrv_tcon_global.tdb 2:<none> 3:<none>
[2016/06/15 17:01:30.017310, 10, pid=50150, effective(0, 0), real(0, 0)] ../lib/dbwrap/dbwrap_tdb.c:60(db_tdb_log_key)
  Locking key A03AAA59
[2016/06/15 17:01:30.017320, 10, pid=50150, effective(0, 0), real(0, 0)] ../lib/dbwrap/dbwrap_tdb.c:144(db_tdb_fetch_locked_internal)
  Allocated locked data 0x0x7fe263673060
[2016/06/15 17:01:30.017341, 10, pid=50150, effective(0, 0), real(0, 0)] ../lib/dbwrap/dbwrap_tdb.c:60(db_tdb_log_key)
  Unlocking key A03AAA59
[2016/06/15 17:01:30.017349,  5, pid=50150, effective(0, 0), real(0, 0)] ../lib/dbwrap/dbwrap.c:146(dbwrap_lock_order_state_destructor)
  release lock order 1 for /var/lib/samba/smbXsrv_tcon_global.tdb
[2016/06/15 17:01:30.017356, 10, pid=50150, effective(0, 0), real(0, 0)] ../lib/dbwrap/dbwrap.c:133(debug_lock_order)
  lock order:  1:<none> 2:<none> 3:<none>
[2016/06/15 17:01:30.017370,  4, pid=50150, effective(0, 0), real(0, 0), class=vfs] ../source3/smbd/vfs.c:858(vfs_ChDir)
  vfs_ChDir to /path/to/share
[2016/06/15 17:01:30.020742,  4, pid=50150, effective(0, 0), real(0, 0), class=vfs] ../source3/smbd/vfs.c:858(vfs_ChDir)
  vfs_ChDir to /path/to/share
[2016/06/15 17:01:30.020768,  3, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/service.c:198(set_current_service)
  chdir (/path/to/share) failed, reason: Permission denied
[2016/06/15 17:01:30.020794,  0, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/smbXsrv_tcon.c:974(smbXsrv_tcon_disconnect)
  smbXsrv_tcon_disconnect(0xa03aaa59, 'sharename'): set_current_service() failed: NT_STATUS_INTERNAL_ERROR
[2016/06/15 17:01:30.020877,  0, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/smbXsrv_tcon.c:1023(smbXsrv_tcon_disconnect_all)
  smbXsrv_tcon_disconnect_all: count[3] errors[1] first[NT_STATUS_INTERNAL_ERROR]
[2016/06/15 17:01:30.020894,  0, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/smbXsrv_session.c:1595(smbXsrv_session_logoff)
  smbXsrv_session_logoff(0x59ea7b40): smb2srv_tcon_disconnect_all() failed: NT_STATUS_INTERNAL_ERROR
[2016/06/15 17:01:30.020918,  0, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/smbXsrv_session.c:1645(smbXsrv_session_logoff_all)
  smbXsrv_session_logoff_all: count[1] errors[1] first[NT_STATUS_INTERNAL_ERROR]
[2016/06/15 17:01:30.020934,  0, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/server_exit.c:158(exit_server_common)
  Server exit (NT_STATUS_IO_TIMEOUT)
[2016/06/15 17:01:30.020951,  0, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/server_exit.c:161(exit_server_common)
  exit_server_common: smbXsrv_session_logoff_all() failed (NT_STATUS_INTERNAL_ERROR) - triggering cleanup
[2016/06/15 17:01:30.021122,  4, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/sec_ctx.c:316(set_sec_ctx)
  setting sec ctx (0, 0) - sec_ctx_stack_ndx = 0
[2016/06/15 17:01:30.021161,  5, pid=50150, effective(0, 0), real(0, 0)] ../libcli/security/security_token.c:53(security_token_debug)
  Security token: (NULL)
[2016/06/15 17:01:30.021180,  5, pid=50150, effective(0, 0), real(0, 0)] ../source3/auth/token_util.c:639(debug_unix_user_token)
  UNIX token of user 0
  Primary group is 0 and contains 0 supplementary groups
[2016/06/15 17:01:30.021224,  5, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/uid.c:425(smbd_change_to_root_user)
  change_to_root_user: now uid=(0,0) gid=(0,0)
[2016/06/15 17:01:30.021263, 10, pid=50150, effective(0, 0), real(0, 0)] ../source3/smbd/notify_internal.c:181(notify_context_destructor)
  notify_context_destructor called
[2016/06/15 17:01:30.021281,  5, pid=50150, effective(0, 0), real(0, 0)] ../source3/lib/messages.c:431(messaging_deregister)
  Deregistering messaging pointer for type 784 - private_data=0x7fe263667fa0
[2016/06/15 17:01:30.023156,  0, pid=50150, effective(0, 0), real(0, 0)] ../source3/lib/util.c:788(smb_panic_s3)
  PANIC (pid 50150): smbXsrv_session_logoff_all failed
[2016/06/15 17:01:30.024325,  0, pid=50150, effective(0, 0), real(0, 0)] ../source3/lib/util.c:899(log_stack_trace)
  BACKTRACE: 21 stack frames:
   #0 /lib64/libsmbconf.so.0(log_stack_trace+0x1a) [0x7fe260e850ba]
   #1 /lib64/libsmbconf.so.0(smb_panic_s3+0x20) [0x7fe260e85190]
   #2 /lib64/libsamba-util.so.0(smb_panic+0x2f) [0x7fe262d0991f]
   #3 /usr/lib64/samba/libsmbd-base-samba4.so(+0x15fe45) [0x7fe26290fe45]
   #4 /usr/lib64/samba/libsmbd-base-samba4.so(+0x16013e) [0x7fe26291013e]
   #5 /usr/lib64/samba/libsmbd-shim-samba4.so(exit_server_cleanly+0x12) [0x7fe260833ba2]
   #6 /usr/lib64/samba/libsmbd-base-samba4.so(smbd_server_connection_terminate_ex+0x20) [0x7fe2628efb60]
   #7 /usr/lib64/samba/libsmbd-base-samba4.so(+0x143b2e) [0x7fe2628f3b2e]
   #8 /lib64/libsmbconf.so.0(run_events_poll+0x16c) [0x7fe260e9af5c]
   #9 /lib64/libsmbconf.so.0(+0x371b0) [0x7fe260e9b1b0]
   #10 /lib64/libtevent.so.0(_tevent_loop_once+0x8d) [0x7fe25f8b736d]
   #11 /lib64/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7fe25f8b750b]
   #12 /usr/lib64/samba/libsmbd-base-samba4.so(smbd_process+0x6d9) [0x7fe2628e1429]
   #13 /usr/sbin/smbd(+0xae44) [0x7fe26336de44]
   #14 /lib64/libsmbconf.so.0(run_events_poll+0x16c) [0x7fe260e9af5c]
   #15 /lib64/libsmbconf.so.0(+0x371b0) [0x7fe260e9b1b0]
   #16 /lib64/libtevent.so.0(_tevent_loop_once+0x8d) [0x7fe25f8b736d]
   #17 /lib64/libtevent.so.0(tevent_common_loop_wait+0x1b) [0x7fe25f8b750b]
   #18 /usr/sbin/smbd(main+0x1501) [0x7fe26336a361]
   #19 /lib64/libc.so.6(__libc_start_main+0xf5) [0x7fe25f512b15]
   #20 /usr/sbin/smbd(+0x7765) [0x7fe26336a765]
[2016/06/15 17:01:30.027043,  0, pid=50150, effective(0, 0), real(0, 0)] ../source3/lib/dumpcore.c:318(dump_core)
  dumping core in /var/log/samba/cores/smbd