Bug 11876 - Dumping core when a user attempts to rm or mv a file
Summary: Dumping core when a user attempts to rm or mv a file
Status: RESOLVED WONTFIX
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services
Version: 4.3.8
Hardware: Other Linux
Importance: P5 critical
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
Reported: 2016-04-26 18:09 UTC by rb
Modified: 2016-05-10 19:30 UTC

Attachments
Samba core dump syslog (3.04 KB, text/plain)
2016-04-28 00:42 UTC, rb
Core dump (680.33 KB, application/gzip)
2016-04-29 14:02 UTC, rb

Description rb 2016-04-26 18:09:25 UTC
Hardware: armv7
OS: Ubuntu 14.04
Kernel: Linux southport-nas 3.10.69 #1 SMP PREEMPT Thu Feb 12 15:22:14 BRST 2015 armv7l armv7l armv7l GNU/Linux
Version: Samba version 4.3.8-Ubuntu

Users can access, read from, and write to shares without problems.  Any attempt to rm or mv a file causes an smbd core dump:

Apr 26 12:08:35 southport-nas smbd[17375]: [2016/04/26 12:08:35.414229,  0] ../source3/lib/popt_common.c:68(popt_s3_talloc_log_fn)
Apr 26 12:08:35 southport-nas smbd[17375]:   talloc: access after free error - first free may be at ../source3/smbd/open.c:3248
Apr 26 12:08:35 southport-nas smbd[17375]: [2016/04/26 12:08:35.414541,  0] ../source3/lib/popt_common.c:68(popt_s3_talloc_log_fn)
Apr 26 12:08:35 southport-nas smbd[17375]:   Bad talloc magic value - access after free
Apr 26 12:08:35 southport-nas smbd[17375]: [2016/04/26 12:08:35.414722,  0] ../source3/lib/util.c:789(smb_panic_s3)
Apr 26 12:08:35 southport-nas smbd[17375]:   PANIC (pid 17375): Bad talloc magic value - access after free
Apr 26 12:08:35 southport-nas smbd[17375]: [2016/04/26 12:08:35.415561,  0] ../source3/lib/util.c:900(log_stack_trace)
Apr 26 12:08:35 southport-nas smbd[17375]:   BACKTRACE: 0 stack frames:
Apr 26 12:08:35 southport-nas smbd[17375]: [2016/04/26 12:08:35.416061,  0] ../source3/lib/util.c:801(smb_panic_s3)
Apr 26 12:08:35 southport-nas smbd[17375]:   smb_panic(): calling panic action [/usr/share/samba/panic-action 17375]
Apr 26 12:08:35 southport-nas smbd[17375]: [2016/04/26 12:08:35.430465,  0] ../source3/lib/util.c:809(smb_panic_s3)
Apr 26 12:08:35 southport-nas smbd[17375]:   smb_panic(): action returned status 0
Apr 26 12:08:35 southport-nas smbd[17375]: [2016/04/26 12:08:35.430917,  0] ../source3/lib/dumpcore.c:318(dump_core)
Apr 26 12:08:35 southport-nas smbd[17375]:   dumping core in /var/log/samba/cores/smbd
Apr 26 12:08:35 southport-nas smbd[17375]:
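
The one line in the log above that names a source location is the talloc "first free may be at" hint. As a minimal sketch (the function name first_free_sites is hypothetical, and it assumes a grep that supports -o), the distinct locations and how often each occurs can be pulled out of a syslog or smbd log file like this:

```shell
#!/bin/sh
# Hypothetical helper: list each distinct "first free may be at" location
# found in a log file, with the number of times it occurs.
# Usage: first_free_sites LOGFILE
first_free_sites() {
    # Keep only the matching fragment, drop the fixed prefix, then count.
    grep -o 'first free may be at [^ ]*' "$1" \
        | sed 's/^first free may be at //' \
        | sort | uniq -c | sort -rn
}

# Example (path is an assumption; adjust to the local system):
# first_free_sites /var/log/syslog
```

This only summarises what talloc already logged; it does not replace a full backtrace.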
Comment 1 Christian Ambach 2016-04-28 00:07:15 UTC
Can you post the complete stack trace from the logs please? 
And how are the clients accessing the server?
Comment 2 rb 2016-04-28 00:41:25 UTC
(In reply to Christian Ambach from comment #1)

The client is connected through an x86_64 Ubuntu 14.04.4 workstation running Samba 4.1.6-Ubuntu.

Copy from syslog is attached.

1) I tailed syslog on the arm server while mounting the share on the client.
2) I then cd to the mount point and run "touch test".  The first core dump at 18:31:17 was actually from that command, which I didn't realize before because the file is successfully created.
3) I then try to "rm test".  The client hangs and the file is not removed.  That is the second core dump, at 18:31:22.

NOTE: I built Samba 4.3.8 from source directly on the armhf machine, but I haven't had a chance to install and attach gdb yet.
Comment 3 rb 2016-04-28 00:42:11 UTC
Created attachment 12032 [details]
Samba core dump syslog
Comment 4 rb 2016-04-28 14:49:16 UTC
I built Samba on the arm box from this tag: https://github.com/samba-team/samba/releases/tag/samba-4.3.8

I used this configure line:

./configure --enable-debug --enable-developer --abi-check-disable --with-configdir=/etc/samba --enable-fhs  --prefix=/usr --sysconfdir=/etc --localstatedir=/var

make
sudo make install

I manually:

* Put the upstart scripts from the deb package in /etc/init
* Created smb.conf and smbusers in /etc/samba
* Used smbuser to create the users

Used the same client to mount the share, ran the "touch" and "remove" commands. It works perfectly, no crashes.  I was expecting to recreate the issue for debugging, but I can't recreate it with the binaries that I built.
Comment 5 Christian Ambach 2016-04-29 09:37:52 UTC
So the problem might lie in Ubuntu-specific patches or in the way Ubuntu builds the packages. Unfortunately, the debug info that's needed for a complete stack trace in the logs seems to be missing.

Can you try to go with the original Ubuntu binaries and install the debug info packages so smbd is able to log a complete stack trace (and produce a core file that can be examined)?
Comment 6 rb 2016-04-29 14:02:51 UTC
Created attachment 12047 [details]
Core dump
Comment 7 rb 2016-04-29 14:04:24 UTC
(In reply to Christian Ambach from comment #5)
I didn't see any more stack info in syslog. Am I missing something?

/var/log/samba/log. looks the same:

[2016/04/29 07:56:39.854939,  0] ../source3/lib/popt_common.c:68(popt_s3_talloc_log_fn)
  talloc: access after free error - first free may be at ../source3/smbd/open.c:3248
[2016/04/29 07:56:39.855281,  0] ../source3/lib/popt_common.c:68(popt_s3_talloc_log_fn)
  Bad talloc magic value - access after free
[2016/04/29 07:56:39.855466,  0] ../source3/lib/util.c:789(smb_panic_s3)
  PANIC (pid 9155): Bad talloc magic value - access after free
[2016/04/29 07:56:39.856688,  0] ../source3/lib/util.c:900(log_stack_trace)
  BACKTRACE: 0 stack frames:
[2016/04/29 07:56:39.857176,  0] ../source3/lib/util.c:801(smb_panic_s3)
  smb_panic(): calling panic action [/usr/share/samba/panic-action 9155]
[2016/04/29 07:56:39.871776,  0] ../source3/lib/util.c:809(smb_panic_s3)
  smb_panic(): action returned status 0
[2016/04/29 07:56:39.872235,  0] ../source3/lib/dumpcore.c:318(dump_core)
  dumping core in /var/log/samba/cores/smbd
[2016/04/29 07:56:57.569913,  0] ../source3/lib/popt_common.c:68(popt_s3_talloc_log_fn)
  talloc: access after free error - first free may be at ../source3/smbd/open.c:3248
[2016/04/29 07:56:57.570231,  0] ../source3/lib/popt_common.c:68(popt_s3_talloc_log_fn)
  Bad talloc magic value - access after free
[2016/04/29 07:56:57.570411,  0] ../source3/lib/util.c:789(smb_panic_s3)
  PANIC (pid 9176): Bad talloc magic value - access after free
[2016/04/29 07:56:57.571259,  0] ../source3/lib/util.c:900(log_stack_trace)
  BACKTRACE: 0 stack frames:
[2016/04/29 07:56:57.571652,  0] ../source3/lib/util.c:801(smb_panic_s3)
  smb_panic(): calling panic action [/usr/share/samba/panic-action 9176]
[2016/04/29 07:56:57.585606,  0] ../source3/lib/util.c:809(smb_panic_s3)
  smb_panic(): action returned status 0
[2016/04/29 07:56:57.586064,  0] ../source3/lib/dumpcore.c:318(dump_core)
  dumping core in /var/log/samba/cores/smbd


Attached a core dump.  Don't mean to be thick.  Can you give me explicit instructions if you want something else?  I've never hacked on Samba before.
Comment 8 Christian Ambach 2016-04-29 14:07:05 UTC
The core dump file will probably only be readable on your system.
Did you install the debug info packages of Ubuntu?

Can you attach the output of the following?
gdb /usr/sbin/smbd <corefile>
and, inside gdb, issue the command "bt full".

This should produce a fairly long listing of debugging information that we'll need.
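
For anyone who prefers to capture this non-interactively, here is a minimal sketch using gdb's -batch and -ex options. The function name collect_backtrace and the output file name are hypothetical; the binary and core paths in the usage line are the ones from this report and may differ elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: write a full backtrace for a core file to a text file.
# Usage: collect_backtrace BINARY COREFILE OUTPUT
collect_backtrace() {
    binary=$1
    core=$2
    out=$3
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb is not installed" >&2
        return 1
    fi
    if [ ! -r "$binary" ] || [ ! -r "$core" ]; then
        echo "cannot read $binary or $core" >&2
        return 1
    fi
    # -batch runs the -ex commands and exits; it also disables pagination,
    # so the output is not interrupted by "Type <return> to continue" prompts.
    gdb -batch -ex "bt full" "$binary" "$core" >"$out" 2>&1
}

# Paths taken from this report; adjust to the local system.
# collect_backtrace /usr/sbin/smbd /var/log/samba/cores/smbd/core smbd-bt.txt
```

The resulting text file can then be attached to the bug. Frames from libraries without debug symbols will still show up as "??".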
Comment 9 rb 2016-04-29 14:12:16 UTC
I was already doing so when I got your reply, so I feel vindicated ;-)

root@southport-nas:/var/log/samba# gdb /usr/sbin/smbd cores/smbd/core 
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/smbd...Reading symbols from /usr/lib/debug/.build-id/18/9a04415c1c43ea49856e41582d20047e443974.debug...done.
done.
[New LWP 9176]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `smbd -F'.
Program terminated with signal SIGABRT, Aborted.
#0  __libc_do_syscall ()
    at ../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:44
44	../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S: No such file or directory.
(gdb) 
(gdb) bt full 
#0  __libc_do_syscall ()
    at ../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:44
No locals.
#1  0xb65eaf0e in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
        _a1 = 0
        _a3tmp = 6
        _a1tmp = 0
        _a3 = 6
        _nametmp = 268
        _a2tmp = 9176
        _a2 = 9176
        _name = 268
        _sys_result = <optimized out>
        pd = 0xb571c000
        pid = 0
        selftid = 9176
#2  0xb65ed766 in __GI_abort () at abort.c:89
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, 
          sa_mask = {__val = {3187684912, 3199381856, 3199381752, 3289347840, 
              3044132048, 4, 4, 3199381728, 3066142932, 3066103169, 29, 1, 
              3060432896, 3060432896, 6, 1, 0, 0, 0, 120, 0, 57, 91, 110, 
              119, 124, 3060433456, 0, 3060432896, 1, 335544320, 
              3059661553}}, sa_flags = -1231632376, 
          sa_restorer = 0xb696d444 <corepath>}
        sigs = {__val = {32, 0 <repeats 31 times>}}
#3  0xb6949f68 in dump_core () at ../source3/lib/dumpcore.c:337
        called = true
        __FUNCTION__ = "dump_core"
#4  0xb6be31da in smb_panic_s3 (why=<optimized out>)
    at ../source3/lib/util.c:812
        cmd = 0xb571c000 ""
        result = <optimized out>
        __FUNCTION__ = "smb_panic_s3"
#5  0xb6e80236 in smb_panic (why=<optimized out>) at ../lib/util/fault.c:166
No locals.
#6  0xb66be518 in ?? () from /usr/lib/arm-linux-gnueabihf/libtalloc.so.2
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

The hardware is an odroid-xu4.  A quick Google search turns up many reports of the same error (libc-do-syscall.S) with various binaries on armv7.  Still researching, as some patches to other code seem to have worked around the issue.
Comment 10 Christian Ambach 2016-05-01 12:35:30 UTC
(In reply to rb from comment #9)
>#6  0xb66be518 in ?? () from /usr/lib/arm-linux-gnueabihf/libtalloc.so.2
>No symbol table info available.
>Backtrace stopped: previous frame identical to this frame (corrupt stack?)

You probably also need to install the debug info for libtalloc.
Until then, the important parts of the stack trace are still missing, I am afraid.
Comment 11 rb 2016-05-01 20:33:09 UTC
Well, it seems that the bug must have been fixed between libtalloc2 2.1.0 and 2.1.5.

Here's what was installed:

root@southport-nas:~$ dpkg -l | grep libtalloc
ii  libtalloc2:armhf                      2.1.0-1                                              armhf        hierarchical pool based memory allocator

Then I went to install libtalloc2-dbg and it also upgraded libtalloc2 to 2.1.5 (when was this released?)

root@southport-nas:~$ apt-get install libtalloc2-dbg
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libuser1 python-libuser
Use 'apt-get autoremove' to remove them.
The following extra packages will be installed:
  libtalloc2
The following NEW packages will be installed:
  libtalloc2-dbg
The following packages will be upgraded:
  libtalloc2
1 upgraded, 1 newly installed, 0 to remove and 197 not upgraded.
Need to get 87.6 kB of archives.
After this operation, 275 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://ports.ubuntu.com/ubuntu-ports/ trusty-updates/main libtalloc2 armhf 2.1.5-0ubuntu0.14.04.1 [26.8 kB]
Get:2 http://ports.ubuntu.com/ubuntu-ports/ trusty-updates/main libtalloc2-dbg armhf 2.1.5-0ubuntu0.14.04.1 [60.7 kB]
Fetched 87.6 kB in 1s (75.7 kB/s)         
(Reading database ... 172018 files and directories currently installed.)
Preparing to unpack .../libtalloc2_2.1.5-0ubuntu0.14.04.1_armhf.deb ...
Unpacking libtalloc2:armhf (2.1.5-0ubuntu0.14.04.1) over (2.1.0-1) ...
Selecting previously unselected package libtalloc2-dbg.
Preparing to unpack .../libtalloc2-dbg_2.1.5-0ubuntu0.14.04.1_armhf.deb ...
Unpacking libtalloc2-dbg (2.1.5-0ubuntu0.14.04.1) ...
Setting up libtalloc2:armhf (2.1.5-0ubuntu0.14.04.1) ...
Setting up libtalloc2-dbg (2.1.5-0ubuntu0.14.04.1) ...
Processing triggers for libc-bin (2.19-0ubuntu6.6) ...

Restarted smbd, remounted from the client, did a "touch" and a "rm" and no crash.

If you want to verify, I would have to revert libtalloc2 and libtalloc2-dbg to v2.1.0.

Let me know and I'll try to find the armhf deb packages.

If not, then users who run across this bug would somehow have to be alerted to update libtalloc.
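
As a rough way for other users to check whether they are still on the older library, here is a minimal sketch. It assumes a Debian/Ubuntu system with dpkg-query and a sort that supports -V; the function name version_ge is hypothetical, and 2.1.5 is simply the version that worked in this report, not a confirmed minimum.

```shell
#!/bin/sh
# Hypothetical helper: succeed if upstream version $1 >= upstream version $2.
# Strips the Debian revision (everything from the first '-'); epochs are
# not handled.
version_ge() {
    a=${1%%-*}
    b=${2%%-*}
    # sort -V puts the lower version first; if that is $b, then $a >= $b.
    [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$b" ]
}

# Installed version as reported by dpkg; empty if the package or dpkg is absent.
installed=$(dpkg-query -W -f='${Version}\n' libtalloc2 2>/dev/null | head -n 1)

if [ -z "$installed" ]; then
    echo "libtalloc2 not found via dpkg-query"
elif version_ge "$installed" "2.1.5"; then
    echo "libtalloc2 $installed: at or above 2.1.5"
else
    echo "libtalloc2 $installed: older than 2.1.5, consider upgrading"
fi
```

With the versions seen above, 2.1.0-1 would be reported as older and 2.1.5-0ubuntu0.14.04.1 as at or above 2.1.5.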
Comment 12 Christian Ambach 2016-05-10 16:24:14 UTC
I don't know if it's worth finding out which particular change in libtalloc fixed the core dumps that you have seen.
You should report this to your distro so that they raise the version of libtalloc that their Samba packages depend on, to spare other users the same problem. I think from our side, this bug can be regarded as already fixed.
Comment 13 rb 2016-05-10 19:30:08 UTC
Agreed.  Thanks.
