Bug 8726 - SMB2 protocol crashes smbd inconsistently without any reasons
Status: REOPENED
Product: Samba 3.6
Classification: Unclassified
Component: SMB2
Version: 3.6.2
Hardware: x64 FreeBSD
Importance: P5 normal
Target Milestone: ---
Assigned To: Jeremy Allison
QA Contact: Samba QA Contact
Reported: 2012-01-26 22:18 UTC by Volodymyr
Modified: 2013-08-28 00:10 UTC (History)

Attachments
Log file with loglevel=10 and crash in the end. (1.45 MB, application/octet-stream)
2012-01-26 22:18 UTC, Volodymyr

Description Volodymyr 2012-01-26 22:18:41 UTC
Created attachment 7258
Log file with loglevel=10 and crash in the end.

With the SMB2 protocol enabled, smbd crashes intermittently. Tested on Samba 3.6.1 and the new 3.6.2. The filesystem is ZFS. The configuration is:
[global]
    encrypt passwords = yes
    passdb backend=smbpasswd
    dns proxy = no
    max log size = 1000000
    netbios name = zipper-share
    workgroup = WORKGROUP
    security = user
    create mask = 0666
    create mask = 0666
    directory mask = 0777
    client ntlmv2 auth = yes
    dos charset = CP866
    unix charset = UTF-8
    log level = 1
    log file = /var/log/samba/smbd.%m
    socket options=IPTOS_THROUGHPUT IPTOS_LOWDELAY TCP_NODELAY
    use sendfile = no
    aio read size = 16384
    aio write size = 16384
    load printers = no
    oplocks = yes
    max xmit = 65435
    deadtime = 5
    large readwrite = no
    max protocol = smb2

For testing purposes, I changed the log level to 10 and attached the whole log file to this report.

The client is Windows 7 Ultimate. The crash happens from time to time. Without SMB2 enabled, everything works fine.

PC: Intel MB, Core2Duo, 4 GB RAM, raidz1 of 3 x 2 TB Seagate (Device Model: ST2000NM0011)
OS: FreeBSD 9.0-RELEASE FreeBSD 9.0-RELEASE #0, default kernel, zfs and aio loaded as modules.
Network: em0@pci0:0:25:0:        class=0x020000 card=0x50028086 chip=0x10cd8086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82567LF-2 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
Comment 1 Jeremy Allison 2012-01-26 22:35:06 UTC
Looks like it's going down inside smbd_aio_complete_aio_ex().

I need a good stack backtrace from you. Set:

panic action = /bin/sleep 999999

in the [global] section of your smb.conf. When smbd crashes, the crashed smbd will be the parent of the sleep process. Attach to it with gdb and type "bt" to get a backtrace - then post that to this bug.

In the meantime for production use try turning off aio by removing the lines:

    aio read size = 16384
    aio write size = 16384

from your smb.conf. I bet that fixes the problem. Also remove:

socket options=IPTOS_THROUGHPUT IPTOS_LOWDELAY TCP_NODELAY
max xmit = 65435
large readwrite = no
encrypt passwords = yes

as these are legacy parameter settings that make no sense on modern Samba.

Jeremy.
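
The panic action advice above relies on the crashed process staying alive while the configured command runs. The following is a minimal sketch of that idea, not Samba's code: the name demo_run_panic_action is hypothetical, and it assumes the command is run through a blocking call such as system(), so that the caller remains the parent of the command until it exits.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not a Samba function): run a "panic action"
 * command from the faulting process and wait for it to finish.
 * system() blocks until the child exits, so the caller stays alive
 * as the parent of the command, which is what lets a debugger
 * attach to the caller in the meantime.
 *
 * Returns the command's exit status, or -1 if no command was given,
 * it could not be run, or it was killed by a signal.
 */
static int demo_run_panic_action(const char *cmd)
{
        int status;

        if (cmd == NULL || cmd[0] == '\0') {
                return -1;      /* no panic action configured */
        }

        status = system(cmd);
        if (status == -1 || !WIFEXITED(status)) {
                return -1;
        }
        return WEXITSTATUS(status);
}
```

With a command such as "/bin/sleep 999999", the call would not return for the duration of the sleep, leaving plenty of time to find the parent process and attach gdb to it.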
Comment 2 Volodymyr 2012-01-30 08:16:26 UTC
I'm not sure this is what you need, but anyway:

# ./gdb73.1 -pid 68695
GNU gdb (GDB) 7.3.1 [GDB v7.3.1 for FreeBSD]
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd9.0".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Attaching to process 68695
Reading symbols from /usr/local/sbin/smbd...(no debugging symbols found)...done.
Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols found)...done.
Loaded symbols for /libexec/ld-elf.so.1
0x00000008030f683a in ?? () from /libexec/ld-elf.so.1
(gdb) bt
#0  0x00000008030f683a in ?? () from /libexec/ld-elf.so.1
#1  0x00000008030b88ce in ?? ()
#2  0x0000000000000000 in ?? ()

If I need to recompile Samba with debugging, please tell me how to do this.
Comment 3 Jeremy Allison 2012-01-30 19:24:58 UTC
No, as you guessed, that doesn't help, sorry :-). When compiling Samba, make sure the -g flag is passed to the compiler and linker so that debug symbols are included.

Jeremy.
Comment 4 Volodymyr 2012-02-03 14:04:44 UTC
After adding the -g option, I ran into trouble:

[2012/01/31 10:35:22.410002,  0] lib/fault.c:51(fault_report)
  ===============================================================
[2012/01/31 10:35:22.436885,  0] lib/fault.c:52(fault_report)
  INTERNAL ERROR: Signal 11 in pid 20732 (3.6.2)
  Please read the Trouble-Shooting section of the Samba3-HOWTO
[2012/01/31 10:35:22.436947,  0] lib/fault.c:54(fault_report)

  From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
[2012/01/31 10:35:22.437008,  0] lib/fault.c:55(fault_report)
  ===============================================================
[2012/01/31 10:35:22.437047,  0] lib/util.c:1111(smb_panic)
  smb_panic: clobber_region() last called from [sub_set_smb_name(187)]
[2012/01/31 10:35:22.437087,  0] lib/util.c:1117(smb_panic)
  PANIC (pid 20732): internal error


and the panic action cannot be run.
The config is:
[global]
    passdb backend=smbpasswd
    dns proxy = no
    max log size = 1000000
    netbios name = zipper-share
    workgroup = WORKGROUP
    security = user
    create mask = 0666
    create mask = 0666
    directory mask = 0777
    client ntlmv2 auth = yes
    dos charset = CP866
    unix charset = UTF-8
    log level = 10
    panic action = /bin/sleep 999999
    log file = /var/log/samba/smbd.%m
    socket options=IPTOS_THROUGHPUT TCP_NODELAY SO_SNDBUF=65435 SO_RCVBUF=65435
    use sendfile = no
    aio read size = 16384
    aio write size = 16384
    load printers = no
    oplocks = yes
    max xmit = 65435
    deadtime = 5
    large readwrite = no
    max protocol = smb2
Comment 5 Volodymyr 2012-02-03 14:05:47 UTC
I haven't disabled the settings you suggested, so that we can finally track this bug down.
Comment 6 Adam Grigolato 2012-08-26 15:00:09 UTC
I also have a very similar, if not the same, issue:
Samba 3.6.7
FreeBSD 9
AMD64, ZFS file system.

With SMB2 enabled I get random crashes, especially when transferring large files from
the Samba server to a Windows box.

I will attempt to get a backtrace tomorrow and upload it.
Comment 7 Jeremy Allison 2012-10-09 00:19:17 UTC
Still waiting for backtrace...
Comment 8 Matthew Trent 2012-10-11 15:27:39 UTC
A "me too" probably isn't super productive here, but my usage of FreeBSD is limited to embedded/appliance stuff which doesn't have debugging tools...

I have also experienced this crash regularly when transferring files when SMB2 and AIO are enabled together on both FreeBSD 8.3 and 9.x based systems.

The issue is noted in the NAS4Free release notes (at the bottom):
http://sourceforge.net/projects/nas4free/files/NAS4Free-9.1.0.1/9.1.0.1.344/

I will work on getting gdb loaded and get a stack trace. Thanks for your instructions on the procedure, Jeremy.
Comment 9 Jeremy Allison 2012-10-11 17:49:13 UTC
If you can get any FreeNAS developer to contact me directly about this I'll work on getting it fixed for them.

FreeNAS is an incredibly important platform for us and I know of no outstanding bugs with *BSD, Samba SMB2 and AIO. Obviously I'm wrong, but they haven't reported it to us.

If you have a FreeNAS contact name that would do also.

Jeremy
Comment 10 Matthew Trent 2012-10-11 18:15:03 UTC
(In reply to comment #9)
> If you can get any FreeNAS developer to contact me directly about this I'll
> work on getting it fixed for them.
> 
> FreeNAS is an incredibly important platform for us and I know of no outstanding
> bugs with *BSD, Samba SMB2 and AIO. Obviously I'm wrong, but they haven't
> reported it to us.
> 
> If you have a FreeNAS contact name that would do also.
> 
> Jeremy

Thanks for your reply, Jeremy. I've posted in the NAS4Free forum referencing this bugzilla entry. I'll do that in the FreeNAS forum as well, since both NAS4Free and FreeNAS exhibit this behavior. It seems fairly universal, but under the radar, as AIO + SMB2 is not the default config.

See also this forum post where this issue is mentioned as affecting FreeNAS:
http://hardforum.com/showpost.php?p=1038365665&postcount=18
Also here, although they didn't make the AIO connection:
http://forums.freenas.org/showthread.php?6960-Copy-from-FreeNAS-via-CIFS-fails&highlight=smb2
Comment 11 Josh Paetzel 2012-10-12 19:26:55 UTC
(In reply to comment #9)
> If you can get any FreeNAS developer to contact me directly about this I'll
> work on getting it fixed for them.
> 
> FreeNAS is an incredibly important platform for us and I know of no outstanding
> bugs with *BSD, Samba SMB2 and AIO. Obviously I'm wrong, but they haven't
> reported it to us.
> 
> If you have a FreeNAS contact name that would do also.
> 
> Jeremy

Heya Jeremy, I'm the lead FreeNAS developer.  What can I do to help out?
Comment 12 Jeremy Allison 2012-10-12 20:05:46 UTC
Thanks for getting back to me.

I'd like you to give me some more info on what is failing in SMB2 when you turn on AIO. Stack backtraces etc.

Also please log bugs directly on this bugzilla for FreeNAS issues.

Jeremy.
Comment 13 Matthew Trent 2012-10-12 20:31:06 UTC
FYI, steps to reproduce on FreeNAS 8.x:

Add "max protocol = smb2" to Samba auxiliary parameters, and enable AIO. Restart service/reconnect client.

Start transferring large (multi-GB) files back and forth with a FreeNAS CIFS share. Within a few seconds or minutes of heavy transfers, the transfer aborts, smbd exits on signal 6, and errors appear in the smbd logs. The crash seems to happen more (always?) when reading files from the server, rather than writing.

I'm using Windows 2008 R2 or Windows 7 as client, and a Dell Xeon E5310 1.6 with 12GB and FreeNAS 8.x 64-bit for server. Backing filesystem is ZFS.
Comment 14 Jeremy Allison 2012-10-12 20:34:42 UTC
Can you get me a stack backtrace with symbols and post it here ?

Easiest way is to set:

panic action = /bin/sleep 999999

in the [global] section of the smb.conf, reproduce the crash and then attach to the parent of the sleep process with gdb, then get the backtrace.

Cheers,

Jeremy.
Comment 15 Matthew Trent 2012-10-15 22:30:22 UTC
I've installed a full FreeBSD system and got Samba 3.6.7 compiled and running from the ports tree with the MAX_DEBUG option.

I can reproduce the problem, but although panic action = "/bin/sleep 999999" is definitely set in smb.conf, the "panic action" doesn't seem to run. So the process goes away and I have nothing to which to attach GDB.

One thing I have noticed, while watching 'ps' during a file transfer, is that just before it crashes, a number of [aiodXX] (where XX is different numbers) processes pop up. Before the crash, there are about 5 of the [aiodXX] processes. Then it jumps up to about 30 [aiodXX] processes, and crashes.

Any ideas why "panic action" isn't working? (The /bin/sleep binary exists and works fine.)
Comment 16 Jeremy Allison 2012-10-16 09:14:39 UTC
Ok, that's actually a more serious bug than the AIO problem :-).

We depend on panic action working to get any useful backtraces.

Look inside source3/lib/fault.c and you'll find:

/**
setup our fault handlers
**/
_PUBLIC_ void fault_setup(const char *pname)
{
        if (progname != NULL) {
                return;
        }
        progname = pname;
#ifdef SIGSEGV
        CatchSignal(SIGSEGV, sig_fault);
#endif
#ifdef SIGBUS
        CatchSignal(SIGBUS, sig_fault);
#endif
#ifdef SIGABRT
        CatchSignal(SIGABRT, sig_fault);
#endif
#ifdef SIGFPE
        CatchSignal(SIGFPE, sig_fault);
#endif
}

Is there the equivalent of a /proc entry you can look at to ensure the signal handlers have been installed correctly on the smbd process? Can you try deliberately introducing a SIGSEGV or other fault (maybe using the kill command) to determine why it's not correctly calling sig_fault on receiving a signal?

Jeremy.
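
As a rough illustration of the check suggested above, the sketch below installs a handler for one of the fault signals, asks the kernel whether a handler is installed, and lets the caller raise the signal deliberately to confirm the handler runs. It only inspects its own process, not a running smbd, and all demo_* names are hypothetical stand-ins rather than Samba functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler so the caller can tell that it actually ran. */
static volatile sig_atomic_t last_fault_signal = 0;

/* Hypothetical stand-in for a fault handler: only records the signal. */
static void demo_sig_fault(int sig)
{
        last_fault_signal = sig;
}

/* Install the demo handler for sig; returns 0 on success, -1 on error. */
static int demo_catch_signal(int sig)
{
        struct sigaction act;

        memset(&act, 0, sizeof(act));
        act.sa_handler = demo_sig_fault;
        sigemptyset(&act.sa_mask);
        act.sa_flags = 0;
        return sigaction(sig, &act, NULL);
}

/*
 * Query the current disposition of sig without changing it.
 * Returns 1 if a handler function is installed, 0 if the disposition
 * is SIG_DFL or SIG_IGN, and -1 if sigaction() itself fails.
 */
static int demo_handler_installed(int sig)
{
        struct sigaction cur;

        if (sigaction(sig, NULL, &cur) != 0) {
                return -1;
        }
        if (cur.sa_handler == SIG_DFL || cur.sa_handler == SIG_IGN) {
                return 0;
        }
        return 1;
}
```

After demo_catch_signal(SIGFPE), a call to raise(SIGFPE) should leave last_fault_signal set to SIGFPE. If it does not, the handler was either never installed or was replaced later, which is the kind of failure being hunted here.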
Comment 17 Matthew Trent 2012-10-17 23:14:52 UTC
Well, the culprit for "panic action" not running was the FreeBSD port's MAX_DEBUG config option. FYI, this sets the following options, at least one of which breaks panic action execution:
.if defined(WITH_MAX_DEBUG)
CPPFLAGS+=              -g
LDFLAGS+=               -g
LIB_DEPENDS+=           dmalloc.1:${PORTSDIR}/devel/dmalloc
CONFIGURE_ARGS+=        --enable-debug \
                        --enable-socket-wrapper --enable-nss-wrapper \
                        --enable-developer --enable-krb5developer \
                        --enable-dmalloc --with-profiling-data

CONFIGURE_ARGS+=        --with-smbtorture4-path=${WRKDIR}/${DISTNAME}/source4/torture


Anyway, I recompiled with just CPPFLAGS+=-g, LDFLAGS+=-g, and --enable-debug. 'file' shows the resulting smbd binary is NOT stripped.

However, I still don't get useful results from gdb. I'm stumped.

[root@pcbsd] /usr/ports/net/samba36# pstree | grep -B1 "sleep 9000"
 |         \-+- 63283 mstrent /usr/local/sbin/smbd -D -s /usr/local/etc/smb.conf
 |           \--- 63370 mstrent /bin/sleep 90000
[root@pcbsd] /usr/ports/net/samba36# gdb --pid=63283 smbd
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Attaching to program: /usr/local/sbin/smbd, process 63283
0x0000000803813a5a in ?? ()
(gdb) bt
#0  0x0000000803813a5a in ?? ()
#1  0x00000008037d144e in ?? ()
#2  0x0000000000000000 in ?? ()
#3  0x0000000000000000 in ?? ()
#4  0x0000000000000000 in ?? ()
#5  0xffffffff00000000 in ?? ()
#6  0x0000000000000000 in ?? ()
#7  0x0000000000000000 in ?? ()
#8  0x0000000000000000 in ?? ()
#9  0xffffffff00000000 in ?? ()
#10 0x0000000000000001 in ?? ()
#11 0x0000000000000000 in ?? ()
#12 0x0000000000000000 in ?? ()
#13 0x0000000800000000 in ?? ()
#14 0x0000000040001480 in ?? ()
#15 0x0000000000000000 in ?? ()
#16 0x0000000000080000 in .dynstr ()
Error accessing memory address 0x51000: Bad address.
(gdb)
Comment 18 Matthew Trent 2012-10-17 23:18:38 UTC
BTW I think Samba bug #8772 may be a duplicate of this one:
https://bugzilla.samba.org/show_bug.cgi?id=8772
Comment 19 Ola Karlsson 2012-10-23 17:55:09 UTC
I have the same problem with FreeBSD 9.0-RELEASE and samba36 with AIO support.


Samba logfile,

[2012/10/23 19:37:38.246422,  0] lib/fault.c:55(fault_report)
  ===============================================================
[2012/10/23 19:37:38.246431,  0] lib/util.c:1117(smb_panic)
  PANIC (pid 6704): internal error
[2012/10/23 19:37:38.246479,  0] lib/util.c:1221(log_stack_trace)
  BACKTRACE: 0 stack frames:
[2012/10/23 19:37:38.246496,  0] lib/util.c:1122(smb_panic)
  smb_panic(): calling panic action [/bin/sleep 999999]


gdb --pid=6704 smbd generates the following:


GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
Attaching to program: /usr/local/sbin/smbd, process 6704
0x000000080511183a in ?? ()
(gdb) bt
#0  0x000000080511183a in ?? ()
#1  0x00000008050d38ce in ?? ()
#2  0x0000000000000000 in ?? ()
#3  0x0000000000000000 in ?? ()
#4  0x0000000000000000 in ?? ()
#5  0xffffffff00000000 in ?? ()
#6  0x0000000000000000 in ?? ()
#7  0x0000000000000000 in ?? ()
#8  0x0000000000000000 in ?? ()
#9  0x0000000000000000 in ?? ()
#10 0x0000000000000001 in ?? ()
#11 0x0000000000000000 in ?? ()
#12 0x0000000000000000 in ?? ()
#13 0x0000000800000000 in ?? ()
#14 0x0000000040001480 in ?? ()
#15 0x0000000000000000 in ?? ()
#16 0x0000000000080000 in .dynstr ()
Error accessing memory address 0x53d98: Bad address.
(gdb)
Comment 20 Ola Karlsson 2012-10-23 17:59:22 UTC
Forgot to add that my options are,

max protocol = smb2
aio write size = 16384
aio read size = 16384
Comment 21 nteruptedservice 2013-04-07 20:47:05 UTC
I am currently experiencing this same issue on FreeBSD 9.1 x64 with Samba 3.6.13

My filesystem is ZFS, here is the global section of my smb.conf.  This samba server uses Active Directory authentication/access.

[global]
workgroup = ***
interfaces = vmx3f0
bind interfaces only = Yes
realm = ***.***.COM
preferred master = no
security = ADS
#password server = ad-pdc.***.***.com
allow trusted domains = No
map to guest = Bad User
encrypt passwords = yes
map untrusted to domain = Yes
log level = 3
log file = /var/log/samba/%m
max log size = 50
winbind enum users = Yes
winbind enum groups = Yes
winbind use default domain = Yes
winbind nested groups = Yes
idmap uid = 600-20000
idmap gid = 600-20000
template shell = /bin/bash
hide dot files = Yes
wide links = No
unix extensions = No
client signing = Yes
socket options = SO_RCVBUF=131072 SO_SNDBUF=131072 TCP_NODELAY SO_KEEPALIVE
local master = no
domain master = no
dns proxy = no
map acl inherit = yes
case sensitive = No
use sendfile = yes
min receivefile size = 16384
aio write size = 65536
aio read size = 65536
write cache size = 65536
level2 oplocks = Yes
oplocks = Yes
Comment 22 nteruptedservice 2013-04-07 21:34:18 UTC
(In reply to comment #21)
> I am currently experiencing this same issue on FreeBSD 9.1 x64 with Samba
> 3.6.13
> 
> My filesystem is ZFS, here is the global section of my smb.conf.  This samba
> server uses Active Directory authentication/access.
> 
> [global]
> workgroup = ***
> interfaces = vmx3f0
> bind interfaces only = Yes
> realm = ***.***.COM
> preferred master = no
> security = ADS
> #password server = ad-pdc.***.***.com
> allow trusted domains = No
> map to guest = Bad User
> encrypt passwords = yes
> map untrusted to domain = Yes
> log level = 3
> log file = /var/log/samba/%m
> max log size = 50
> winbind enum users = Yes
> winbind enum groups = Yes
> winbind use default domain = Yes
> winbind nested groups = Yes
> idmap uid = 600-20000
> idmap gid = 600-20000
> template shell = /bin/bash
> hide dot files = Yes
> wide links = No
> unix extensions = No
> client signing = Yes
> socket options = SO_RCVBUF=131072 SO_SNDBUF=131072 TCP_NODELAY SO_KEEPALIVE
> local master = no
> domain master = no
> dns proxy = no
> map acl inherit = yes
> case sensitive = No
> use sendfile = yes
> min receivefile size = 16384
> aio write size = 65536
> aio read size = 65536
> write cache size = 65536
> level2 oplocks = Yes
> oplocks = Yes

Oops, I accidentally cut off the last line:
max protocol = smb2
Comment 23 Matthew Trent 2013-06-13 22:25:51 UTC
I can't seem to reproduce this crash as of FreeBSD 9.1/Samba 3.6.13 (FreeNAS 9.1 nightlies). Can anyone confirm?
Comment 24 Matthew Trent 2013-06-18 21:13:33 UTC
The NAS4Free folks are also reporting this bug seems fixed as of Samba 3.6.12.

http://forums.nas4free.org/viewtopic.php?f=78&t=2031&start=200#p22497
Comment 25 nteruptedservice 2013-06-18 21:16:23 UTC
I will say, when I added my comment in April, I did have Samba 3.6.13 and did have the problem. I have not seen the crash happen in about the past month, though, and I haven't really updated anything of note on my system in that time.
Comment 26 Matthew Trent 2013-06-18 21:21:51 UTC
I actually did have some crashes with 3.6.13 when I had the "sendfile" option enabled. I disabled that and it's been 100% since with AIO+SMB2. This was not the case with pre-3.6.12 versions.
Comment 27 Dan 2013-07-02 02:39:44 UTC
I am experiencing an smbd crash plus a kernel oops after a recent hardware upgrade to 1 GbE. Transferring 1.5 GB of data off Windows 7 can sustain slightly over 800 Mbps. Most of the time the first file transfer completes without crashing smbd and the kernel, but the second transfer crashes every time.

I am running Fedora 18 x64 on an E6300. I don't know whether it's really an SMB issue or a bigger x64 network card driver issue. I will attempt FTP later this week and report back.

Is there any additional info I should provide to help with the fix?
Comment 28 Dan 2013-07-02 12:57:47 UTC
Jul  3 04:54:08 localhost smbd[10867]: [2013/07/03 04:54:08.310810,  0] ../lib/util/fault.c:72(fault_report)
Jul  3 04:54:08 localhost smbd[10867]:   ===============================================================
Jul  3 04:54:08 localhost smbd[10867]: [2013/07/03 04:54:08.318403,  0] ../lib/util/fault.c:73(fault_report)
Jul  3 04:54:08 localhost kernel: [66036.934859] abrt-hook-ccpp[10868]: segfault at 28 ip 00000034a5210600 sp 00007fff6236b800 error 4 in ld-2.16.so[34a5200000+20000]
Jul  3 04:54:08 localhost kernel: [66036.934900] Process 10868(abrt-hook-ccpp) has RLIMIT_CORE set to 1
Jul  3 04:54:08 localhost kernel: [66036.934903] Aborting core
Jul  3 04:54:08 localhost kernel: [66036.936491] traps: smbd[1495] trap invalid opcode ip:34a5e0ca71 sp:7fff84d59f68 error:0 in libpthread-2.16.so[34a5e00000+16000]
Jul  3 04:54:08 localhost systemd[1]: smb.service: main process exited, code=dumped, status=4/ILL
Jul  3 04:54:08 localhost systemd[1]: Unit smb.service entered failed state.
Jul  3 04:55:10 localhost kernel: [66099.006885] ksmtuned[10881]: segfault at 8130116b ip 0000000000435610 sp 00007fff92e62810 error 6 in bash[400000+da000]
Jul  3 04:55:10 localhost kernel: [66099.007182] Core dump to |/usr/libexec/abrt-hook-ccpp 11 0 10881 0 0 1372798510 e pipe failed
Comment 29 Jeremy Allison 2013-07-02 16:02:59 UTC
If it's a kernel oops, then it's not a Samba problem but a Linux one - by definition.

Sorry.

Jeremy.
Comment 30 Timur Bakeyev 2013-07-02 16:07:04 UTC
(In reply to comment #29)
> If it's a kernel oops, then it's not a Samba problem but a Linux one - by
> definition.

First, I don't like that someone piggybacked a Linux kernel problem onto a FreeBSD-specific bug.

Second, I don't like that it was closed for that reason, as the original problem wasn't addressed, nor was it proved that it has the same root cause.
Comment 31 Jeremy Allison 2013-07-02 16:09:44 UTC
Oh sorry Timur. I only read the last comment about the kernel oops and assumed that was the whole issue.

You know I don't close FreeBSD Samba bugs arbitrarily :-).

Re-opening..

Jeremy.
Comment 32 Timur Bakeyev 2013-07-02 16:15:59 UTC
(In reply to comment #31)
> Oh sorry Timur. I only read the last comment about the kernel oops and assumed
> that was the whole issue.
> 
> You know I don't close FreeBSD Samba bugs arbitrarily :-).
> 
> Re-opening..

Thanks, Jeremy. I'd really like this to be pinpointed somehow. As you requested, the ports version of samba36 is built with the following flags:

--exec-prefix="/usr/local"  --sysconfdir="/usr/local/etc"  --with-configdir="/usr/local/etc"  --includedir="/usr/local/include/samba"  --datadir="/usr/local/share/samba36"  --with-swatdir="/usr/local/share/swat"  --libdir="/usr/local/lib"  --with-pammodulesdir="/usr/local/lib"  --with-modulesdir="/usr/local/lib/samba"  --localstatedir="/var"  --with-piddir="/var/run/samba"  --with-ncalrpcdir="/var/run/samba/ncalrpc"  --with-nmbdsocketdir="/var/run/samba/nmbd"  --with-lockdir="/var/db/samba"  --with-statedir="/var/db/samba"  --with-cachedir="/var/db/samba"  --with-privatedir="/usr/local/etc/samba"  --with-logfilebase="/var/log/samba" --without-libtdb --enable-external-libtdb --without-libtalloc --enable-external-libtalloc --without-libtevent --enable-external-libtevent --with-libiconv="/usr/local"  --disable-as-needed --with-pam --with-readline=/usr  --with-included-iniparser  --with-sendfile-support  --enable-largefile  --without-cluster-support  --without-libsmbclient  --without-libaddns  --without-libnetapi  --without-libsmbsharemodes --enable-cups --enable-iprint --enable-debug --with-smbtorture4-path=/usr/ports/net/samba36/work/samba-3.6.16/source4/torture --with-syslog --with-quotas --with-utmp --with-winbind --enable-swat --enable-fam --with-acl-support --with-aio-support --with-pam_smbpass --with-dnsupdate --enable-avahi --enable-pthreadpool --without-included-popt --with-ads --with-krb5="/usr" --with-ldap --with-shared-modules="idmap_tdb2,idmap_ad,idmap_adex,idmap_hash,idmap_rid,charset_weird,vfs_cacheprime,vfs_catia,vfs_commit,vfs_dirsort,vfs_readahead,vfs_streams_depot,vfs_syncops,vfs_notify_fam,vfs_zfsacl" --prefix=/usr/local

Note --with-aio-support and --enable-pthreadpool. But that was already the case for 3.6.13 (and 3.6.16 now), so I'm not sure whether it improved the situation or not.
Comment 33 Dan 2013-07-04 16:45:13 UTC
(In reply to comment #29)
> If it's a kernel oops, then it's not a Samba problem but a Linux one - by
> definition.
> 

Yup, indeed, FTP-ing the same large file exhibits the same kernel oops, most likely due to an RTL 1 GbE chipset driver issue.

Sorry about posting this in the FreeBSD group; I was skimming through very quickly. This bug ranked #1 on Google with my combination of keywords. Mum would be so proud.

cheers ~
Comment 34 Matthew Trent 2013-08-27 18:11:50 UTC
FYI the SMB2 + AIO crash is manifesting again with recent versions of FreeNAS (9.1.1+). So this issue is still present with recent versions of FreeBSD (9.1-STABLE) and Samba (3.6.17).

FreeNAS folks have elected to just remove/disable the AIO option:
https://bugs.freenas.org/issues/3079#change-8030
https://bugs.freenas.org/issues/3071

Surprisingly, at least with my current setup, performance is actually better without AIO. So disabling AIO may be an acceptable option for anyone affected by this.
Comment 35 Volker Lendecke 2013-08-27 19:14:26 UTC
(In reply to comment #34)
> FYI the SMB2 + AIO crash is manifesting again with recent versions of FreeNAS
> (9.1.1+). So this issue is still present with recent versions of FreeBSD
> (9.1-STABLE) and Samba (3.6.17).
> 
> FreeNAS folks have elected to just remove/disable the AIO option:
> https://bugs.freenas.org/issues/3079#change-8030
> https://bugs.freenas.org/issues/3071
> 
> Suprisingly, at least with my current setup, performance is actually better
> without AIO. So disabling AIO may be an acceptable option for anyone affected
> by this.

If possible, it would be great if people could test aio with Samba 4.0. There by default we don't depend on kernel aio anymore but instead do it with pthreads. This will at least change the picture.
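
To make the distinction concrete, here is a minimal sketch of the general idea of doing asynchronous reads with pthreads instead of kernel aio: the blocking pread() runs in a worker thread while the caller is free to do other work. This is an illustration only, not Samba 4.0's implementation; the demo_aio_* names and the request structure are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical request structure; none of these names come from Samba. */
struct demo_aio_req {
        int fd;            /* file descriptor to read from */
        void *buf;         /* destination buffer */
        size_t len;        /* number of bytes requested */
        off_t offset;      /* file offset to read at */
        ssize_t result;    /* bytes read, or -1 on error (set by worker) */
        pthread_t thread;  /* worker thread handle */
};

/* Worker: performs the blocking pread() off the caller's thread. */
static void *demo_aio_worker(void *arg)
{
        struct demo_aio_req *req = arg;

        req->result = pread(req->fd, req->buf, req->len, req->offset);
        return NULL;
}

/* Start the read in a worker thread; returns 0 on success. */
static int demo_aio_read_start(struct demo_aio_req *req)
{
        req->result = -1;
        return pthread_create(&req->thread, NULL, demo_aio_worker, req);
}

/* Wait for the worker to finish; returns bytes read, or -1 on error. */
static ssize_t demo_aio_read_wait(struct demo_aio_req *req)
{
        if (pthread_join(req->thread, NULL) != 0) {
                return -1;
        }
        return req->result;
}
```

A real server would presumably use a pool of long-lived threads and a completion notification rather than one thread per request plus a join; the sketch keeps only the part that replaces the kernel aio call. Build with -pthread.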
Comment 36 Jeremy Allison 2013-08-28 00:10:12 UTC
Actually performance can be better without AIO depending on what the client is capable of (the more parallelization, the faster things go). So this isn't so surprising.

Jeremy.