Bug 9776 - core dump when renaming folder on windows 7 only happens over VPN
Status: NEW
Product: Samba 4.0
Classification: Unclassified
Component: File services
Version: 4.0.4
Hardware: x64 Linux
Importance: P5 critical
Assigned To: Jeremy Allison
QA Contact: Samba QA Contact
Reported: 2013-04-09 18:34 UTC by sanved
Modified: 2013-07-15 13:10 UTC (History)

Attachments
contains core and samba config (smb.conf.defaults is used for windows 7) (483.33 KB, application/x-gzip)
2013-04-09 18:34 UTC, sanved
vfs module src (67.55 KB, text/plain)
2013-04-14 22:03 UTC, sanved

Description sanved 2013-04-09 18:34:14 UTC
Created attachment 8739 [details]
contains core and samba config (smb.conf.defaults is used for windows 7)

When a Windows 7 client creates a folder on a Samba share and then renames it, smbd dumps core.
The core dump is attached.

This happens consistently over a VPN connection but does not happen when the client is used locally over the LAN.


Steps to reproduce

1. Create a folder on a Samba share mounted on Windows 7.
2. Rename the newly created folder.

smbd will dump core.

If I change max protocol on the Samba server to NT1, it works fine.
If I set max protocol to SMB2 or SMB3, it dumps core.

The issue happens only with SMB2/SMB3; since I use Windows 7, the client negotiates
SMB2.
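
To double-check which protocol ceiling the server is really running with when comparing the NT1 and SMB2/SMB3 cases, something like the following could be used. This is only a sketch: it assumes the installed testparm supports the -s and --parameter-name options, and the helper name smb_family is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: map a "max protocol" value to the dialect family it permits.
smb_family() {
    case "$1" in
        CORE|COREPLUS|LANMAN1|LANMAN2|NT1) echo "SMB1 only" ;;
        SMB2*|SMB3*)                        echo "SMB2 or later possible" ;;
        *)                                  echo "unknown" ;;
    esac
}

# Read the effective global setting, but only if testparm is installed (assumption).
if command -v testparm >/dev/null 2>&1; then
    value=$(testparm -s --parameter-name="max protocol" 2>/dev/null)
    echo "max protocol = ${value:-<unset>} ($(smb_family "$value"))"
fi
```

With max protocol set to NT1 the helper reports "SMB1 only", which matches the case above where the crash does not occur.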
Comment 1 Volker Lendecke 2013-04-09 18:43:02 UTC
Can you do a

gdb /usr/sbin/smbd <corefile>

and then do a

bt full

at the prompt? The corefile is hard to use without the very same environment you have.
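
The two steps above can also be run non-interactively, which avoids the pager prompt cutting the trace short. The following is a sketch: -batch and -ex are standard gdb options, while the function names, the SMBD variable, and the output file name are placeholders chosen for illustration.

```shell
#!/bin/sh
# Path to the smbd binary that produced the core; override via the environment if needed.
SMBD=${SMBD:-/usr/sbin/smbd}

# Print the gdb command line that would be run for a given core file.
bt_command() {
    printf 'gdb -batch -ex "bt full" %s %s\n' "$SMBD" "$1"
}

# Write the full backtrace to smbd-backtrace.txt when gdb and the core are present.
capture_bt() {
    core=$1
    if command -v gdb >/dev/null 2>&1 && [ -r "$core" ]; then
        # -batch runs the given commands and exits, with pagination disabled.
        gdb -batch -ex "bt full" "$SMBD" "$core" > smbd-backtrace.txt 2>&1
    else
        echo "gdb or core file not available; would run: $(bt_command "$core")" >&2
        return 1
    fi
}
```

Calling capture_bt with the path to the core file leaves the whole trace in smbd-backtrace.txt, ready to attach to the bug.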
Comment 2 sanved 2013-04-09 20:29:04 UTC
warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/smbd'.
Program terminated with signal 6, Aborted.
#0  0x00007f5055ca8475 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt full
#0  0x00007f5055ca8475 in raise () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1  0x00007f5055cab6f0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#2  0x00007f505762b3db in dump_core () from /usr/lib/x86_64-linux-gnu/libsmbconf.so.0
No symbol table info available.
#3  0x00007f505761ca1f in smb_panic_s3 () from /usr/lib/x86_64-linux-gnu/libsmbconf.so.0
No symbol table info available.
#4  0x00007f5058b4978f in smb_panic () from /usr/lib/x86_64-linux-gnu/libsamba-util.so.0
No symbol table info available.
#5  0x00007f5058b49986 in ?? () from /usr/lib/x86_64-linux-gnu/libsamba-util.so.0
No symbol table info available.
#6  <signal handler called>
No symbol table info available.
#7  0x00007f5055cf6e21 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#8  0x00007f50522b3256 in push_ascii () from /usr/lib/x86_64-linux-gnu/samba/libCHARSET3.so
No symbol table info available.
#9  0x00007f5058718c96 in ?? () from /usr/lib/x86_64-linux-gnu/samba/libsmbd_base.so
No symbol table info available.
#10 0x00007f505871a9d6 in ?? () from /usr/lib/x86_64-linux-gnu/samba/libsmbd_base.so
No symbol table info available.
#11 0x00007f505871d9eb in smbd_dirptr_lanman2_entry () from /usr/lib/x86_64-linux-gnu/samba/libsmbd_base.so
No symbol table info available.
#12 0x00007f50587758a5 in smbd_smb2_request_process_find () from /usr/lib/x86_64-linux-gnu/samba/libsmbd_base.so
No symbol table info available.
#13 0x00007f505876697d in smbd_smb2_request_dispatch () from /usr/lib/x86_64-linux-gnu/samba/libsmbd_base.so
No symbol table info available.
#14 0x00007f505876712f in ?? () from /usr/lib/x86_64-linux-gnu/samba/libsmbd_base.so
No symbol table info available.
#15 0x00007f5058763e6c in ?? () from /usr/lib/x86_64-linux-gnu/samba/libsmbd_base.so
No symbol table info available.
#16 0x00007f50573e92a2 in ?? () from /usr/lib/x86_64-linux-gnu/samba/libsamba-sockets.so
No symbol table info available.
#17 0x00007f50573e8d74 in ?? () from /usr/lib/x86_64-linux-gnu/samba/libsamba-sockets.so
No symbol table info available.
#18 0x00007f50573e7bc4 in ?? () from /usr/lib/x86_64-linux-gnu/samba/libsamba-sockets.so
No symbol table info available.
#19 0x00007f5056004b92 in tevent_common_loop_immediate () from /usr/lib/x86_64-linux-gnu/libtevent.so.0
No symbol table info available.
#20 0x00007f50576338c7 in run_events_poll () from /usr/lib/x86_64-linux-gnu/libsmbconf.so.0
No symbol table info available.
---Type <return> to continue, or q <return> to quit---
Comment 3 sanved 2013-04-12 04:58:18 UTC
It seems that making this call in xxx_mkdir in my VFS module,
and nothing else (everything else is commented out),

result = SMB_VFS_NEXT_MKDIR(handle, path, mode);

causes the core dump.
Not sure why.
Comment 4 Volker Lendecke 2013-04-12 06:24:30 UTC
(In reply to comment #3)
> It seems to be that
> 
> 
> in my vfs module making this call in the xxx_mkdir 
> and nothing else (all commented out)
> 
> result = SMB_VFS_NEXT_MKDIR(handle, path, mode);
> 
> causes the core 
> Not sure why?

"in my vfs module" -- does that mean you have a module that is not shipped with Samba? Can you post the source code of this module? Can you reproduce the crash without that module?
Comment 5 sanved 2013-04-12 15:05:29 UTC
Yes, I do have a VFS module.


I am still trying to troubleshoot this, both with and without my VFS module.

I lean towards saying it happens more often with the VFS module.
Also, in this specific mkdir path I have commented out all code except
the SMB_VFS_NEXT_MKDIR call.

On the other hand, even without the VFS module I sometimes see Explorer restart when I
create/rename a directory, and in that case, instead of smbd dumping core, smbstatus segfaults.

So I am still trying to work out where the issue lies, but for now I am trying
to reproduce it with my VFS module active.

Yes, I can upload the VFS module source soon.
I have not done it yet because I want to trim it down to the minimum code that still triggers the issue, so it is easier to analyze.
Comment 6 sanved 2013-04-12 15:43:15 UTC
Just wanted to emphasize: it is consistently fine locally over the LAN; I see this issue only over VPN.

A Windows XP client is also fine, as is Windows 7 with max protocol NT1,
which is basically the Windows XP behavior.
Comment 7 sanved 2013-04-13 21:55:46 UTC
Before uploading my VFS module source I wanted to run some more tests.

Here is what I did:

1. Built a debug version of Samba 4.0.4 in my Debian wheezy chroot
   with the following command:

   ./configure --enable-debug --enable-static --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc/samba/ --libdir=/usr/lib/x86_64-linux-gnu --with-modulesdir=/usr/lib/x86_64-linux-gnu/samba/ --with-privatedir=/var/lib/samba/private --with-lockdir=/var/lock/samba --with-statedir=/var/lib/samba --with-cachedir=/var/cache/samba --with-piddir=/var/run  --disable-swat --with-logfilebase=/var/log/samba --with-sockets-dir==/var/run/samba --disable-shared-libs

   I then replaced the smbd on the host, which is a Netgear 6.0 box running Debian
   wheezy and Samba 4.0.4.

   I saved off the stock smbd that came with the Netgear and put the debug smbd
   I compiled onto the Netgear box.

   I ran the same create/rename directory test over VPN and now got a different stack trace.
   Investigating it, I found what could be a small memory leak in my VFS module.

   I fixed that, and now I am able to create/rename directories
   with no issues using my debug version of smbd. I repeated this several times to make sure it is fine.


2. Then I built a non-debug version of smbd in my Debian wheezy chroot
   with the following command:

   ./configure --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc/samba/ --libdir=/usr/lib/x86_64-linux-gnu --with-modulesdir=/usr/lib/x86_64-linux-gnu/samba/ --with-privatedir=/var/lib/samba/private --with-lockdir=/var/lock/samba --with-statedir=/var/lib/samba --with-cachedir=/var/cache/samba --with-piddir=/var/run  --disable-swat --with-logfilebase=/var/log/samba --with-sockets-dir=/var/run/sambatest
   

   I repeated the same tests and it also works fine now, after the memory fixes
   I made during my investigation in step 1.


3. Now I went back to the stock smbd that comes with the Netgear,
   but I still get the same stack trace I filed in this bug.
   I do not know what is happening here.
   The very first directory create/rename dumps core with the stock smbd that
   comes with the Netgear.


Here is what I have noticed:

1. The debug version built on my Debian wheezy is 27M.
2. The non-debug version built on my Debian wheezy is 16M (still seems large).
3. The stock smbd that comes with the Netgear is 64K.

   I ran smbd -b to see if I could spot the compile options used by Netgear,
   and used whatever options I could identify, but short of asking Netgear
   I am still not sure what compile options were used to get to 64K.

I have repeated the tests on 2 different Netgear boxes with identical results:
the smbd I build (debug and non-debug) now works fine, but the stock smbd
on the Netgear still dumps core the very first time I create/rename a directory.

The other thing I noticed in Windows 7 Explorer is that
the smbd I built (both debug and non-debug) is much faster than the stock
one.
Not sure what is happening.

Can someone give me a clue as to how to proceed from here, please?
I will upload the module source by tomorrow.
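
One way to look into the size difference is to compare how each binary is linked. The backtrace in comment 2 shows the stock smbd calling into shared libraries such as libsmbd_base.so, so a small dynamically linked executable would not be surprising. The following is a sketch only; the function name describe_binary is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report the size and shared-library count of a binary.
describe_binary() {
    bin=$1
    if [ ! -x "$bin" ]; then
        echo "$bin: not found or not executable" >&2
        return 1
    fi
    # Size in bytes (wc -c is portable; stat flags differ between systems).
    size=$(wc -c < "$bin" | tr -d ' ')
    # ldd fails on static binaries; in that case grep simply counts zero lines.
    libs=$(ldd "$bin" 2>/dev/null | grep -c '=>' || true)
    echo "$bin: $size bytes, $libs shared libraries"
}
```

Running describe_binary on the stock /usr/sbin/smbd and on the self-built smbd should show whether the two differ mainly in how much code lives in shared libraries.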
Comment 8 sanved 2013-04-14 22:03:00 UTC
Created attachment 8772 [details]
vfs module src
Comment 9 sanved 2013-04-14 22:06:50 UTC
I have now conclusively shown that commenting out the become_root/unbecome_root calls in the _mkdir override of the VFS module
makes the core dump go away.

I am not sure why it only seems to affect _mkdir, as I have these calls
in other methods as well.

But repeated tests have conclusively shown that commenting out become_root/unbecome_root
in the _mkdir override of the VFS module fixes the core dump.

Not sure why.
Comment 10 Stefan Metzmacher 2013-04-19 20:00:26 UTC
Sorry for the confusion. I wanted to reference another bug to 9794,
but I can't find the number...
Comment 11 Jeet 2013-07-15 13:10:05 UTC
I guess it's resolved :)