I have some problems running samba 3.6.6-115.1-2824-SUSE-SL12.1-x86_64. Some files are not deletable via Windows. Smbstatus shows some lines and last line is: "Segmentation fault (core dumped)" There are other users with similiar problems, see: https://lists.samba.org/archive/samba/2012-July/168342.html
Without knowing what ist means, the stacktrace of smbstatus: #0 0x00007ffb7bdfb121 in __strlen_sse2 () from /lib64/libc.so.6 #1 0x00007ffb7d647ba5 in traverse_fn (_state=0x7fffabae4110, rec=<optimized out>) at locking/locking.c:1768 #2 traverse_fn (rec=<optimized out>, _state=0x7fffabae4110) at locking/locking.c:1735 #3 0x00007ffb7d682ffc in db_tdb_traverse_read_func (tdb=<optimized out>, kbuf=..., dbuf=..., private_data=<optimized out>) at lib/dbwrap_tdb.c:282 #4 0x00007ffb7c11197c in ?? () from /usr/lib64/libtdb.so.1 #5 0x00007ffb7c111aec in tdb_traverse_read () from /usr/lib64/libtdb.so.1 #6 0x00007ffb7d683246 in db_tdb_traverse_read (db=0x7ffb7d99d2c0, f=0x7ffb7d647af0 <traverse_fn>, private_data=0x7fffabae4110) at lib/dbwrap_tdb.c:296 #7 0x00007ffb7d64ba3b in share_mode_forall (fn=<optimized out>, private_data=<optimized out>) at locking/locking.c:1794 #8 0x00007ffb7d646ecb in main (argc=<optimized out>, argv=<optimized out>) at utils/status.c:473
Created attachment 7734 [details] smbstatus segfaulting - strace of
We've had the same kind of issue here with Samba 3.6.6 compiled from source. smbstatus -L or -B produce a segfault as per my attachment. Deleting /var/lib/samba/locking.tdb and restarting samba fixes it for a short time but the issues return within hours. Windows clients see symptoms like: * I delete a file and it goes away but immediately reappears on a refresh. * Sometimes the next day the file has actually deleted. * User could not reliably do an import with MS excel, possibly related to temporary files being created but not deleted. I can provide copies of a broken and a working locking.tdb if required - please advise. # ll `locate locking.tdb` -rw-r--r-- 1 root root 860160 Aug 2 10:31 /var/lib/samba/locking.tdb -rw-r--r-- 1 root root 40200 Aug 2 10:02 /var/lib/samba/save/locking.tdb # file `locate locking.tdb` /var/lib/samba/locking.tdb: TDB database version 6, little-endian hash size 10007 bytes /var/lib/samba/save/locking.tdb: TDB database version 6, little-endian hash size 10007 bytes (where save is the borked one) Downgrading to samba3x-3.5.10-0.109.el5_8 from elrepo seems to have fixed the problem.
see also this description: https://lists.samba.org/archive/samba/2012-August/168515.html
Same here, seems locking error occurrs when opened Word/Excel which give problems re-saving document (typical message, file in use, can not ovewrite error while saving. Depends on the choice of oplock-settings). After this, smbstatus crashes. Re-witten or attempted re-written Word/Excel files tend to vanish from the server after you would stop or re-run smbd. The process itself (smbd) keeps running when experiencing the problem and keeps operating; no crash. Info: Linux 3.2.5-3.fc16.x86_64 #1 SMP Thu Feb 9 01:24:38 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux smbd 3.6.6 built from sources. kill(29325, SIG_0) = 0 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2917, ...}) = 0 write(1, "29325 99 DENY_NON"..., 14229325 99 DENY_NONE 0x100080 RDONLY NONE /xx/xx/xx/xx Wed Aug 1 18:33:02 2012 ) = 142 kill(29325, SIG_0) = 0 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2917, ...}) = 0 write(1, "29325 99 DENY_NON"..., 14229325 99 DENY_NONE 0x100080 RDONLY NONE /xx/xx/xx/xx Wed Aug 1 13:40:10 2012 ) = 142 --- {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7f9fea5feb5c} (Segmentation fault) --- +++ killed by SIGSEGV (core dumped) +++ Segmentation fault (core dumped) smb.conf: [global] kernel oplocks = no oplocks = no level2 oplocks = no # but any choice here does not improve...
Additional info: max protocol = SMB2 + x64 system
Hi I can confirm both problems Centos-6.3 x86_64 samba-3.6.6 (compiled from source) security=ADS max protocol= smb2 We have updated from 3.6.4 to 3.6.6 and after the update smbstatus sometimes crashes with Aug 5 20:36:21 rmc-euphrat kernel: smbstatus[29512]: segfault at 7f7e036aaab0 ip 00007f7d08830b5f sp 00007fff2d9e80a8 error 4 in libc-2.12.so[7f7d086ff000+189000] and sometimes files could not be deleted by the user any more Regards Hansjörg
one additional comment I have been able to reproduce the problem with a Latex GUI TeXnicCenter on a testsystem . With TeXnicCenter I was unable to write to a existing file any more. Disabling smb2 on the testsystem let the write succeed Regards Hansjörg
Can you run that under valgrind?
No, I'm sorry-I do not know how valgrid works.
Just install valgrind (possibly via "zypper install valgrind"), and run valgrind smbstatus and post the resulting output.
Hi Volker sorry for the delay, I am on vacation and I needed a user to reproduce it on a testsystem. But now I have a valgrind run (two with different options) and a debug 10 and wireshark output, The interesting point was, that the user initially was able to delete the file paper.pdf. After the core dump of smbstatus he could not delete the file any more. It disappears, but after an F5 it is back again (Windows7) The debug 10 and wireshark is from this deletion attempt Regrads Hnasjörg
Created attachment 7741 [details] valgrind two runs valgrind smbstatus and valgrind --tool=memcheck --leak-check=full -v --num-callers=20 --trace-children=yes --leak-check=full --show-reachable=yes smbstatus
Created attachment 7742 [details] wireshark try to delete paper.pdf
Created attachment 7743 [details] debug 10 trying to delte paper.pdf
(In reply to comment #13) > Created attachment 7741 [details] > valgrind > > two runs > valgrind smbstatus > and > valgrind --tool=memcheck --leak-check=full -v --num-callers=20 > --trace-children=yes --leak-check=full --show-reachable=yes smbstatus Crap, no debug symbols. Can you install the debug package for smbstatus (wherever that might be) as well?
Loaded 3.6.7 from sources: solved my problem, have tested it now for half a day. No more smbstatus crashes and using "level2 oplocks = yes" sufficient for Word/Excel, it seems. Rgds. Huibert.
Created attachment 7744 [details] valgrind smbstatus with debug rpm installed I could trigger the error only after I have deleted a file in the share Bevor it, smbstatus works fine
It also happens with version 3.6.7 and deactivated SMB2 (maybe not so often). So there is no workaround at this time.
Would be interesting to know if this is a regression in the 3.6.x series. Can somebody test with an earlier version (e.g. 3.6.5)?
with 3.6.5 it works fine The problem starts with 3.6.6 We downgraded to 3.6.5 again
Look like I got same issue. It's reproduce very easy for me: create new xlsx file, open this file, close this file (even without save), try to delete file. If I try to access this "deleted" file using smbclient - error message NT_STATUS_DELETE_PENDING. File physically deleted only after restart smb. Tested on 3.6.7. 3.6.5 work nice. Windows 7 x64, office 2010, samba-based domain. smb.conf [global] dos charset = CP866 workgroup = ***** server string = File Server BELKA interfaces = eth0, lo bind interfaces only = Yes passdb backend = ldapsam:ldap://ldap.*****.int/ log level = 3 syslog = 0 log file = /var/log/samba/log.%m max log size = 50000 debug pid = Yes debug uid = Yes max protocol = SMB2 time server = Yes add machine script = sudo /usr/sbin/smbldap-useradd -w "%u" logon path = \\%N\profiles\%U\%a logon drive = H: logon script = logon.cmd %U domain logons = Yes os level = 240 preferred master = Yes domain master = Yes wins support = Yes ldap admin dn = cn=samba,dc=***** ldap delete dn = Yes ldap group suffix = ou=groups ldap idmap suffix = ou=idmap ldap machine suffix = ou=people ldap passwd sync = yes ldap suffix = ou=samba,dc=***** ldap ssl = no ldap user suffix = ou=people admin users = root create mask = 0700 directory mask = 0700 map acl inherit = Yes nt acl support = Yes ea support = yes hide dot files = No hide special files = Yes #hide unreadable = Yes delete veto files = Yes veto files = /*.eml/*.nws/*.{*}/._*/.DS_Store/ #veto oplock files = /*.doc/*.xls/*.docx/*.xlsx/*.DOC/*.XLS/*.DOCX/*.XLSX/ map archive = No map readonly = no map system = no #store dos attributes = yes follow symlinks = No profile acls = No #socket options = TCP_NODELAY SO_SNDBUF=65536 SO_RCVBUF=65536 IPTOS_LOWDELAY load printers = yes printing = cups printcap name = cups disable spoolss = no cups options = raw #use client driver = yes cups server = localhost:631 [homes] comment = Home Directories read only = No browseable = No [profiles] path = /storage/profiles read only = No store dos attributes = Yes create mask = 0600 directory mask = 0700 browseable = no guest ok = no printable = no profile acls = yes csc policy = disable [netlogon] comment = The domain logon service path = /var/lib/samba/netlogon public = no writeable = no browsable = no [printers] comment = All Printers path = /var/spool/samba use client driver = yes printable = yes print ok = yes browseable = yes public = yes guest ok = yes writable = no read only = yes [print$] comment = Printer Drivers path = /var/lib/samba/drivers browseable = yes guest ok = no read only = yes write list = root, miksir [Projects] path = /storage/projects read only = No vfs objects = full_audit vfs_full_audit:success = aio_write chflags ftruncate mkdir mknod pwrite rename rmdir unlink write vfs_full_audit:failure = none full_audit:success = aio_write chflags ftruncate mkdir mknod pwrite rename rmdir unlink write full_audit:failure = none full_audit:facility = local5 invalid users = backup Any kind of play with options (disable oplock, full_audit etc) not helps. Share use ACL's getfacl /storage/projects getfacl: Removing leading '/' from absolute path names # file: storage/projects # owner: miksir # group: Domain\040Users user::rwx group::--- group:Domain\040Users:rwx mask::rwx other::--- default:user::rwx default:group::--- default:group:Domain\040Users:rwx default:mask::rwx default:other::--- Also I have lvl10 log of this situation, can mail it.
PS: I'm used this RPM ftp://ftp.sernet.de/pub/samba/3.6/centos/6/i386
Steps to reproduce samba 3.6.6 und Cents-6.3 x_86_64 Client Windows 7 x64 with MS Excel 2010 smb.conf below * Connect with MS Explorer to samba server as user maurerh to \\ftp\tmp\maurerh * RightClick "New Microsoft Excel Sheet" (Translatet from germany" [root@rmc-donau x86_64]# smbstatus Samba version 3.6.6-itsd.el6 PID Username Group Machine ------------------------------------------------------------------- 11375 maurerh dlr_maurerh_p 129.247.189.18 (129.247.189.18) 11376 maurerh dlr_maurerh_p 10.97.0.152 (10.97.0.152) Service pid machine Connected at ------------------------------------------------------- tmp 11376 10.97.0.152 Mon Aug 20 10:11:59 2012 ftp 11375 129.247.189.18 Mon Aug 20 10:11:58 2012 Locked files: Pid Uid DenyMode Access R/W Oplock SharePath Name Time -------------------------------------------------------------------------------------------------- 11375 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/ftp outgoing Mon Aug 20 10:12:00 2012 11375 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/ftp . Mon Aug 20 10:11:59 2012 11375 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/ftp . Mon Aug 20 10:12:00 2012 11376 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:17:44 2012 11376 7740 DENY_NONE 0x120196 WRONLY BATCH /home_local/tmp maurerh/Microsoft Excel-Arbeitsblatt (neu).xlsx Mon Aug 20 10:17:47 2012 11376 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/tmp . Mon Aug 20 10:11:59 2012 11376 7740 DENY_ALL 0x100080 RDONLY NONE /home_local/tmp . Mon Aug 20 10:13:11 2012 * RMB on the Excel File -> Open [root@rmc-donau x86_64]# smbstatus Samba version 3.6.6-itsd.el6 PID Username Group Machine ------------------------------------------------------------------- 11375 maurerh dlr_maurerh_p 129.247.189.18 (129.247.189.18) 11376 maurerh dlr_maurerh_p 10.97.0.152 (10.97.0.152) Service pid machine Connected at ------------------------------------------------------- tmp 11376 10.97.0.152 Mon Aug 20 10:11:59 2012 ftp 11375 129.247.189.18 Mon Aug 20 10:11:58 2012 Locked files: Pid Uid DenyMode Access R/W Oplock SharePath Name Time -------------------------------------------------------------------------------------------------- 11375 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/ftp outgoing Mon Aug 20 10:12:00 2012 11376 7740 DENY_WRITE 0x13019f RDWR BATCH /home_local/tmp maurerh/~$Microsoft Excel-Arbeitsblatt (neu).xlsx Mon Aug 20 10:18:21 2012 11375 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/ftp . Mon Aug 20 10:11:59 2012 11375 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/ftp . Mon Aug 20 10:12:00 2012 11376 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:17:44 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_WRITE 0x12019f RDWR NONE /home_local/tmp maurerh/Microsoft Excel-Arbeitsblatt (neu).xlsx Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh/Microsoft Excel-Arbeitsblatt (neu).xlsx Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh/Microsoft Excel-Arbeitsblatt (neu).xlsx Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh/Microsoft Excel-Arbeitsblatt (neu).xlsx Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/tmp . Mon Aug 20 10:11:59 2012 * just print to the save Icon [root@rmc-donau x86_64]# smbstatus Samba version 3.6.6-itsd.el6 PID Username Group Machine ------------------------------------------------------------------- 11375 maurerh dlr_maurerh_p 129.247.189.18 (129.247.189.18) 11376 maurerh dlr_maurerh_p 10.97.0.152 (10.97.0.152) Service pid machine Connected at ------------------------------------------------------- tmp 11376 10.97.0.152 Mon Aug 20 10:11:59 2012 ftp 11375 129.247.189.18 Mon Aug 20 10:11:58 2012 Locked files: Pid Uid DenyMode Access R/W Oplock SharePath Name Time -------------------------------------------------------------------------------------------------- 11375 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/ftp outgoing Mon Aug 20 10:12:00 2012 11376 7740 DENY_WRITE 0x13019f RDWR BATCH /home_local/tmp maurerh/~$Microsoft Excel-Arbeitsblatt (neu).xlsx Mon Aug 20 10:18:21 2012 11375 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/ftp . Mon Aug 20 10:11:59 2012 11375 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/ftp . Mon Aug 20 10:12:00 2012 11376 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:17:44 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 Segmentation fault (core dumped) * Close Excel (*.tmp and Excel File remain) * Delete the Files in Explorer, Press F5, *.tmp reappears * restart smbd -> file disappears [root@rmc-donau x86_64]# valgrind smbstatus ==11536== Memcheck, a memory error detector ==11536== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al. ==11536== Using Valgrind-3.6.0 and LibVEX; rerun with -h for copyright info ==11536== Command: smbstatus ==11536== Samba version 3.6.6-itsd.el6 PID Username Group Machine ------------------------------------------------------------------- 11375 maurerh dlr_maurerh_p 129.247.189.18 (129.247.189.18) 11376 maurerh dlr_maurerh_p 10.97.0.152 (10.97.0.152) Service pid machine Connected at ------------------------------------------------------- tmp 11376 10.97.0.152 Mon Aug 20 10:11:59 2012 ftp 11375 129.247.189.18 Mon Aug 20 10:11:58 2012 Locked files: Pid Uid DenyMode Access R/W Oplock SharePath Name Time -------------------------------------------------------------------------------------------------- 11375 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/ftp outgoing Mon Aug 20 10:12:00 2012 11375 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/ftp . Mon Aug 20 10:11:59 2012 11375 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/ftp . Mon Aug 20 10:12:00 2012 11376 7740 DENY_NONE 0x100081 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:17:44 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:18:21 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:20:44 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:20:44 2012 11376 7740 DENY_NONE 0x100080 RDONLY NONE /home_local/tmp maurerh Mon Aug 20 10:20:44 2012 ==11536== Invalid read of size 1 ==11536== at 0x4C287D2: strlen (mc_replace_strmem.c:282) ==11536== by 0x15FA50: traverse_fn (locking.c:1768) ==11536== by 0x19B0AB: db_tdb_traverse_read_func (dbwrap_tdb.c:282) ==11536== by 0x1EEAE4: tdb_traverse_internal (traverse.c:190) ==11536== by 0x1EEC5B: tdb_traverse_read (traverse.c:229) ==11536== by 0x19B525: db_tdb_traverse_read (dbwrap_tdb.c:296) ==11536== by 0x15F99A: share_mode_forall (locking.c:1794) ==11536== by 0x15EFCD: main (status.c:473) ==11536== Address 0x8c561a04 is not stack'd, malloc'd or (recently) free'd ==11536== ==11536== ==11536== Process terminating with default action of signal 11 (SIGSEGV) ==11536== Access not within mapped region at address 0x8C561A04 ==11536== at 0x4C287D2: strlen (mc_replace_strmem.c:282) ==11536== by 0x15FA50: traverse_fn (locking.c:1768) ==11536== by 0x19B0AB: db_tdb_traverse_read_func (dbwrap_tdb.c:282) ==11536== by 0x1EEAE4: tdb_traverse_internal (traverse.c:190) ==11536== by 0x1EEC5B: tdb_traverse_read (traverse.c:229) ==11536== by 0x19B525: db_tdb_traverse_read (dbwrap_tdb.c:296) ==11536== by 0x15F99A: share_mode_forall (locking.c:1794) ==11536== by 0x15EFCD: main (status.c:473) ==11536== If you believe this happened as a result of a stack ==11536== overflow in your program's main thread (unlikely but ==11536== possible), you can try to increase the size of the ==11536== main thread stack using the --main-stacksize= flag. ==11536== The main thread stack size used in this run was 10485760. ==11536== ==11536== HEAP SUMMARY: ==11536== in use at exit: 117,188 bytes in 1,203 blocks ==11536== total heap usage: 6,110 allocs, 4,907 frees, 2,547,378 bytes allocated ==11536== ==11536== LEAK SUMMARY: ==11536== definitely lost: 1,112 bytes in 5 blocks ==11536== indirectly lost: 90,914 bytes in 980 blocks ==11536== possibly lost: 12,646 bytes in 109 blocks ==11536== still reachable: 12,516 bytes in 109 blocks ==11536== suppressed: 0 bytes in 0 blocks ==11536== Rerun with --leak-check=full to see details of leaked memory ==11536== ==11536== For counts of detected and suppressed errors, rerun with: -v ==11536== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 6 from 6) Segmentation fault (core dumped) [global] workgroup = XXX realm = INTRA.XXX.DE netbios name = FTPSERVER server string = RM-FTP-Server interfaces = 127.0.0.1, eth0 bind interfaces only = Yes security = ADS password server = * username map = /etc/samba/smbusers log level = 1 syslog = 0 log file = /var/log/samba/log.%m printcap name = /dev/null machine password timeout = 604800 os level = 25 preferred master = No local master = No domain master = No dns proxy = No encrypt passwords = yes idmap config * : backend = tdb idmap config * : range = 1000001-1999999 idmap config XXX : backend = ad idmap config XXX : schema_mode = rfc2307 idmap config XXX : readonly = yes idmap config XXX : range = 1000-1000000 max protocol = smb2 wins server = create mask = 0664 directory mask = 0775 use sendfile = Yes hide dot files = No map archive = No dont descend = lost+found load printers= no printing = bsd printcap name = /dev/null [tmp] path = /home_local/tmp comment = tmp-Share browseable = yes writeable = yes wide links = no
I tried to reproduce this issue in my testenv with CentOS and Samba 3.6.6 without success. I can create, write, rewrite and delete all files. I'm also able to delete the ~$ files. Office 2010 does not remove these ~$ files automatically, but I see the same on a share on a Windows 2008 r2 server. After the second open and re-write, Office does not delete them. Typically Windows users don't see these files, because the files have the hidden flag AND they are so called "protected operating system files". If you want to hide the files you need to enable the "store dos attributes" option. In the next step I'll build 3.6.6 from source on a clean CentOS 6.3 64bit and try it again. best regards Björn
Hi I think the key point to the problem is, that if I open a new Excel file with Office2010 on a 3.6.6 server with max protocol = smb2 , and klick on the save button, a following smbstatus dumbs core Can you reproduce this? Without max protocol = smb2 - smbstatus did not dump core - the *.tmp file is automatically removed on Close of Excel With max protocol = smb2 - smbstatus dumps core - the *.tmp file stays on disk after Close of Excel - after a restart of smbd it disappears I think this fits to the obervation of NT_STATUS_DELETE_PENDING at comment #22 Can I provide further information concerning the core dump of smbstatus? Regards Hansjörg
It's really bad, but no matter how hard we try, we don't get this reproduced. We spent 2 full person days now trying to catch that problem. I think someone suffering from this needs to contract someone from http://www.samba.org/samba/support to be able to remotely log in and see the problem happening.
If you want me to provide you ssh access to a machine, where we can reproduce the problem, let me know and I will contact you with the required information
Similar issue(not sure if it's the same one) can be reproduced in 3.6.6 but not in 3.6.4: 1. compile the following program, say we get open_file.exe #include <stdio.h> #include <tchar.h> #include <windows.h> int _tmain(int argc, _TCHAR* argv[]) { HANDLE h; h = CreateFile(argv[1], GENERIC_READ, FILE_SHARE_READ|FILE_SHARE_WRITE|FILE_SHARE_DELETE, NULL, OPEN_EXISTING, 0, NULL); printf("hit <cr>\n"); getchar(); CloseHandle(h); return 0; } 2. set HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\LanmanWorkstation\Parameters\DirectoryCacheLifetime to 0 3. run "open_file <an existing file on the samba share>". Don't press Enter key. 4. In Windows Explorer, delete the above file, and then refresh the folder, the file reappears. 5. run smbstatus, it crashes with the following stack trace: core '/var/core/core.smbstatus.3176' of 3176: /.../samba/bin/smbstatus fffffd7ffd2afcf4 strlen () + 14 000000000049cf25 db_tdb_traverse_read_func () + 55 fffffd7fec73abda tdb_traverse_internal () + ba fffffd7fec73ad3d tdb_traverse_read () + 6d 000000000049d166 db_tdb_traverse_read () + 56 00000000004682cf share_mode_forall () + 2f 00000000004ebe04 main () + 7e4 00000000004613ac _start () + 6c Thanks to Ken Harris for the reproduction steps.
Can you send me the binary?
Attach the binary to the bug then we can both look at it :-). Thanks, Jeremy.
Created attachment 7801 [details] .zip of open_file binary (+source) Here's the test program binary for reproduction as per Youzhong's instructions. The program holds a handle to a file (argv[1]) until you hit <cr>. Enjoy! -ken
Ok, reproduced this - working on a fix right now... Jeremy.
traverse_fn() inside source3/locking/locking.c wasn't correctly updated when we moved to NDR encoding for the delete on close entries. This is my fault (I remember doing this back-port). I'll fix asap. Sorry for the bug. Jeremy.
Created attachment 7802 [details] Raw patch that fixes the problem here. Ok, here's the fix. I forgot to update the traverse_fn() to correctly use the parse_share_modes() function when I added the backport of the NDR encoding of the delete on close tokens. So you'd never see this bug unless you had delete on close tokens stored in the locking.tdb, which is why we couldn't reproduce it. Please test and if everyone reports it fixes the issue (and Volker gives a + review) then we're good to go. Sorry for the error. Jeremy.
Created attachment 7803 [details] Updated fix that correctly types 'i' as int, not uint32_t.
Hi thanks for the fix. I applied it to 3.6.7 and smbstatus did not core dump any more now. But the excel tmp file maurerh/769C49BC.tmp could still not be deleted form the share. When I try to delete it from the client, after an F5 it reappears and it still is on the linux filesystem. If I restart smb it disappears automatically form the client and from the filesystem. Should I take an additional wireshark trace or level10 log? Regards Hansjörg
(In reply to comment #37) > Hi > > thanks for the fix. > I applied it to 3.6.7 and smbstatus did not core dump any more now. > > But the excel tmp file maurerh/769C49BC.tmp could still not be deleted form > the share. > When I try to delete it from the client, after an F5 it reappears and it still > is on the linux filesystem. > If I restart smb it disappears automatically form the client and from the > filesystem. > > Should I take an additional wireshark trace or level10 log? A debug level 10 log would be interesting. As it is probably very large, what we need is the timestamp when you tried to delete the file, so that we can find it in the logs. Also, we need the full logs, with "max log size = 0". Volker
Created attachment 7809 [details] git-am fix for 3.6.next. Fix for the smbstatus crash for 3.6.next.
Re-assigning to Karolin to get the smbstatus fix merged now vl has approved the fix. If we can get good data on the Excel issue we can revisit this. Jeremy.
Hi took a level10 debug and a wireshark capture node 10.97.0.152 (no action from other nodes) First column is the time i noted while performing the action I tried to identifiy the action in the wireshark trace on the samba server and noted the exact time stamp an Packet Number from wireshark in order to make it easier to find it in the debug file 8:21:45 open off Microsoft Excel-Arbeitsmappe and edit 8:22:10 Save ( ~Packet 480 or 08:22:11.525647 in log.10.97.0.152_debug) 8:22:20 Close of Excel F2E79813.tmp still here (without max protocol = smb2 it is deleted (~Paket 962 08:22:21.361378 in log.10.97.0.152_debug ) [2012/08/23 08:23:22.495651, 5] smbd/close.c:398(close_remove_share_mode) close_remove_share_mode: file maurerh/F2E79813.tmp. Delete on close was set - deleting file. 8:22:50 manual delete off F2E79813.tmp ( ~ Paket 1062 08:22:49.404191 in log.10.97.0.152_debug) 8:23:05 Press F5 F2E79813.tmp still here ( ~ Paket 1091 08:23:06.114077 in log.10.97.0.152_debug ) automaticalliy deleted on restart of smbd at about 08:23:22.495651
Created attachment 7812 [details] wireshark trace on samba server
Created attachment 7813 [details] debug10 file log.%m
Created attachment 7814 [details] debug10 log.smbd
(In reply to comment #42) > Created attachment 7812 [details] > wireshark trace on samba server Hi: FWIW, I looked at smb_wireshark trace and noticed the following: 1. Packet #338 comprises compounded CREATE and IOCTL requests on "Microsoft Excel-Arbeitsblatt (neu).xlsx" 2. Packet #339 contains the replies to #338, including a handle (76d83ddb-3d04-d7d0-db3d-000000000000) which is _never_ closed in this wireshare trace. 3. Packet #560 contains a SETINFO RENAME request, which renames "Microsoft Excel-Arbeitsblatt (neu).xlsx" -> "F2E79813.tmp" 4. Packet #580 contains the SETINFO DELETE request. So long as the handle returned in #338 remains unclosed, I would expect "F2E79813.tmp" to remain undeleted and inaccessible; but what changed between 3.6.5 and 3.6.6 that would influence the client in this way? Hansjörg, perhaps you could do a wireshark capture with 3.6.5 and compare? Hope this helps, -Ken
Created attachment 7816 [details] Patch Can you please try the attached patch?
(In reply to comment #45) > (In reply to comment #42) > > Created attachment 7812 [details] [details] > > wireshark trace on samba server > > Hi: FWIW, I looked at smb_wireshark trace and noticed the following: > > 1. Packet #338 comprises compounded CREATE and IOCTL requests on "Microsoft > Excel-Arbeitsblatt (neu).xlsx" > 2. Packet #339 contains the replies to #338, including a handle > (76d83ddb-3d04-d7d0-db3d-000000000000) which is _never_ closed in this > wireshare trace. > 3. Packet #560 contains a SETINFO RENAME request, which renames "Microsoft > Excel-Arbeitsblatt (neu).xlsx" -> "F2E79813.tmp" > 4. Packet #580 contains the SETINFO DELETE request. > > So long as the handle returned in #338 remains unclosed, I would expect > "F2E79813.tmp" to remain undeleted and inaccessible; but what changed between > 3.6.5 and 3.6.6 that would influence the client in this way? > > Hansjörg, perhaps you could do a wireshark capture with 3.6.5 and compare? > > Hope this helps, Yep, came to exactly the same conclusion. The patch just posted should fix it. Thanks! Volker
Probably the - if (req->compat_chain_fsp) { in 99807792d743b7f258f0d433689b6231b260bddf is the one that caused this. Not sure how this applies in the big picture of the file_id changes, but the error is due to smb2_ioctl not transferring the chain_fsp in a compound request properly
BTW, Hans-Jörg, thanks for the patience and exact logfiles.
(In reply to comment #46) > Created attachment 7816 [details] > Patch > > Can you please try the attached patch? The 'default:' branch should look like this + in_fsp = file_fsp_smb2(req, in_file_id_persistent, + in_file_id_volatile); + if (in_fsp == NULL) { + return smbd_smb2_request_error(req, NT_STATUS_FILE_CLOSED); + } Like it's done in master, otherwise you revert parts of 99807792d743b7f258f0d433689b6231b260bddf and in_fsp is always NULL.
Created attachment 7817 [details] git-am version of Volker's fix with metze's additional change. Metze, Volker - this is the combination of Volker's patch and your requested change. Let me know if it's correct. Testers, please try this one and report back if it fixes the error. Thanks ! Jeremy.
Thank you, i am offsite at the moment, will test it on monday
Marking this a "blocker" for next release. At least we have a patch to be tested :-).
Comment on attachment 7817 [details] git-am version of Volker's fix with metze's additional change. Fixes my test here. But we should wait for the reporters to confirm.
Hi tested it and the tmp files are removed now. Actually now when excel is open I do not see a persistent *.tmp file any more but a file named ~$Microsoft Excel-Arbeitsblatt (neu).xlsx which is removed after the close of excel Thank you very much for your attention and for the impressiv quick fix. I know it is not common or even not polite with so many followers to raise the focus on another strange problem that exists in the whole 3.6.x series. It is bug https://bugzilla.samba.org/show_bug.cgi?id=8881 It is nor critical because a workarround ( https://bugzilla.samba.org/show_bug.cgi?id=8881#c3 ) exists, but a little bit anoying... If someone would be able to take a look at it, I can do further testing Regards Hansjörg
Pushed to v3-6-test. Will be included in the next bugfix release. Closing out bug report. Thanks!
(In reply to comment #56) > Pushed to v3-6-test. > Will be included in the next bugfix release. > Closing out bug report. > > Thanks! Does it contain the fix for smbstatus crash? Thanks.
(In reply to comment #57) > (In reply to comment #56) > > Pushed to v3-6-test. > > Will be included in the next bugfix release. > > Closing out bug report. > > > > Thanks! > > Does it contain the fix for smbstatus crash? Thanks. Yes, exactly.
Karolin, I just tested this on a local system. It currently crashes, using v3-6-test. It needs the patch from Jeremy that Volker +'d "git-am fix for 3.6.next." Due to the crash still being there, I am reopening the issue. -Ira
(In reply to comment #59) > Karolin, > > I just tested this on a local system. It currently crashes, using v3-6-test. > > It needs the patch from Jeremy that Volker +'d "git-am fix for 3.6.next." > > Due to the crash still being there, I am reopening the issue. > > -Ira Ira, thanks a lot for catching this one! Pushed to v3-6-test. Closing out bug report. Thanks!