Bug 3817 - smbd won't accept new connections, appears locking problem
Summary: smbd won't accept new connections, appears locking problem
Status: RESOLVED INVALID
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: User/Group Accounts (show other bugs)
Version: 3.0.22
Hardware: x86 Linux
: P3 critical
Target Milestone: none
Assignee: Gerald (Jerry) Carter (dead mail address)
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-06-01 03:05 UTC by Wilco Baan Hofman
Modified: 2006-06-21 07:25 UTC (History)
0 users

See Also:


Attachments
Output of smbstatus (80.66 KB, text/plain)
2006-06-01 03:06 UTC, Wilco Baan Hofman
no flags Details
Backtraces of all smbd processes (374.98 KB, text/plain)
2006-06-01 03:43 UTC, Wilco Baan Hofman
no flags Details
Backtraces of all processes without /lib/tls (278.47 KB, text/plain)
2006-06-14 03:58 UTC, Wilco Baan Hofman
no flags Details
Backtrace of all smbd processes, at another customer site (359.81 KB, text/plain)
2006-06-14 05:06 UTC, Wilco Baan Hofman
no flags Details
Level 10 debug log w/ 3.0.14a and pthread (593.89 KB, text/plain)
2006-06-21 07:18 UTC, Wilco Baan Hofman
no flags Details
Backtrace corresponding with the log (8.34 KB, text/plain)
2006-06-21 07:19 UTC, Wilco Baan Hofman
no flags Details
libpcap format network dump file (binary) (741.70 KB, application/octet-stream)
2006-06-21 07:20 UTC, Wilco Baan Hofman
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Wilco Baan Hofman 2006-06-01 03:05:08 UTC
Hi, I have the following problem at a customer site:

Symptoms:
 - I can do authentification, but logging onto a share hangs it, until timeout
 - I have 188 open smbd processes, with only 5-10 clients logged in
 - Almost every smbd process hangs in fcntl in tdb_brlock (waiting for a lock to disappear?)
 - approximatly 300 locks are assigned (attaching smbstatus output)
 - Happens every day or every other day.
 - Tracing the main daemon shows it runs out of new smbd processes to spawn (as far as I understand it)
 - Killing all smbd processes fixes it for a day or two.

On child (with the lowest pid) hangs differently:
#0  0xb7eaf189 in pthread_setcanceltype () from /lib/tls/libc.so.6
#1  0xb7e528a8 in adjtime () from /lib/tls/libc.so.6
#2  0xb7e5335e in tzset () from /lib/tls/libc.so.6
#3  0xb7e58406 in strftime_l () from /lib/tls/libc.so.6
#4  0xb7e5835f in strftime () from /lib/tls/libc.so.6
#5  0x081adf25 in timestring (hires=0) at lib/time.c:722
#6  0x081a9484 in dbghdr (level=0, file=0x8270f80 "lib/fault.c",
    func=0x8270f73 "fault_report", line=36) at lib/debug.c:947
#7  0x081aa34e in sig_fault (sig=6) at lib/fault.c:36
#8  <signal handler called>
#9  0xb7df97c7 in raise () from /lib/tls/libc.so.6
#10 0xb7dfb06b in abort () from /lib/tls/libc.so.6
#11 0xb7e30545 in __fsetlocking () from /lib/tls/libc.so.6
#12 0xb7e36b97 in malloc_usable_size () from /lib/tls/libc.so.6
#13 0xb7e37032 in free () from /lib/tls/libc.so.6
#14 0x0817f5b1 in get_share_modes (conn=0x8342958, dev=2050, inode=2622823,
    pp_shares=0xbfcc30f4) at locking/locking.c:523
#15 0x080b7c01 in open_mode_check (conn=0x8342958,
    fname=0xbfcc3660 "Ambrasoft/Tafel Totaal/TSchlMenu.exe", dev=2050,
    inode=2622823, desired_access=<error type>, share_mode=64,
    p_flags=0xbfcc31d0, p_oplock_request=0xbfcc320c,
    p_all_current_opens_are_level_II=0xbfcc31cc) at smbd/open.c:611
#16 0x080b964f in open_file_shared1 (conn=0x8342958,
    fname=0xbfcc3660 "Ambrasoft/Tafel Totaal/TSchlMenu.exe", psbuf=0xbfcc3a60,
    desired_access=<error type>, share_mode=64, ofun=1,
    new_dos_mode=<error type>, oplock_request=3, Access=0xbfcc3ad0,
    paction=0xbfcc3acc) at smbd/open.c:1181
#17 0x08096fae in reply_ntcreate_and_X (conn=0x8342958, inbuf=0xb7afd008 "",
    outbuf=0xb79f5008 "", length=164, bufsize=131072) at smbd/nttrans.c:844
#18 0x080cb324 in switch_message (type=0, inbuf=0xb7afd008 "",
    outbuf=0xb79f5008 "", size=164, bufsize=131072) at smbd/process.c:968
#19 0x080cb6e1 in process_smb (inbuf=0xb7afd008 "", outbuf=0xb79f5008 "")
    at smbd/process.c:998
#20 0x080cc3e3 in smbd_process () at smbd/process.c:1558
#21 0x0821e473 in main (argc=2, argv=0xbfcc4ff4) at smbd/server.c:951

Most other childs:
#0  0xb7e9255e in fcntl () from /lib/tls/libc.so.6
#1  0x081cd4bf in tdb_brlock (tdb=0x8324980, offset=<error type>, rw_type=0, lck_type=14, probe=0) at tdb/tdb.c:253
#2  0x081cd68b in tdb_lock (tdb=0x8324980, list=106, ltype=0) at tdb/tdb.c:307
#3  0x081ce551 in tdb_find_lock_hash (tdb=0x8324980, key={dptr = 0x82a39a0 "DOE", dsize = <error type>}, hash=<error type>, locktype=0, rec=0xbfcc2c24)
    at tdb/tdb.c:1045
#4  0x081ce660 in tdb_fetch (tdb=0x8324980, key={dptr = 0x82a39a0 "DOE", dsize = <error type>}) at tdb/tdb.c:1127
#5  0x0818081d in get_deferred_opens (conn=0x832da10, dev=2049, inode=1096225, pp_de_entries=0xbfcc3114) at locking/locking.c:975
#6  0x080b8d4d in delete_defered_open_entry_record (conn=0x832da10, dev=2049, inode=1096225) at smbd/open.c:803
#7  0x080b9998 in open_file_shared1 (conn=0x832da10, fname=0xbfcc3660 "Sjablonen/prf92.tmp", psbuf=0xbfcc3a60, desired_access=<error type>, share_mode=16,
    ofun=16, new_dos_mode=<error type>, oplock_request=3, Access=0xbfcc3ad0, paction=0xbfcc3acc) at smbd/open.c:1491
#8  0x08096fae in reply_ntcreate_and_X (conn=0x832da10, inbuf=0xb7afd008 "", outbuf=0xb79f5008 "", length=130, bufsize=131072) at smbd/nttrans.c:844
#9  0x080cb324 in switch_message (type=-512, inbuf=0xb7afd008 "", outbuf=0xb79f5008 "", size=130, bufsize=131072) at smbd/process.c:968
#10 0x080cb6e1 in process_smb (inbuf=0xb7afd008 "", outbuf=0xb79f5008 "") at smbd/process.c:998
#11 0x080cc3e3 in smbd_process () at smbd/process.c:1558
#12 0x0821e473 in main (argc=2, argv=0xbfcc4ff4) at smbd/server.c:951

However, this is a production environment and the customer is going to kill us if we don't fix it fast.. so I might have to kill the smbd processes again within a few hours.
Comment 1 Wilco Baan Hofman 2006-06-01 03:06:42 UTC
Created attachment 1927 [details]
Output of smbstatus
Comment 2 Wilco Baan Hofman 2006-06-01 03:10:13 UTC
Oh, forgot to tell, backtrace is of samba 3.0.14 (have identical problem with 3.0.22, but this was easier for debugging right now)
Comment 3 Wilco Baan Hofman 2006-06-01 03:41:10 UTC
I had to kill the process which allegedly caused the problem, now 231 smbd processes, but auth works again. Still a lot fcntl processes left. I'll make backtraces of all smbd processes and attach the file. 

Ohyeah, no relevant log messages found as it just hangs after a while (tried debuglevel 10)



Comment 4 Wilco Baan Hofman 2006-06-01 03:43:00 UTC
Created attachment 1928 [details]
Backtraces of all smbd processes
Comment 5 Wilco Baan Hofman 2006-06-01 03:49:41 UTC
I've just killed 13022 which had the same problem (see full backtrace attachment) and all is running fine again, back to 9 smbd processes and a handful of locks.

Downside is I just lost myself the testcase, 'til day after tomorrow I'm afraid..
Comment 6 Gerald (Jerry) Carter (dead mail address) 2006-06-01 06:47:45 UTC
This is probably caused by linking with the pthreads library.
Do you know how that got linked in?
Comment 7 Wilco Baan Hofman 2006-06-01 07:22:33 UTC
libpthread is linked in through libldap_r.so.2.
Comment 8 Wilco Baan Hofman 2006-06-01 10:37:07 UTC
I do use the passdb ldap backend. Any suggestions?
Comment 9 Wilco Baan Hofman 2006-06-14 03:58:30 UTC
Created attachment 1959 [details]
Backtraces of all processes without /lib/tls
Comment 10 Wilco Baan Hofman 2006-06-14 04:01:33 UTC
This is probably the relevant backtrace (without /lib/tls):

0xb7d35e94 in __pthread_sigsuspend () from /lib/libpthread.so.0
#0  0xb7d35e94 in __pthread_sigsuspend () from /lib/libpthread.so.0
#1  0xb7d34d68 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2  0xb7d37716 in __pthread_alt_lock () from /lib/libpthread.so.0
#3  0xb7d3428c in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0xb7e5f9c2 in free () from /lib/libc.so.6
#5  0xb7e79428 in adjtime () from /lib/libc.so.6
#6  0xb7e79ecb in tzset () from /lib/libc.so.6
#7  0xb7e7ebe3 in strftime_l () from /lib/libc.so.6
#8  0xb7e7ea68 in strftime () from /lib/libc.so.6
#9  0x081adf25 in timestring (hires=0) at lib/time.c:722
#10 0x081a9484 in dbghdr (level=0, file=0x8270f80 "lib/fault.c", 
    func=0x8270f73 "fault_report", line=36) at lib/debug.c:947
#11 0x081aa34e in sig_fault (sig=6) at lib/fault.c:36
#12 0xb7d38d26 in __pthread_sighandler () from /lib/libpthread.so.0
#13 <signal handler called>
#14 0xb7e236f1 in kill () from /lib/libc.so.6
#15 0xb7d36030 in pthread_kill () from /lib/libpthread.so.0
#16 0xb7d363fb in raise () from /lib/libpthread.so.0
#17 0xb7e23464 in raise () from /lib/libc.so.6
#18 0xb7e249d8 in abort () from /lib/libc.so.6
#19 0xb7e5911f in __fsetlocking () from /lib/libc.so.6
#20 0xb7e5efb2 in malloc_usable_size () from /lib/libc.so.6
#21 0xb7e5f978 in free () from /lib/libc.so.6
#22 0x0817f5b1 in get_share_modes (conn=0x832e490, dev=2050, inode=6358381, 
    pp_shares=0xbfed7c64) at locking/locking.c:523
#23 0x080b7c01 in open_mode_check (conn=0x832e490, 
    fname=0xbfed81d0 "Ambrasoft/RekenSOM/rschlmenu.exe", dev=2050, inode=6358381, 
    desired_access=<error type>, share_mode=32800, p_flags=0xbfed7d40, 
    p_oplock_request=0xbfed7d7c, p_all_current_opens_are_level_II=0xbfed7d3c)
    at smbd/open.c:611
#24 0x080b964f in open_file_shared1 (conn=0x832e490, 
    fname=0xbfed81d0 "Ambrasoft/RekenSOM/rschlmenu.exe", psbuf=0xbfed85d0, 
    desired_access=<error type>, share_mode=32800, ofun=1, 
    new_dos_mode=<error type>, oplock_request=3, Access=0xbfed8640, 
    paction=0xbfed863c) at smbd/open.c:1181
#25 0x08096fae in reply_ntcreate_and_X (conn=0x832e490, inbuf=0xb7ae9008 "", 
    outbuf=0xb79e1008 "", length=156, bufsize=131072) at smbd/nttrans.c:844
#26 0x080cb324 in switch_message (type=0, inbuf=0xb7ae9008 "", 
    outbuf=0xb79e1008 "", size=156, bufsize=131072) at smbd/process.c:968
#27 0x080cb6e1 in process_smb (inbuf=0xb7ae9008 "", outbuf=0xb79e1008 "")
    at smbd/process.c:998
#28 0x080cc3e3 in smbd_process () at smbd/process.c:1558
#29 0x0821e473 in main (argc=2, argv=0xbfed9b34) at smbd/server.c:951
Comment 11 Wilco Baan Hofman 2006-06-14 05:03:59 UTC
Turns out we have the same problem at another customer, attaching another backtrace, in case it's of any use...

This may mean that there's a more serious problem when using the LDAP library. 

I administer 108 linux servers atm, of which 5 in this LDAP setup and about 60 with the MySQL setup, the rest in other setups.

I currently only notice this problem when using the ldap passdb backend. on the other boxes libldap is linked in and thus also libpthread.so.0, but we don't use the ldap library there.

Hope this helps give a bit more insight into the problem.
Comment 12 Wilco Baan Hofman 2006-06-14 05:06:53 UTC
Created attachment 1961 [details]
Backtrace of all smbd processes, at another customer site
Comment 13 Volker Lendecke 2006-06-14 05:38:55 UTC
If this really comes down to LDAP, something inside the chain nss_ldap->openldap->tls has to activate pthreads. Your smbds seem to block in free() that itself is calling pthread stuff. As the debian developers said this is not happening anywhere, we are looking at a miracle. Samba definitely does not call any pthread routines, I just grepped for pthread and found a single include line in ./nsswitch/winbind_nss_hpux.h, hardly used on your system :-)

Please don't get me wrong, I'd _really_ love to see this fixed, but this would involve walking through a good piece of source code of your system that has nothing to do with Samba.

Closing this as invalid, as I don't see how we inside the Samba Team can fix this.

Sorry,

Volker
Comment 14 Wilco Baan Hofman 2006-06-15 07:16:47 UTC
Strangely, after removing pthreads from libldap2 and libnss-ldap and installing nscd, it gets stuck in a loop here:

#0  0xb7e47dfe in write () from /lib/libc.so.6
#1  0x00000009 in ?? ()
#2  0x08334a70 in ?? ()
#3  0x081b2158 in x_fflush (f=0x8334a70) at lib/xfile.c:239
#4  0x081b2237 in x_fwrite (p=0x832ec20, size=<error type>, nmemb=<error type>, f=0x8334a70) at lib/xfile.c:168
#5  0x081b2302 in x_vfprintf (f=0x8334a70, format=0x8270cfc "[%s, %d%s] %s:%s(%d)\n", ap=0xbf870fc4 " K*\b")
    at lib/xfile.c:203
#6  0x081a9a5d in Debug1 (format_str=0x8270cfc "[%s, %d%s] %s:%s(%d)\n") at lib/debug.c:789
#7  0x081a94a9 in dbghdr (level=0, file=0x8270f80 "lib/fault.c", func=0x8270f73 "fault_report", line=36)
    at lib/debug.c:947
#8  0x081aa34e in sig_fault (sig=6) at lib/fault.c:36
#9  <signal handler called>
#10 0xb7dbf6f1 in kill () from /lib/libc.so.6
#11 0xb7dbf455 in raise () from /lib/libc.so.6
#12 0xb7dc09d8 in abort () from /lib/libc.so.6
#13 0xb7df511f in __fsetlocking () from /lib/libc.so.6
#14 0xb7dfafb2 in malloc_usable_size () from /lib/libc.so.6
#15 0xb7dfb978 in free () from /lib/libc.so.6
#16 0x0817f5b1 in get_share_modes (conn=0x8333ba8, dev=2050, inode=3130783, pp_shares=0xbf871f64)
    at locking/locking.c:523
#17 0x080b7c01 in open_mode_check (conn=0x8333ba8, fname=0xbf8724d0 "Ambrasoft/RealSpeak/lhsp01.dll", dev=2050,
    inode=3130783, desired_access=<error type>, share_mode=32800, p_flags=0xbf872040,
    p_oplock_request=0xbf87207c, p_all_current_opens_are_level_II=0xbf87203c) at smbd/open.c:611
---Type <return> to continue, or q <return> to quit---
#18 0x080b964f in open_file_shared1 (conn=0x8333ba8, fname=0xbf8724d0 "Ambrasoft/RealSpeak/lhsp01.dll",
    psbuf=0xbf8728d0, desired_access=<error type>, share_mode=32800, ofun=1, new_dos_mode=<error type>,
    oplock_request=3, Access=0xbf872940, paction=0xbf87293c) at smbd/open.c:1181
#19 0x08096fae in reply_ntcreate_and_X (conn=0x8333ba8, inbuf=0xb7759008 "", outbuf=0xb770c008 "", length=152,
    bufsize=131072) at smbd/nttrans.c:844
#20 0x080cb324 in switch_message (type=0, inbuf=0xb7759008 "", outbuf=0xb770c008 "", size=152, bufsize=131072)
    at smbd/process.c:968
#21 0x080cb6e1 in process_smb (inbuf=0xb7759008 "", outbuf=0xb770c008 "") at smbd/process.c:998
#22 0x080cc3e3 in smbd_process () at smbd/process.c:1558
#23 0x0821e473 in main (argc=2, argv=0xbf873e34) at smbd/server.c:951
Comment 15 Volker Lendecke 2006-06-15 07:27:50 UTC
Another quick question: Is the locking.tdb on a reiser filesystem?

Volker
Comment 16 Wilco Baan Hofman 2006-06-15 07:31:39 UTC
No, ext3 filesystem.
Comment 17 Volker Lendecke 2006-06-15 07:37:41 UTC
You say it does loop in that backtrace? So this means it continually spits out debug statements? Or is it fully stuck?

Volker
Comment 18 Wilco Baan Hofman 2006-06-15 07:46:22 UTC
No, write gets called over and over again. (I set the breakpoint to write there and it kept breaking)

Now at least I get some better results, and a log message in my syslog:

Signal 6, this message two times in the log:

smbd[24936]: [2006/06/15 13:34:27, 0] lib/fault.c:fault_report(36)
smbd[24936]:   ===============================================================
smbd[24936]: [2006/06/15 13:34:27, 0] lib/fault.c:fault_report(37)
smbd[24936]:   INTERNAL ERROR: Signal 6 in pid 24936 (3.0.14a-Debian)
smbd[24936]:   Please read the appendix Bugs of the Samba HOWTO collection
smbd[24936]: [2006/06/15 13:34:27, 0] lib/fault.c:fault_report(39)
smbd[24936]:   ===============================================================
smbd[24936]: [2006/06/15 13:34:27, 0] lib/util.c:smb_panic2(1474)
smbd[24936]:   smb_panic: clobber_region() last called from [get_share_modes(501)]
smbd[24936]: [2006/06/15 13:34:27, 0] lib/util.c:smb_panic2(1485)
smbd[24936]:   smb_panic(): calling panic action [/usr/share/samba/panic-action 24936]
smbd[24936]: [2006/06/15 13:34:28, 0] lib/util.c:smb_panic2(1493)
smbd[24936]:   smb_panic(): action returned status 0
smbd[24936]: [2006/06/15 13:34:28, 0] lib/util.c:smb_panic2(1495)
smbd[24936]:   PANIC: internal error
smbd[24936]: [2006/06/15 13:34:28, 0] lib/util.c:smb_panic2(1503)
smbd[24936]:   BACKTRACE: 18 stack frames:
smbd[24936]:    #0 /usr/sbin/smbd(smb_panic2+0x9d) [0x81bc0bd]
smbd[24936]:    #1 /usr/sbin/smbd(smb_panic+0x11) [0x81bc2b1]
smbd[24936]:    #2 /usr/sbin/smbd [0x81aa3dd]
smbd[24936]:    #3 [0xffffe420]
smbd[24936]:    #4 /lib/libc.so.6(abort+0x108) [0xb7dc09d8]
smbd[24936]:    #5 /lib/libc.so.6 [0xb7df511f]
smbd[24936]:    #6 /lib/libc.so.6 [0xb7dfafb2]
smbd[24936]:    #7 /lib/libc.so.6(__libc_free+0x78) [0xb7dfb978]
smbd[24936]:    #8 /usr/sbin/smbd(get_share_modes+0xc1) [0x817f5b1]
smbd[24936]:    #9 /usr/sbin/smbd [0x80b7c01]
smbd[24936]:    #10 /usr/sbin/smbd(open_file_shared1+0x58f) [0x80b964f]
smbd[24936]:    #11 /usr/sbin/smbd(reply_ntcreate_and_X+0xa8e) [0x8096fae]
smbd[24936]:    #12 /usr/sbin/smbd [0x80cb324]
smbd[24936]:    #13 /usr/sbin/smbd(process_smb+0x171) [0x80cb6e1]
smbd[24936]:    #14 /usr/sbin/smbd(smbd_process+0x1a3) [0x80cc3e3]
smbd[24936]:    #15 /usr/sbin/smbd(main+0x7e3) [0x821e473]
smbd[24936]:    #16 /lib/libc.so.6(__libc_start_main+0x9e) [0xb7dab3be]
smbd[24936]:    #17 /usr/sbin/smbd [0x8077991]
smbd[24936]:
Comment 19 Volker Lendecke 2006-06-15 07:51:11 UTC
Hmm. The only real chance I see to further debug this is to compile with debugging symbols (-g) and run smbd under valgrind. I know this will probably not work in production, because valgrind usually slows stuff down by a factor of more than 10, but if you have a reproducable case this might show the real problem.

Volker
Comment 20 Jeremy Allison 2006-06-15 11:45:56 UTC
This line :

#13 0xb7df511f in __fsetlocking () from /lib/libc.so.6

in the backtrace shows something strange. Why is this being called from free() ?

Do you still have pthreads linked in ? If not, looks like memory corruption to me. Run with the glibc MALLOC_CHECK_ turned up high (man malloc for details) and try again.

Jeremy,
Comment 21 Wilco Baan Hofman 2006-06-15 16:07:30 UTC
Yes, this was without pthreads linked in (checked it thoroughly).

I've tried upgrading to 3.0.23rc2 for now, retrying with that one, see what happens.

In the weekend I'll set up another test case see if I can reproduce it and capture what's going on.. give valgrind a go..
Comment 22 Wilco Baan Hofman 2006-06-19 03:09:38 UTC
This weekend I've done some testing.. made the system reproducable. It seemed to crash before the NT Create AndX response (as we can see in the backtrace also). It only happened when starting a particular software package with 8 computers at the same time (as happens in classrooms, number may vary).

We started by making it reproducable with a batch file, waiting for a start.txt file to appear and then start the software package. This crashed samba in 80% of the cases. Tried 3.0.14a, 3.0.22 and 3.0.23rc3 of which 3.0.14a and 3.0.23rc3 with and without pthread library. 

Though you'd like to know that this hang happens with 3.0.14a both with and without pthreads. With 3.0.22 I only tried with pthreads, it also hanged.

With 3.0.23rc3 both with and without pthreads no hangs occur (at least with my testing), so it appears that it is fixed.

Thanks for your efforts.
Comment 23 Volker Lendecke 2006-06-19 03:17:38 UTC
Okay, if you got it reproducable, can you upload a full sniff of it? Does it also happen with debug level 10? If yes, please also upload one. Maybe with the sniff we can get it down to some very few operations.

Volker

P.S: Thank you for persisting.... :-)  As I said, I'd like to nail that one. If it does not happen with 3.0.23, I want to know *why* :-)
Comment 24 Wilco Baan Hofman 2006-06-21 07:18:36 UTC
Created attachment 1973 [details]
Level 10 debug log w/ 3.0.14a and pthread
Comment 25 Wilco Baan Hofman 2006-06-21 07:19:41 UTC
Created attachment 1974 [details]
Backtrace corresponding with the log
Comment 26 Wilco Baan Hofman 2006-06-21 07:20:37 UTC
Created attachment 1975 [details]
libpcap format network dump file (binary)
Comment 27 Wilco Baan Hofman 2006-06-21 07:25:01 UTC
Just uploaded three files:
* Network dump
* Level 10 smbd debug log
* Backtrace corresponding with the dump and log

Note that calls to ambra.bat and start.txt should be ignored, this is my torture batchfile.

The batch-file Ambra.bat:
@echo off
:10
if exist N:\start.txt goto 20
goto 10
:20
S:
cd "\ambrasoft\tafel totaal"
:30
start /w tschlmenu.exe
goto 10

Logs and traces made with:
touch start.txt;(tail -f /var/log/samba/samba.log >log.smbd.hang &);tethereal -F libpcap -i eth1 -w netdump.log port not 22;killall tail;nice -n -20 ./genbacktrace.sh; rm start.txt

With this I tried to make the network dump and logs as relevant as possible.

Good luck!

Wilco