Using Samba 3.0.20b as a domain member file server. The server is a Dell PowerEdge 700 with a 2.4 GHz P4 w/ HT and 1 GB RAM, running Fedora Core 3 with kernel-2.6.12-1.1381_FC3. We have been upgrading Samba since samba-3.0.14a up to the current version.
Around 50 users connect to this server daily. It serves as both a file server and a Postgres database server. We share MS Access frontends that connect to the Postgres db, along with various Excel, Word, and other Office files. Our Peachtree accounting software is also stored on this server.
The problem we keep seeing is that smbd processes grow and grow until the Samba service stops responding. All other services on the server keep functioning; only Samba fails. Restarting the Samba service does not help, and we end up rebooting the server to get Samba working again.
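To put numbers on the growth before the next failure, something like the following could be run from cron. It is only a minimal sketch: it assumes "grow" means resident memory, and the function names and the log file name `smbd_rss.log` are made up for illustration. It reads the `Name:` and `VmRSS:` fields from `/proc/<pid>/status` and records one line per smbd process.

```python
import os
import time


def parse_status(text):
    """Return (name, rss_kb) from the contents of /proc/<pid>/status."""
    name, rss_kb = None, None
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            # Line looks like "VmRSS:     12345 kB"
            rss_kb = int(line.split()[1])
    return name, rss_kb


def smbd_memory(proc_root="/proc"):
    """Map each smbd PID to its resident set size in kB."""
    usage = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return usage  # no procfs available at this path
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                name, rss_kb = parse_status(f.read())
        except OSError:
            continue  # process exited while we were scanning
        if name == "smbd" and rss_kb is not None:
            usage[int(entry)] = rss_kb
    return usage


def log_once(path="smbd_rss.log"):
    """Append one timestamped line per smbd process (hypothetical log path)."""
    stamp = time.strftime("%Y/%m/%d %H:%M:%S")
    with open(path, "a") as log:
        for pid, rss_kb in sorted(smbd_memory().items()):
            log.write(f"{stamp} pid={pid} rss_kb={rss_kb}\n")
```

Calling `log_once()` every few minutes would show whether a single smbd process balloons or whether all of them grow together, which may help narrow down the cause.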
I wish I could say this happened at regular intervals, but that is not the case. I've seen it crash and then run for months on end, and I've also seen it crash and then crash again the very next day. Sometimes it lasts for 2 weeks, sometimes it makes it a month. This has been happening for the last year. We have tried several parameter changes, such as Kernel Oplocks = off, machine password timeout = 0, and deadtime = 15, none of which helped. Below is an error I finally captured, but I don't know whether it is relevant to what is going on. Since I don't know when this will happen next, I will try to update this report with an strace the next time it crashes. Please advise.
Error captured with log level = 3 on 03/20/06 (not the day the server crashed, which was 03/22/06):
[2006/03/20 15:14:05, 0] lib/util.c:smb_panic2(1548)
PANIC: internal error
[2006/03/20 15:14:05, 0] lib/util.c:smb_panic2(1556)
BACKTRACE: 22 stack frames:
#0 smbd(smb_panic2+0x8a) [0xb7e4fe03]
#1 smbd(smb_panic+0x19) [0xb7e50037]
#2 smbd [0xb7e3bef1]
#4 smbd(cli_start_connection+0x37e) [0xb7d32427]
#5 smbd(cli_full_connection+0x6a) [0xb7d32573]
#6 smbd(enumerate_domain_trusts+0x145) [0xb7e9a45a]
#7 smbd(update_trustdom_cache+0xdd) [0xb7e99f3b]
#8 smbd(is_trusted_domain+0x65) [0xb7e94519]
#9 smbd(make_user_info_map+0x163) [0xb7e94761]
#10 smbd [0xb7e95367]
#11 smbd [0xb7d5870f]
#12 smbd(ntlmssp_update+0x143) [0xb7d57c41]
#13 smbd(auth_ntlmssp_update+0x44) [0xb7e95726]
#14 smbd [0xb7cefaba]
#15 smbd(reply_sesssetup_and_X+0x4f1) [0xb7cf1069]
#16 smbd [0xb7d1cfa3]
#17 smbd(process_smb+0x19b) [0xb7d1d3c8]
#18 smbd(smbd_process+0x13a) [0xb7d1e26d]
#19 smbd(main+0x91e) [0xb7ed8455]
#20 /lib/tls/libc.so.6(__libc_start_main+0xd3) [0xb78b1e23]
#21 smbd [0xb7cb4e41]
*** This bug has been marked as a duplicate of 3636 ***