Bug 5390 - Samba crashes when uploading on gigabit ethernet
Summary: Samba crashes when uploading on gigabit ethernet
Status: NEW
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: File Services
Version: 3.0.25
Hardware: x64 Linux
Importance: P3 critical
Target Milestone: none
Assignee: Samba Bugzilla Account
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-04-11 14:21 UTC by LIO
Modified: 2009-12-18 10:39 UTC

Description LIO 2008-04-11 14:21:21 UTC
A problem occurs when I upload a large amount of data from a Windows 2k3 server to the Samba server. After about an hour of uploading, Samba crashes with the error "setresuid failed with EAGAIN. uid(501) might be over its NPROC limit".
The Win2k3 server fails the copy with a "no disk space" error.
Afterwards I can restart the copy, and then the error occurs again.

$ ulimit -u
15351

Changes to /etc/security/limits.conf have no effect.
I even tried setting
*              hard    nproc   102400
and it made no difference.
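
For context: limits.conf is applied by pam_limits to login sessions, so the value that "ulimit -u" prints in a shell is not necessarily the limit an smbd started at boot is running with. A minimal sketch for comparing a process limit with the number of processes a uid currently owns is given here. It assumes a Linux /proc filesystem; the helper names are hypothetical and are not part of Samba. The count is only an approximation, since the kernel's NPROC accounting is per real uid and also includes threads, which this directory scan does not see.

```python
import os
import resource


def count_processes_for_uid(uid, proc_root="/proc"):
    """Count numeric (PID-like) entries under proc_root owned by uid.

    Approximation only: threads are not listed at this level, and the
    directory owner reflects the effective uid rather than the real uid.
    """
    count = 0
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-PID entries such as "self" or "net"
        try:
            if os.stat(os.path.join(proc_root, name)).st_uid == uid:
                count += 1
        except OSError:
            # The process exited between listdir() and stat().
            continue
    return count


def nproc_headroom(uid, proc_root="/proc"):
    """Return (soft_limit, hard_limit, processes_owned_by_uid).

    The limits are those of the *calling* process, which may differ
    from the limits of a running smbd.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return soft, hard, count_processes_for_uid(uid, proc_root)


if __name__ == "__main__":
    uid = os.getuid()
    soft, hard, used = nproc_headroom(uid)
    print("uid %d: %d processes, soft limit %s, hard limit %s"
          % (uid, used, soft, hard))
```

Running this as the affected user (uid 501 in this report) shows the limits seen by that session. The limits of the running smbd itself may be different, so this only narrows down where the mismatch is.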

Ethernet: gigabit copper.
Comment 1 Tim Mertens 2009-12-18 10:39:34 UTC
We are also seeing what appears to be the same issue on a RHEL 5 server with Samba 3.0.33.  Here is a longer excerpt of the log from when the error occurs.  After it occurs once, the error keeps occurring in new smbd processes:

Dec 17 18:09:33 hdxchangeserver smbd[13057]: [2009/12/17 18:09:33, 0] lib/util_sec.c:set_effective_uid(205) 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:   setresuid failed with EAGAIN. uid(10006) might be over its NPROC limit 
Dec 17 18:09:33 hdxchangeserver smbd[13057]: [2009/12/17 18:09:33, 0] lib/util_sec.c:assert_uid(101) 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:   Failed to set uid privileges to (-1,10006) now set to (0,0) 
Dec 17 18:09:33 hdxchangeserver smbd[13057]: [2009/12/17 18:09:33, 0] lib/util.c:smb_panic(1655) 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:   PANIC (pid 13057): failed to set uid 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:    
Dec 17 18:09:33 hdxchangeserver smbd[13057]: [2009/12/17 18:09:33, 0] lib/util.c:log_stack_trace(1759) 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:   BACKTRACE: 11 stack frames: 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:    #0 smbd(log_stack_trace+0x2d) [0xb9649d] 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:    #1 smbd(smb_panic+0x5d) [0xb965cd] 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:    #2 smbd [0xb9c9be] 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:    #3 smbd [0xa06bbc] 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:    #4 smbd(set_sec_ctx+0x112) [0xa06f62] 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:    #5 smbd(change_to_user+0x485) [0x9fa9b5] 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:    #6 smbd [0xa15e45] 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:    #7 smbd(smbd_process+0x836) [0xa17236] 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:    #8 smbd(main+0xbdd) [0xc73f7d] 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:    #9 /lib/libc.so.6(__libc_start_main+0xdc) [0x56fdec] 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:    #10 smbd [0x99a7f1] 
Dec 17 18:09:33 hdxchangeserver smbd[13057]: [2009/12/17 18:09:33, 0] lib/fault.c:dump_core(181) 
Dec 17 18:09:33 hdxchangeserver smbd[13057]:   dumping core in /var/log/samba/cores/smbd 
Dec 17 18:09:33 hdxchangeserver smbd[13057]: 
Dec 17 18:33:17 hdxchangeserver smbd[3627]: [2009/12/17 18:33:17, 0] lib/util_sec.c:assert_uid(101) 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:   Failed to set uid privileges to (10004,10004) now set to (0,0) 
Dec 17 18:33:17 hdxchangeserver smbd[3627]: [2009/12/17 18:33:17, 0] lib/util.c:smb_panic(1655) 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:   PANIC (pid 3627): failed to set uid 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    
Dec 17 18:33:17 hdxchangeserver smbd[3627]: [2009/12/17 18:33:17, 0] lib/util.c:log_stack_trace(1759) 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:   BACKTRACE: 13 stack frames: 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #0 smbd(log_stack_trace+0x2d) [0xb9649d] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #1 smbd(smb_panic+0x5d) [0xb965cd] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #2 smbd [0xb9c9be] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #3 smbd [0xb9f92b] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #4 smbd [0xb9fde6] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #5 smbd(message_send_pid+0x34) [0xba0384] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #6 smbd(release_level_2_oplocks_on_change+0x120) [0xbd0c90] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #7 smbd(reply_lockingX+0x223) [0x9d2963] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #8 smbd [0xa161a0] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #9 smbd(smbd_process+0x836) [0xa17236] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #10 smbd(main+0xbdd) [0xc73f7d] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #11 /lib/libc.so.6(__libc_start_main+0xdc) [0x56fdec] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:    #12 smbd [0x99a7f1] 
Dec 17 18:33:17 hdxchangeserver smbd[3627]: [2009/12/17 18:33:17, 0] lib/fault.c:dump_core(181) 
Dec 17 18:33:17 hdxchangeserver smbd[3627]:   dumping core in /var/log/samba/cores/smbd 
Dec 17 18:33:17 hdxchangeserver smbd[3627]: 
...(This repeats multiple times with different process and uid numbers, presumably depending on which uid the process is using)
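
Since the failures repeat with different pids and uids, it may help to tally them per uid. A minimal sketch follows, assuming syslog lines in the format shown above; the function name is hypothetical. It keys on the "Failed to set uid privileges to (r,e)" message, which appears in both traces, and groups the smbd pids by the effective uid that could not be set.

```python
import re
from collections import defaultdict

# Matches e.g.:
#   ... smbd[13057]:   Failed to set uid privileges to (-1,10006) now set to (0,0)
ASSERT_RE = re.compile(
    r"smbd\[(\d+)\]:\s+Failed to set uid privileges to \((-?\d+),(-?\d+)\)"
)


def failures_by_uid(lines):
    """Map each target effective uid to the smbd pids that failed to set it."""
    result = defaultdict(list)
    for line in lines:
        match = ASSERT_RE.search(line)
        if match:
            pid = int(match.group(1))
            target_euid = int(match.group(3))
            result[target_euid].append(pid)
    return dict(result)


if __name__ == "__main__":
    import sys

    # Usage: python tally.py < /var/log/messages   (log path is an assumption)
    for uid, pids in sorted(failures_by_uid(sys.stdin).items()):
        print("uid %d: %d failure(s), pids %s" % (uid, len(pids), pids))
```

If one uid dominates the tally, that points at a per-user process limit. Failures spread across many uids would suggest the limit in effect for smbd itself is too low.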



Is there a patch for this?  This is a major issue for one of our servers: it has been crashing regularly, and the Samba crash causes the NFS client on the server (which reads from and writes to another server) to crash as well, which requires a hard reset to recover.

There is also a lot of information about this issue in a launchpad.net bug, which looks like a duplicate of this one:
https://bugs.launchpad.net/ubuntu/+source/samba/+bug/216358

Reading through that thread provides a good deal of information; maybe it will help in resolving the issue.

I would try to write a patch myself but I am not experienced enough to do so.  Unfortunately the server this issue occurs on needs to be up all the time so we can't easily 'experiment' with changes on it unless an official patch is provided.

Also, are this bug and bug #3426 duplicates of one another?
https://bugzilla.samba.org/show_bug.cgi?id=3426

It looks similar, but I think it might not be the same, since Samba does start and run (with intermittent failures) on our server.