We have a performance test setup which is used to test authentication load on winbindd. Winbindd is configured for a single domain. Login requests (WINBINDD_PAM_AUTH) are pumped to winbindd from client processes. These client processes are relatively short-lived in the sense that each sends a few login requests to winbindd, waits some time for the response, and otherwise times out. The authentication timeout is 25 seconds.

Everything works fine as long as winbindd responds successfully to the login requests. However, after some days (~12), winbindd login requests suddenly start failing (with timeouts). At this point it was observed that the number of open fds of the winbindd process was close to 1024 (lsof of the parent winbindd process). Most of the fds belong to client connections to "winbindd_privileged/pipe" (which is what we are using).

On closer analysis, I think this is what is happening. In the normal scenario, a login client process opens a connection to the winbindd Unix domain socket (the winbindd_privileged pipe), sends a winbindd auth command, waits for the result and finally closes the fd. At some point many clients open connections to the winbindd socket and submit login requests. If for some reason winbindd takes more time to respond, say due to a network problem, the clients time out and close their connections. But at the winbindd server end the requests are still in the queue and the server end of each connection is still open. If another set of login requests comes from the client side, these are also queued, and the number of open fds at winbindd keeps increasing.

There is some code in winbindd.c (remove_idle_client()) to handle idle connections, but it kicks in only after 200 (default) client connections. Also, based on my understanding, the cleanup code removes only one idle connection.

In summary, it seems there are two problems. One, winbindd does not detect closed connections immediately. Two, the idle-connection cleanup code removes only one idle connection.
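To illustrate, here is a rough sketch of what one of our login clients does per request. The socket path and the request/response buffers are placeholders, not the real winbindd wire format (the real protocol uses the struct winbindd_request / struct winbindd_response definitions that wb_common.c works with); it only shows the connect / send / wait-with-timeout / close lifecycle described above.

#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/un.h>

#define WB_PIPE "/var/lib/samba/winbindd_privileged/pipe"   /* assumed path */

static int do_auth_request(const char *req, size_t req_len)
{
        struct sockaddr_un sa;
        struct timeval tv = { .tv_sec = 25, .tv_usec = 0 }; /* 25s timeout */
        char reply[1024];
        ssize_t n;
        int fd;

        fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd == -1) {
                return -1;
        }

        memset(&sa, 0, sizeof(sa));
        sa.sun_family = AF_UNIX;
        strncpy(sa.sun_path, WB_PIPE, sizeof(sa.sun_path) - 1);

        if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) == -1) {
                close(fd);
                return -1;
        }

        /* Give up waiting for the response after 25 seconds. */
        setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

        if (write(fd, req, req_len) != (ssize_t)req_len) {
                close(fd);
                return -1;
        }

        /* Either we get a response, or the timeout expires.  In both
         * cases we close our end of the socket; whether the winbindd
         * parent notices that close promptly is exactly the question
         * raised in this report. */
        n = read(fd, reply, sizeof(reply));
        close(fd);

        return (n > 0) ? 0 : -1;
}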
Are you aware of the fixes that went in for bug #9104 - winbindd can mis-identify idle clients - can cause crashes and NDR parsing errors ? Specifically this attachment: https://attachments.samba.org/attachment.cgi?id=7797 Can you apply these to your 3.5.x and re-test ? It might have an effect on the problem. Cheers, Jeremy.
Thanks Jeremy. I will try it out with the new patch. However, from my understanding of the patch, it avoids wrong identification of idle clients, but in the problem I mentioned the idle client is not being removed at all for a long time, even though the client end of the connection is closed. I was thinking of checking the PID of the client while detecting idle clients.
In case your base system is RHEL: There is a kernel bug in RHEL that would trigger this. If you close one end of a unix domain stream socket (/tmp/.winbindd/pipe), the other end does not become readable. You might want to add a dummy write of one byte to wb_common.c, winbindd_close_sock(). This will at least cure the problem with winbind clients that properly shut down. The problem for crashed processes can only be fixed by RedHat. We had to add a similar workaround to vfs_aio_fork.c.
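Roughly like this in the client's close path (a stand-alone illustration only, not the actual wb_common.c code; the function name here is made up):

#include <unistd.h>

static void close_winbindd_sock_with_dummy_write(int fd)
{
        char c = 0;

        if (fd != -1) {
                /* Dummy write so that the server's poll()/select() is
                 * woken up even on kernels where closing this end does
                 * not make the peer end readable.  winbindd should then
                 * notice activity on the socket and clean up the
                 * connection when it reads EOF. */
                (void)write(fd, &c, 1);
                close(fd);
        }
}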
We are using a CentOS-based Linux with kernel 2.6.24.7 #1 SMP Mon Aug 20 19:58:17 PDT 2012 i686 i686 i386 GNU/Linux. Also, we are not using the Samba utility to authenticate but have created our own client which uses the same winbindd commands.

I modified my client to write 1 byte to the winbindd process before closing the Unix domain socket. I did not see much difference in my test; the problem gets worse. Either due to this (pile-up of fds) or due to some other issue, winbindd starts consuming high CPU (90% and above). I searched for other high-CPU issues and found the following:

https://bugzilla.samba.org/show_bug.cgi?id=9000
https://bugzilla.samba.org/show_bug.cgi?id=8409

But I am not sure the above address high CPU caused by a large number of piled-up client connections (I will be applying the above patches though).

I am proposing a solution which checks the client pid (state->pid) and cleans up a batch of client connections. I have attached a test patch of the changes in "winbindd.c". If the method is fine then I will go ahead with an actual patch. By the way, I tested these changes and they work; dead client connections are cleaned up.
Created attachment 7857 [details] Test code patch for cleanup of dead connections.
So this patch certainly covers the case where the client process is dead. But what it doesn't explain is why, when the client process is dead and the client end of the pipe is closed, the server end doesn't trigger with a POLLIN signal and cause winbindd to close the connection down by normal means ? Is there a chance you can investigate the server side of this as you control your own client library ? Can you trace what happens server-side when your client library closes its connection ? Jeremy.
Ping. Were you able to look into the issue I requested ? Jeremy.
Created attachment 7902 [details] strace of parent winbindd during the issue.
Sorry for the delay. I was trying to reproduce the issue on a standalone Samba installation. While doing so I ran into many issues installing Samba from the git base; that is why it was delayed.

I think the issue can be reproduced in standalone Samba by modifying smb.conf and making the DC unreachable after winbindd is started. In smb.conf we need to set the "winbind max clients" parameter to a low value like 5. Along with the above change it can be reproduced more quickly if we modify the wbinfo binary to time out much earlier. I am trying to do the same with Samba-3.5.14 (which we use) and Samba master (source3).

However, on the setup where the issue was observed, I collected the strace of the parent winbindd. In this case, I do not see any action at the parent winbindd when the client closes its connection and exits. There is one difference in our setup compared to standard usage of winbindd: in our case we are running winbindd in a jail environment. Either that difference or the kernel issue mentioned by Volker in comment #3 is causing winbindd not to see any event when the client closes the socket. I have attached the strace output as a file (comment #8).

I will also consult with Matthieu Patou (our in-house Samba team member) on this issue.
I have completed my testing on standalone Samba-3.5.14 and Samba master. It is reproducible in both.

The reproduction steps are simple. In winbindd/winbindd_dual_pam.c, in fork_domain_child(), introduce some delay (say 3 minutes) just before writing the result to the parent winbindd. Reduce the maximum number of clients to, say, 20 (winbind max clients = 20). Then try to authenticate using wbinfo in the background.

After a while you see that, even though the wbinfo clients have timed out, the fds are still open at the parent winbindd end. In fact, the child winbindd continues to process requests for clients that have exited.

I am also proposing the following patch. Please check.
Created attachment 7930 [details] One possible fix for the issue observed. One concern about the proposed solution is the assumption that kill(pid, 0) behaves the same on all *nixes.
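For reference, the liveness check the test patch relies on is essentially kill(pid, 0), which sends no signal and only performs the existence and permission checks. A stand-alone sketch of that check (the surrounding client-list handling in the actual patch differs; only the idiom matters here):

#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>

static bool client_process_exists(pid_t pid)
{
        if (pid <= 0) {
                return false;
        }
        if (kill(pid, 0) == 0) {
                return true;            /* process exists */
        }
        /* EPERM: the process exists but we may not signal it.
         * ESRCH (or anything else): the process is gone. */
        return (errno == EPERM);
}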
(In reply to comment #10)
> I have completed my testing on standalone Samba-3.5.14 and Samba master. It
> is reproducible in both.
>
> The reproduction steps are simple. In winbindd/winbindd_dual_pam.c, in
> fork_domain_child(), introduce some delay (say 3 minutes) just before writing
> the result to the parent winbindd. Reduce the maximum number of clients to,
> say, 20 (winbind max clients = 20). Then try to authenticate using wbinfo in
> the background.
>
> After a while you see that, even though the wbinfo clients have timed out,
> the fds are still open at the parent winbindd end. In fact, the child
> winbindd continues to process requests for clients that have exited.
>
> I am also proposing the following patch. Please check.

The behaviour you see is intentional. While a request is being processed, we do not look at the client socket. We will when it is finished.

Do the sockets still exist when the three minutes have passed?
I'm sorry but that is simply the wrong fix. The server *must* notice dead clients via the poll() call, otherwise things will break down somewhere else. We absolutely depend on this behavior working correctly to function at all. That's not to say there isn't a bug in our handling of it, but this patch just hides a much more serious underlying issue, either in Samba or in the kernel you're using.

When you say: "I have completed my testing on standalone Samba-3.5.14 and Samba master. It is reproducible in both." On what platforms are you reproducing this ?

Looking at your strace attachment I see no poll() syscall at all, I merely see a select(). Whilst this should also work, it means that you're not building winbindd correctly in some way. All builds on modern platforms should be using poll() in preference to select().

Jeremy.
(In reply to comment #13)
> I'm sorry but that is simply the wrong fix. The server *must* notice dead
> clients via the poll() call otherwise things will break down somewhere else.

... and according to my tests it does. I put the sleep() in a winbind child, ran wbinfo over the client limit, and killed some of the hanging wbinfo commands. After all the sleeps were done, winbind closed all the sockets just fine.

We need more information about your exact problem.

Volker
Jeremy, the recent test I did was on a standalone Ubuntu VM image:
------------
Linux test-VirtualBox 3.2.0-29-generic-pae #46-Ubuntu SMP Fri Jul 27 17:25:43 UTC 2012 i686 i686 i386 GNU/Linux
-----------
In these tests I see that winbindd is using poll() and not select(). The strace I attached earlier, which you looked into, was from the setup/device that we use in production. I will attach the strace of the recent run and any other test logs by tomorrow.

The problem I am facing is that when winbindd gets loaded, or many requests are queued, the close and exit from the client side are not observed/handled by the parent winbindd. Even though the client closes and exits, winbindd continues to handle those requests in the back end.
If you see winbind using select, something is severely broken. Either the compile did not detect you have poll, or you have a very old version. We should never ever use select in the winbind parent.
Volker, this is a reply to comment #12. Correct me if I am wrong: I believe the child winbindd processes the queued requests one by one from the request list. While doing so, it does not check whether the client responsible for a request is still active. After processing a request, the child winbindd hands the response to the parent, which then tries to write to the client and only then notices the broken pipe. If this is the case then, as I observed, it can lead to a kind of denial of service for new client requests.

I introduced a 3-minute delay for every client request. Yes, after one request is handled (3-minute delay) the sockets still exist. Some of these sockets belong to clients that are still active and some to clients that are not.

Reply to comment #14: the problem is that winbindd closes those connections only after all requests have been handled, even though the clients which made them exited much earlier. Imagine a scenario (the scenario that we encountered) where login requests are steadily coming in (5 to 10 logins per second). These clients' timeout is around 30 seconds. Suppose that because of some network problem with a DC, winbindd is taking more time. Then after 30 seconds the clients time out, but the requests are piled up at winbindd, and the child winbindd keeps processing requests whose other end has closed/exited.

I am looking for a solution to this problem from these angles:

- Before we pass a request from the parent winbindd to the child winbindd, identify whether the requesting client is still alive. If not, do not process it.
- Or periodically check for and remove requests from dead clients.
- Or, in the existing code that removes one client when the number of clients exceeds the maximum, enhance it to remove a defined number of clients (maybe an smb.conf parameter?). A rough sketch of this is below.
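A very rough sketch of the last option (the list head, the struct fields and remove_client() are all placeholders, not the real winbindd_cli_state handling; it only shows the shape of the sweep):

#include <errno.h>
#include <signal.h>
#include <sys/types.h>

struct client_state {                   /* placeholder for winbindd_cli_state */
        struct client_state *next;
        pid_t pid;                      /* pid the client sent at connect time */
};

extern struct client_state *client_list_head;        /* placeholder */
extern void remove_client(struct client_state *c);   /* placeholder */

static int remove_dead_clients(int max_to_remove)
{
        struct client_state *c = client_list_head;
        int removed = 0;

        while (c != NULL && removed < max_to_remove) {
                struct client_state *next = c->next;

                /* kill(pid, 0) failing with ESRCH means the client
                 * process no longer exists, so its queued work and
                 * its socket are useless. */
                if (c->pid > 0 && kill(c->pid, 0) == -1 && errno == ESRCH) {
                        remove_client(c);
                        removed++;
                }
                c = next;
        }
        return removed;
}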
Ok, now I understand what you mean. Right now this is not really covered in winbind, and the async code that makes up the main winbindd is not really prepared for it. It is certainly possible, but it would take, I think, a few weeks of coding. We need to go through the async requests that really bite and properly implement cancel routines for them. The main winbindd would cancel requests on the fly, but as they might be part-way through to the winbind helper child, this can only be done carefully and only before a request has started its way to the child.
> - Before we pass a request from the parent winbindd to the child winbindd,
>   identify whether the requesting client is still alive. If not, do not
>   process it.

That's more or less what Volker proposed. After talking offline with Volker, it seems that the best way is to remove the entry from the queue if the client connection is closed and the child is not yet processing the request. It would save some memory, as we would remove the connection as soon as it becomes useless instead of just before processing it. Still, both solutions are of similar complexity.

Another way to achieve the same result is to throttle the clients when you start to see timeouts.

As you might have noticed on samba-technical, there is a discussion about allowing more than one netlogon connection per machine to a given DC. It means that in the near future we will/can have more than one winbindd child talking to the same DC. Doing so won't directly remove your problem, but it might raise the bar high enough that you don't see it. It's more of a "sweeping the dirt under the carpet" solution, but it might be easier to implement given our constraints.
Pardon my stupid question, but as winbindd is ignoring SIGPIPE, the only way for winbindd to become aware that the client has quit is to try to read from or write to the client's FD and get errno=EPIPE, no? Or did I miss something?
Yes, once the client has quit, the read end of the pipe will become ready for reading in the poll() call and we'll wake up, read from it, get -1, EPIPE and then close the fd. Jeremy.
(In reply to comment #21)
> Yes, once the client has quit, the read end of the pipe will become ready for
> reading in the poll() call and we'll wake up, read from it, get -1, EPIPE and
> then close the fd.

So we don't have an easy way to even cancel the request when it's in the request queue (that is to say, the request has been read from the client and we are waiting for it to be processed by the winbindd child), as we won't read from or write to the client's FD before the answer comes back from the child winbindd.
No, I think we will read() on the client fd from the master winbindd even if we're currently processing a request. If the client closes the fd, we should then wake up and close our own pipe to the winbindd child - however if that child is waiting on a request to/from the DC, then I don't think we artificially kill it. Maybe we should. Jeremy.
(In reply to comment #23)
> No, I think we will read() on the client fd from the master winbindd even if
> we're currently processing a request. If the client closes the fd,

The question is how to detect that the client has closed the fd, as we ignore SIGPIPE. Of course, we could periodically read from the queued request's fd, but I don't think that would be efficient or elegant.

> we should
> then wake up and close our own pipe to the winbindd child - however if that
> child is waiting on a request to/from the DC, then I don't think we
> artificially kill it. Maybe we should.

Well, I think the problem is the queued requests, which can number in the hundreds, not the one or two currently being processed.
(In reply to comment #22)
> (In reply to comment #21)
> > Yes, once the client has quit, the read end of the pipe will become ready
> > for reading in the poll() call and we'll wake up, read from it, get -1,
> > EPIPE and then close the fd.
>
> So we don't have an easy way to even cancel the request when it's in the
> request queue (that is to say, the request has been read from the client and
> we are waiting for it to be processed by the winbindd child), as we won't
> read from or write to the client's FD before the answer comes back from the
> child winbindd.

We could add that code easily. The tricky piece is to make sure that the tevent-based async engine handling those requests is properly cancellable everywhere.
(In reply to comment #23) > No, I think we will read() on the client fd from the master winbindd even if > we're currently processing a request. No, I do not think we do that. > If the client closes the fd, we should > then wake up and close our own pipe to the winbindd child - however if that > child is waiting on a request to/from the DC, then I don't think we > artificially kill it. Maybe we should. Unfortunately it's a bit more difficult than that. There is no 1:1 correlation between client and child requests anymore. getgrnam for example consists of potentially hundreds of child requests for large groups. Killing a child is not an option I think. The better approach would be to go through the parent's async engine and if a pending request is cancelled, make sure that we properly deal with the response from the child when it comes back even if the requestor has gone. Volker
I had a deeper look at this issue lately. What we see is that the parent winbindd has, for instance, 400 fds, and if we trust the output of ls -l /proc/<pid_of_parent>/fd there are very old fds (1, 2 or more hours old) that correspond to clients that are long gone. The child is quite busy (~90% CPU) but it still processes requests.

For some reason that I don't understand, it seems that we don't detect the closure of the socket. It might be that when the client closes the socket we don't pay attention to the closure, because at that moment the client's fd is no longer in the event loop of the main winbindd process, as we are waiting for the child process to process the request.
Once the client closes its end of the socket, the server end should become readable in the master winbindd - *even if we don't currently have an event attached to the socket*.

I actually just tested this by adding a custom test to tevent that did the following:

- Create a socketpair.
- Add a read event to one end, write 1 byte to the other.
- Loop waiting for an event -> got it.
- (In the handler) read the byte. Delete the read event. Close the write end of the socketpair. Add a read|write event to the read end of the socketpair.
- Loop waiting for an event -> got it.
- (In the handler) write the byte to the read end of the socketpair. Delete the read|write event. Add a new read event.
- Loop waiting for an event -> got it.

So even if there is no currently active event on the server side of the socketpair, as soon as we try to either write a reply or read another request, we will get notified and should error out.

I understand this is frustrating, but the event subsystem in Samba is really well tested and works as far as I can test and tell. We need to see a log catching the event where the client closes its end and the server doesn't wake up to be able to debug/fix this.

Jeremy.
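For reference, here is a plain poll() approximation of that test. My actual test was written against tevent; this stand-alone version only demonstrates the same property using nothing but POSIX calls, namely that a peer close is still reported even if nobody was watching the fd at the moment it happened.

#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/socket.h>

static void expect(int cond, const char *what)
{
        if (!cond) {
                fprintf(stderr, "unexpected failure: %s\n", what);
                exit(1);
        }
}

int main(void)
{
        int fds[2];
        char buf[1];
        struct pollfd pfd;

        expect(socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == 0, "socketpair");

        /* Write one byte; the other end becomes readable. */
        expect(write(fds[1], "x", 1) == 1, "write");
        pfd.fd = fds[0];
        pfd.events = POLLIN;
        expect(poll(&pfd, 1, 1000) == 1, "poll after write");
        expect(read(fds[0], buf, 1) == 1, "read the byte");

        /* Close the "client" end while nobody is polling it. */
        close(fds[1]);

        /* Only now start watching the "server" end again.  The earlier
         * close is still reported: poll() says readable and read()
         * returns 0 (EOF). */
        expect(poll(&pfd, 1, 1000) == 1, "poll after peer close");
        expect(read(fds[0], buf, 1) == 0, "read sees EOF");

        printf("peer close detected after the fact, as expected\n");
        close(fds[0]);
        return 0;
}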
I'm on it ... it takes time
Jeremy: do the fixes from bug #3204 help for this also?
(In reply to comment #30) > Jeremy: do the fixes from bug #3204 help for this also? Yes, that fix should also address this issue as it will reap connections after 60 seconds of inactivity.
great, then let's close it as dup ... *** This bug has been marked as a duplicate of bug 3204 ***