Since we upgraded our PDC from 2.2.x to 3.0.x, we have been having problems where Samba seems to go to sleep for up to 3 minutes at a time. Because this is the PDC, this has been causing us major headaches, as it prevents any server on the domain from authenticating users.

After a bit of process tracing and a lot of educated guesswork, I tracked the sticking point down to the accept() call in the main loop of smbd/server.c:open_sockets_smbd(). The select() returns, indicating that a client is waiting to connect, but by the time smbd gets to accept() the client has gone away. From my dim and distant memory of Linux socket behaviour, accept() returns -1 immediately in this case. On IRIX, the call blocks until interrupted by a signal (usually SIGCLD).

The solution is to set the listening socket as non-blocking, so that accept() will return in these cases. That worked well, until I found that IRIX also exhibits slightly dubious behaviour when creating child descriptors from accept(): they inherit the (non)blocking flag from the parent socket, causing untold hell when the clients actually try transferring data. (I know Linux sockets do not do this, and I am not at all convinced it is correct behaviour.) Solution: set the child socket back to blocking, thus restoring normal behaviour from then on.

Patch against 3.0.2a to follow - as you can see, it is pretty trivial. I have not wrapped the changes in any OS-dependent defines, since as I see it:
- the listener socket should always be set as non-blocking, just in case;
- there is no harm in ensuring the child socket is set to block (if it's already that way, we've wasted a couple of syscalls in the socket setup, but otherwise it's a no-op).

Looking back at the source for earlier 2.2.x versions, I can see that nothing significant has changed here, so it must actually be something on our network causing this new problem, rather than the Samba upgrade itself - i.e. the problem has existed all along, but hasn't caused anyone a problem yet. My guess would be that the culprit is the NetApp filer which was the initial reason for upgrading the PDC to Samba 3 (something to do with Unicode support which the NetApp CIFS implementation requires).
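
For anyone reading without the attachment, here is a rough sketch of the approach described above. It is not the attached patch and does not use Samba's own helpers; the function names set_fd_blocking() and accept_blocking_client() are made up for illustration, and it assumes a plain POSIX environment with fcntl() and O_NONBLOCK.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not a Samba function): turn O_NONBLOCK off
 * (blocking != 0) or on (blocking == 0) for fd.
 * Returns 0 on success, -1 on failure with errno set by fcntl(). */
static int set_fd_blocking(int fd, int blocking)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    if (blocking)
        flags &= ~O_NONBLOCK;
    else
        flags |= O_NONBLOCK;
    return fcntl(fd, F_SETFL, flags);
}

/* Hypothetical helper: call this after select() reports listen_fd as
 * readable. listen_fd is assumed to have been made non-blocking already,
 * so if the client has gone away accept() fails straight away instead of
 * hanging. Returns the client descriptor, forced back into blocking mode,
 * or -1 with errno set. */
static int accept_blocking_client(int listen_fd)
{
    int client_fd = accept(listen_fd, NULL, NULL);
    if (client_fd == -1)
        return -1; /* e.g. EAGAIN/EWOULDBLOCK: caller returns to select() */

    /* Some systems let the accepted socket inherit O_NONBLOCK from the
     * listener, so clear it explicitly. Where it is already clear this
     * only costs two fcntl() calls. */
    if (set_fd_blocking(client_fd, 1) == -1) {
        int saved_errno = errno;
        close(client_fd);
        errno = saved_errno;
        return -1;
    }
    return client_fd;
}
```

In the listener setup this would be used as set_fd_blocking(listen_fd, 0) once after listen(), and then accept_blocking_client(listen_fd) in place of the bare accept() inside the select() loop. A return of -1 with errno set to EAGAIN or EWOULDBLOCK just means there is nothing left to accept.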
Created attachment 453: Patch to fix blocking accept() call on IRIX
added me as cc
Applied - good patch! Thanks, Jeremy.
Sorry for the spam, cleaning up the database to prevent unnecessary reopens of bugs.