If winbindd loses its connection to its own domain (from within the parent winbindd) it doesn't reconnect and keeps getting EPIPE for all future calls. I have untested fixes here http://gitweb.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/master3-winbind-timeout
It would be nice if someone could test this, so that we can get it into 3.5.2. 3.4.8 also needs this and 3.3.11 is also affected.
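The failure mode described above (a dead connection that keeps returning EPIPE because nothing marks it stale) can be sketched as follows. This is only an illustration of the reconnect-and-retry idea, not the code from the linked branch: `struct conn`, `call_with_reconnect` and the `fake_*` helpers are hypothetical names made up for this sketch and do not exist in Samba.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical connection handle; not a Samba type. */
struct conn {
    bool connected;
    int (*do_connect)(struct conn *c);         /* 0 on success, -1 and errno on failure */
    int (*do_call)(struct conn *c, void *arg); /* 0 on success, -1 and errno on failure */
};

/*
 * Issue one call. If the transport reports EPIPE, mark the connection
 * as dead, reconnect, and retry exactly once.
 */
int call_with_reconnect(struct conn *c, void *arg)
{
    assert(c != NULL && c->do_connect != NULL && c->do_call != NULL);

    for (int attempt = 0; attempt < 2; attempt++) {
        if (!c->connected) {
            if (c->do_connect(c) != 0) {
                return -1;
            }
            c->connected = true;
        }
        if (c->do_call(c, arg) == 0) {
            return 0;
        }
        if (errno != EPIPE) {
            return -1; /* unrelated error: do not touch the connection */
        }
        c->connected = false; /* stale connection: force a reconnect on the next pass */
    }
    errno = EPIPE;
    return -1;
}

/* Stand-in transport for demonstration only. */
static bool fake_broken = false;
static int fake_connects = 0;

static int fake_connect(struct conn *c)
{
    (void)c;
    fake_broken = false;
    fake_connects++;
    return 0;
}

static int fake_call(struct conn *c, void *arg)
{
    (void)c;
    (void)arg;
    if (fake_broken) {
        errno = EPIPE;
        return -1;
    }
    return 0;
}
```

The point of the sketch is the `c->connected = false` line: without it, every later call reuses the dead connection and fails with EPIPE again, which matches the symptom reported here.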
Metze, can you attach the specific patchstream to this bug so I know exactly what changes I'm looking at? Thanks, Jeremy.
Created attachment 5543 [details] draft patch for master
Please don't push it to master yet... I think we need to change the 'return 0;' for the set_timeout cases to something like 'return RPCCLI_DEFAULT_TIMEOUT;'
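A rough sketch of the concern about `return 0;`, under the assumption (not stated in the comment) that a set_timeout function returns the previous timeout so that callers can restore it afterwards. `struct transport` and `transport_set_timeout` are hypothetical names, and the value given to `RPCCLI_DEFAULT_TIMEOUT` here is assumed for illustration only; the real definition lives in the Samba tree.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration; check the Samba headers for the real one. */
#ifndef RPCCLI_DEFAULT_TIMEOUT
#define RPCCLI_DEFAULT_TIMEOUT 10000 /* milliseconds */
#endif

/* Hypothetical transport; not the Samba structure. */
struct transport {
    unsigned int timeout_ms;
    int has_timeout; /* nonzero if this transport supports timeouts */
};

/* Set a new timeout and return the previous one so callers can restore it. */
unsigned int transport_set_timeout(struct transport *t, unsigned int new_ms)
{
    unsigned int old;

    assert(t != NULL);

    if (!t->has_timeout) {
        /*
         * Returning 0 here would hand the caller a zero "previous" value,
         * which a later restore could install as the timeout. Returning
         * the default keeps the save/restore pattern harmless.
         */
        return RPCCLI_DEFAULT_TIMEOUT;
    }

    old = t->timeout_ms;
    t->timeout_ms = new_ms;
    return old;
}
```

A caller following the save/restore pattern would write `old = transport_set_timeout(t, 30000);`, do its work, and then call `transport_set_timeout(t, old);`.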
Created attachment 5559 [details] Patch for v3-5
I'll work on a patch for v3-4 in the next few days.
Created attachment 5560 [details] A standalone tickleack command
Created attachment 5561 [details] A killtcp command based on the tickleack command and iptables
I used this to find the connection belonging to winbindd:
#> (ps axf ; netstat -atpn )|grep winbindd
26081 pts/13 S+ 0:00 | \_ grep winbindd
18275 ? Ss 0:00 bin/winbindd -D
18277 ? S 0:00 \_ bin/winbindd -D
tcp 0 0 172.31.9.1:47058 172.31.9.218:389 ESTABLISHED 18277/winbindd
tcp 0 0 172.31.9.1:45793 172.31.9.218:445 ESTABLISHED 18277/winbindd
#> ./killtcp 172.31.9.1:45793 172.31.9.218:445
iptables -I OUTPUT -p tcp -s 172.31.9.1 --sport 45793 -d 172.31.9.218 --dport 445 -j REJECT --reject-with tcp-reset
./tickleack 172.31.9.218:445 172.31.9.1:45793
iptables -D OUTPUT -p tcp -s 172.31.9.1 --sport 45793 -d 172.31.9.218 --dport 445 -j REJECT --reject-with tcp-reset
./tickleack 172.31.9.1:45793 172.31.9.218:445
#> (ps axf ; netstat -atpn )|grep winbindd
26081 pts/13 S+ 0:00 | \_ grep winbindd
18275 ? Ss 0:00 bin/winbindd -D
18277 ? S 0:00 \_ bin/winbindd -D
tcp 0 0 172.31.9.1:47058 172.31.9.218:389 ESTABLISHED 18277/winbindd
Created attachment 5562 [details] new Patch for v3-5
This contains the old patches plus one additional fix from master (4c6cde99c0751a073120d8bc36d40922d8027344).
Comment on attachment 5562 [details] new Patch for v3-5
I tested this code by suspending my Windows DC VM and confirming that winbindd reconnects correctly.
Jeremy, please reassign to Karolin when you're happy with it for 3.5.2
Just tested Metze's updated patch. Works great. Re-assigning to Karolin for inclusion in 3.5.2. Jeremy.
(In reply to comment #12)
> Just tested Metze's updated patch. Works great. Re-assigning to Karolin for
> inclusion in 3.5.2.
> Jeremy.

Pushed to v3-5-test. Re-assigning to metze.
Comment on attachment 5562 [details] new Patch for v3-5 Make it more explicit that Jeremy was happy with this patch
Created attachment 5598 [details] Patch for v3-4
For v3-4 this depends on the patches from bug 7159.
Created attachment 5607 [details] New Patch for v3-4 (lets an intermediate commit compile)
Comment on attachment 5607 [details] New Patch for v3-4 (lets an intermediate commit compile) Sorry, that was a new patch for v3-4
Comment on attachment 5607 [details] New Patch for v3-4 (lets an intermediate commit compile)
Looks good to me, although I haven't had time to test it. It matches the 3.5.x changes.
Re-assigning to Karolin for inclusion in next 3.4.x. Jeremy.
Pushed to v3-4-test. Closing out bug report. Thanks!
*** Bug 8441 has been marked as a duplicate of this bug. ***