Bug 7295 - winbindd doesn't reconnect to its own domain in the parent
Summary: winbindd doesn't reconnect to its own domain in the parent
Status: RESOLVED FIXED
Alias: None
Product: Samba 3.5
Classification: Unclassified
Component: Winbind
Version: unspecified
Hardware: Other Linux
Importance: P3 critical
Target Milestone: ---
Assignee: Karolin Seeger
QA Contact: Samba QA Contact
URL:
Keywords:
Duplicates: 8441
Depends on: 7159
Blocks: 7316
Reported: 2010-03-25 15:14 UTC by Stefan Metzmacher
Modified: 2011-09-07 11:09 UTC

See Also:


Attachments
draft patch for master (32.52 KB, patch)
2010-03-25 15:57 UTC, Stefan Metzmacher
no flags
Patch for v3-5 (46.48 KB, patch)
2010-03-29 13:19 UTC, Stefan Metzmacher
no flags
A standalone tickleack command (8.66 KB, text/x-csrc)
2010-03-29 13:39 UTC, Stefan Metzmacher
no flags
A killtcp command based on the tickleack command and iptables (809 bytes, application/octet-stream)
2010-03-29 13:40 UTC, Stefan Metzmacher
no flags
new Patch for v3-5 (47.79 KB, patch)
2010-03-29 15:40 UTC, Stefan Metzmacher
metze: review+
Patch for v3-4 (52.04 KB, patch)
2010-04-06 09:02 UTC, Stefan Metzmacher
no flags
New Patch for v3-4 (lets an intermediate commit compile) (51.62 KB, patch)
2010-04-07 07:21 UTC, Stefan Metzmacher
jra: review+

Description Stefan Metzmacher 2010-03-25 15:14:42 UTC
If winbindd loses its connection to its own domain (from within the parent winbindd) it doesn't reconnect and keeps getting EPIPE for all future
calls.

I have untested fixes here
http://gitweb.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/master3-winbind-timeout
Comment 1 Stefan Metzmacher 2010-03-25 15:21:29 UTC
It would be nice if someone could test this, so that we can get it into 3.5.2.

3.4.8 also needs this and 3.3.11 is also affected.
Comment 2 Jeremy Allison 2010-03-25 15:50:57 UTC
Metze, can you attach the specific patchstream to this bug so I know exactly what changes I'm looking at?

Thanks,

Jeremy.
Comment 3 Stefan Metzmacher 2010-03-25 15:57:59 UTC
Created attachment 5543 [details]
draft patch for master
Comment 4 Stefan Metzmacher 2010-03-25 16:00:55 UTC
Please don't push it to master yet...

I think we need to change the 'return 0;' for the
set_timeout cases to something like 'return RPCCLI_DEFAULT_TIMEOUT;'
Comment 5 Stefan Metzmacher 2010-03-29 13:19:44 UTC
Created attachment 5559 [details]
Patch for v3-5

I'll work on a patch for v3-4 in the next few days.
Comment 6 Stefan Metzmacher 2010-03-29 13:39:55 UTC
Created attachment 5560 [details]
A standalone tickleack command
Comment 7 Stefan Metzmacher 2010-03-29 13:40:46 UTC
Created attachment 5561 [details]
A killtcp command based on the tickleack command and iptables
Comment 8 Stefan Metzmacher 2010-03-29 13:46:30 UTC
I used this to find the connection belonging to winbindd:

#> (ps axf ; netstat -atpn )|grep winbindd
26081 pts/13   S+     0:00  |           \_ grep winbindd
18275 ?        Ss     0:00 bin/winbindd -D
18277 ?        S      0:00  \_ bin/winbindd -D
tcp   0     0 172.31.9.1:47058   172.31.9.218:389  ESTABLISHED 18277/winbindd  
tcp   0     0 172.31.9.1:45793   172.31.9.218:445  ESTABLISHED 18277/winbindd

#> ./killtcp 172.31.9.1:45793 172.31.9.218:445
iptables -I OUTPUT -p tcp -s 172.31.9.1 --sport 45793 -d 172.31.9.218 --dport 445 -j REJECT --reject-with tcp-reset
./tickleack 172.31.9.218:445 172.31.9.1:45793
iptables -D OUTPUT -p tcp -s 172.31.9.1 --sport 45793 -d 172.31.9.218 --dport 445 -j REJECT --reject-with tcp-reset
./tickleack 172.31.9.1:45793 172.31.9.218:445

#> (ps axf ; netstat -atpn )|grep winbindd
26081 pts/13   S+     0:00  |           \_ grep winbindd
18275 ?        Ss     0:00 bin/winbindd -D
18277 ?        S      0:00  \_ bin/winbindd -D
tcp   0     0 172.31.9.1:47058   172.31.9.218:389  ESTABLISHED 18277/winbindd  
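
For readers without the attachments, the four commands echoed in the transcript above can be wrapped in a small shell helper. This is only a sketch reconstructed from that output, not the attached killtcp script, whose contents are not shown in this thread. It assumes the tickleack binary from the attachment sits in the current directory, that addresses are IPv4 in IP:PORT form, and that the tickleack argument order simply mirrors the transcript. The function names and the DRY_RUN variable are hypothetical.

```sh
#!/bin/sh
# Hypothetical reconstruction of a killtcp-style helper, based only on the
# commands echoed in the transcript; not the attached script itself.

# Print a command, then execute it unless DRY_RUN=1 (DRY_RUN is an assumption).
run() {
    echo "$*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Usage: killtcp LOCAL_IP:PORT REMOTE_IP:PORT   (IPv4 only)
killtcp() {
    if [ "$#" -ne 2 ]; then
        echo "usage: killtcp LOCAL_IP:PORT REMOTE_IP:PORT" >&2
        return 1
    fi
    kt_lip=${1%:*}; kt_lport=${1##*:}
    kt_rip=${2%:*}; kt_rport=${2##*:}

    # Reject outgoing packets of this one connection with a TCP reset.
    run iptables -I OUTPUT -p tcp -s "$kt_lip" --sport "$kt_lport" \
        -d "$kt_rip" --dport "$kt_rport" -j REJECT --reject-with tcp-reset || return 1

    # Same argument order as in the transcript: remote endpoint first.
    run ./tickleack "$2" "$1"
    kt_rc=$?

    # Always remove the rule again, even if tickleack failed.
    run iptables -D OUTPUT -p tcp -s "$kt_lip" --sport "$kt_lport" \
        -d "$kt_rip" --dport "$kt_rport" -j REJECT --reject-with tcp-reset || kt_rc=$?

    # Second tickleack call with the endpoints swapped, as in the transcript.
    run ./tickleack "$1" "$2" || kt_rc=$?
    return "$kt_rc"
}
```

After sourcing the file, `DRY_RUN=1 killtcp 172.31.9.1:45793 172.31.9.218:445` only prints the commands; without DRY_RUN it needs root for iptables. Deleting the rule unconditionally keeps a failed tickleack run from leaving the REJECT rule in place.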

Comment 9 Stefan Metzmacher 2010-03-29 15:40:45 UTC
Created attachment 5562 [details]
new Patch for v3-5

This contains the old patches plus one additional fix from master.
(4c6cde99c0751a073120d8bc36d40922d8027344)
Comment 10 Jeremy Allison 2010-03-29 15:43:07 UTC
Comment on attachment 5562 [details]
new Patch for v3-5

Tested this code by suspending my Windows DC VM - and ensuring winbindd reconnects correctly.
Comment 11 Stefan Metzmacher 2010-03-29 16:08:11 UTC
Jeremy, please reassign to Karolin when you're happy with it for 3.5.2
Comment 12 Jeremy Allison 2010-03-29 16:13:29 UTC
Just tested Metze's updated patch. Works great. Re-assigning to Karolin for inclusion in 3.5.2.
Jeremy.
Comment 13 Karolin Seeger 2010-03-30 03:28:13 UTC
(In reply to comment #12)
> Just tested Metze's updated patch. Works great. Re-assigning to Karolin for
> inclusion in 3.5.2.
> Jeremy.
> 

Pushed to v3-5-test.
Re-assigning to metze.
Comment 14 Stefan Metzmacher 2010-04-06 08:54:29 UTC
Comment on attachment 5562 [details]
new Patch for v3-5

Make it more explicit that Jeremy was happy with this patch
Comment 15 Stefan Metzmacher 2010-04-06 09:02:26 UTC
Created attachment 5598 [details]
Patch for v3-4
Comment 16 Stefan Metzmacher 2010-04-06 09:12:29 UTC
For v3-4 this depends on the patches from bug 7159.
Comment 17 Stefan Metzmacher 2010-04-07 07:21:25 UTC
Created attachment 5607 [details]
New Patch for v3-4 (lets an intermediate commit compile)
Comment 18 Stefan Metzmacher 2010-04-07 07:22:15 UTC
Comment on attachment 5607 [details]
New Patch for v3-4 (lets an intermediate commit compile)

Sorry, that was a new patch for v3-4
Comment 19 Jeremy Allison 2010-04-12 13:18:38 UTC
Comment on attachment 5607 [details]
New Patch for v3-4 (lets an intermediate commit compile)

Looks good to me, although I haven't had time to test. It matches the 3.5.x changes.
Comment 20 Jeremy Allison 2010-04-12 13:19:11 UTC
Re-assigning to Karolin for inclusion in next 3.4.x.
Jeremy.
Comment 21 Karolin Seeger 2010-04-13 13:22:02 UTC
Pushed to v3-4-test.
Closing out bug report.

Thanks!
Comment 22 Stefan Metzmacher 2011-09-07 11:09:56 UTC
*** Bug 8441 has been marked as a duplicate of this bug. ***