winbindd must not flush the cache for uninitialized domains. Doing so causes a lot of unnecessary traffic, as each domain tries to contact the primary domain to initialize itself.
Created attachment 5454 [details] v3-4-test
Created attachment 5455 [details] v3-5-test
Two questions: Why does winbind get SIGHUP so often that it is a problem? Second: Can we change the cache flush so that we only do one traverse instead of one per domain?
(In reply to comment #3)
> Two questions: Why does winbind get SIGHUP so often that it is a problem?
Heavy traffic between the primary domain and winbindd might cause an unresponsive domain controller. It appeared when there were dozens of domains: init_dc_connection() connects to the primary domain for each uninitialized domain to get trusted domain information.
> Second: Can we change the cache flush so that we only do one traverse instead
> of one per domain?
I am not sure about this. Maybe it is doable: delete all "UL/" and "GL/" records in one traverse for each process, then return when one domain successfully flushes the cache.
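For illustration, the behaviour requested in this bug can be sketched as below. This is only a minimal, self-contained sketch and not the attached patch: struct domain_entry, flush_domain_cache() and flush_initialized_domains() are hypothetical stand-ins, not winbindd's real types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for winbindd's per-domain state; it carries
 * only the fields this sketch needs. */
struct domain_entry {
	const char *name;
	bool initialized;   /* true once the domain has been set up */
	int flush_count;    /* how often the flush ran for this domain */
};

/* Hypothetical placeholder for the real per-domain cache flush. */
static void flush_domain_cache(struct domain_entry *d)
{
	d->flush_count++;
}

/* Flush only domains that are already initialized.
 * Returns the number of domains that were skipped. */
static size_t flush_initialized_domains(struct domain_entry *domains, size_t n)
{
	size_t skipped = 0;

	assert(domains != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!domains[i].initialized) {
			/* Touching this domain would force its initialization,
			 * i.e. one more connection to the primary domain. */
			skipped++;
			continue;
		}
		flush_domain_cache(&domains[i]);
	}
	return skipped;
}
```

With one initialized primary domain and dozens of uninitialized trusted domains, a SIGHUP would then flush once and skip the rest instead of triggering one initialization per domain.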
(In reply to comment #4)
> heavy traffic between primary domain and winbindd might cause unresponsive
> domain controller. It appeared when there were dozens of domains,
> init_dc_connection() connects to primary domain for each uninitialized domain
> to get trusted domain information.
Sure, but why is this more than a one-time thing? How can I reproduce this?
> I am not sure about this. Maybe it is doable. Delete all "UL/"/"GL/" records in
> one traverse for each process. Then return when one domain successfully flushes
> the cache.
I'm asking because usually there are FAR more entries in winbindd_cache.tdb than in the domain list. We might prepare an array of domains to be referenced in the traverse function so that it needs to do its work only once. I haven't looked at it more closely; it just struck me that we traverse a potentially very large database more than once.
Volker
Jim, you did ack the patches. Do you have any opinion on my questions? Thanks, Volker
>> heavy traffic between primary domain and winbindd might cause unresponsive
>> domain controller. It appeared when there were dozens of domains,
>> init_dc_connection() connects to primary domain for each uninitialized domain
>> to get trusted domain information.
> Sure, but why is this more than a one-time thing? How can I reproduce this?
It is a one-time thing, but it happens on a large scale, on every DHCP client renewal. It's our scripts, but it happens every time a laptop goes home and comes back in, changes networks, etc. That is not uncommon at a large customer.
I'll answer the other question later today, meetings atm...
(In reply to comment #5)
> (In reply to comment #4)
> > heavy traffic between primary domain and winbindd might cause unresponsive
> > domain controller. It appeared when there were dozens of domains,
> > init_dc_connection() connects to primary domain for each uninitialized domain
> > to get trusted domain information.
> Sure, but why is this more than a one-time thing? How can I reproduce this?
As Jim explained. :-)
> > I am not sure about this. Maybe it is doable. Delete all "UL/"/"GL/" records in
> > one traverse for each process. Then return when one domain successfully flushes
> > the cache.
> I'm asking because usually there are FAR more entries in winbindd_cache.tdb
> than in the domain list. We might prepare an array of domains to be referenced
> in the traverse function so that it needs to do its work only once. Haven't
> looked at it more closely, it just struck me that we traverse a potentially
> very large database more than once.
I'll look at it. It is feasible.
> Volker
> I'm asking because usually there are FAR more entries in winbindd_cache.tdb
> than in the domain list. We might prepare an array of domains to be referenced
> in the traverse function so that it needs to do its work only once. Haven't
> looked at it more closely, it just struck me that we traverse a potentially
> very large database more than once.
Actually, on a closer look, I think we can just do it once anyway. The traverse fn currently looks only for the "UL/" and "GL/" key prefixes. Nothing in it is specific to any domain...
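To make the single-traverse idea concrete, here is a rough sketch. It assumes the flush only needs to match the "UL/" and "GL/" key prefixes, as described above. The array of keys is a hypothetical in-memory stand-in for winbindd_cache.tdb, and is_list_record() and flush_list_records() are made-up names, not functions from the winbindd source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the key is a user-list ("UL/") or group-list ("GL/") record.
 * No domain name is consulted, so one pass covers every domain. */
static bool is_list_record(const char *key)
{
	return strncmp(key, "UL/", 3) == 0 || strncmp(key, "GL/", 3) == 0;
}

/* Hypothetical in-memory stand-in for the cache database: compact the
 * key array in place, dropping list records in a single pass.
 * Returns the number of keys that remain. */
static size_t flush_list_records(const char **keys, size_t n)
{
	size_t kept = 0;

	assert(keys != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!is_list_record(keys[i])) {
			keys[kept++] = keys[i];
		}
	}
	return kept;
}
```

The cost is one pass over the database regardless of how many domains are configured, which is what matters when the cache holds far more entries than the domain list.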
When you want reviewed patches to be applied to the release branches, you should assign the corresponding bugs to Karolin. Otherwise she won't notice. But what is the status of this bug? There are two ACKed patches, which have also gone to master, but the discussion does not seem to have come to a final conclusion. Should we reassign to Karolin for inclusion? Or does this need more work?
Jim, I am reassigning the bug to you for a decision on what to do with the patches regarding 3.4 and 3.5. Thanks - Michael
As it seems only to be hit by our customers, I'm happy to leave it to the newer releases and not disrupt existing ones.