Bug 8387 - pick ldap handling patches to 3.6
pick ldap handling patches to 3.6
Status: NEW
Product: Samba 3.6
Classification: Unclassified
Component: Domain Control
3.6.0
All All
: P5 normal
: ---
Assigned To: Jeremy Allison
Samba QA Contact
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-08-19 08:34 UTC by Björn Jacke
Modified: 2011-09-21 18:43 UTC (History)
0 users

See Also:
jra: review-


Attachments
git-am fix for 3.6.1 (45.33 KB, patch)
2011-08-20 20:50 UTC, Jeremy Allison
bjacke: review+
jra: review? (metze)
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Björn Jacke 2011-08-19 08:34:13 UTC
the two ldap handling patches:

dcb5720ad01fabac300da478368c140e2b6313d2
(s3/ldap: don't continue if we couldn't get the domain info on startup)

fd06cf379ded801adf830499c24875a7c60280be
(s3/ldap: delay the ldap search alarm termination a bit)

should be picked to 3.6.
Comment 1 Jeremy Allison 2011-08-19 23:04:34 UTC
I'm ok with patch dcb5720ad01fabac300da478368c140e2b6313d2.

Patch fd06cf379ded801adf830499c24875a7c60280be doesn't make sense to me.

Inside smbldap_search_ext() we setup an alarm timeout for the local process using:

        int             attempts = 0;
        time_t          endtime = time_mono(NULL)+lp_ldap_timeout();

        got_alarm = 0;
        CatchSignal(SIGALRM, gotalarm_sig);
        alarm_timer = lp_ldap_timeout();
        if (alarm_timer > 0) {
                alarm_timer += 2;
        }
        alarm(alarm_timer);

Then do the LDAP call using:

        while (another_ldap_try(ldap_state, &rc, &attempts, endtime)) {
               .... do stuff....
        }

Then terminate the alarm using:

        /* Teardown timeout. */
        CatchSignal(SIGALRM, SIG_IGN);
        alarm(0);

However - look inside the function another_ldap_try(). Note that inside smbldap_search_ext() we *always* call this initially with attempts == 0. We have:

        time_t now = time_mono(NULL);

        if (*attempts == 0) {
                got_alarm = False;
                old_handler = CatchSignal(SIGALRM, gotalarm_sig);
                alarm(endtime - now);

                if (ldap_state->pid != sys_getpid())
                        smbldap_close(ldap_state);
        }

This means we immediately overwrite the alarm we carefully set of "alarm_timer += 2" (which is equivalent to lp_ldap_timeout() + 2) seconds with a new alarm of "endtime - now" seconds. endtime is calculated inside smbldap_search_ext() to be time_mono(NULL)+lp_ldap_timeout(); Calls to alarm() don't stack - the next call overwrites the previous alarm, so the only one that is ever used is 

So I don't see how the alarm of "lp_ldap_timeout() + 2" seconds in this patch is ever used - it's immediately set to "endtime - now" == "time_mono(NULL)+lp_ldap_timeout() -  time_mono(NULL)" == lp_ldap_timeout()

All other calls to :

        while (another_ldap_try(ldap_state, &rc, &attempts, endtime)) {
               .... do stuff....
        }

inside the file source3/lib/smbldap.c omit this setup call to alarm(alarm_timer), leaving it to the code inside another_ldap_try(). I'm going to change this code inside master, and add an updated patch for 3.6.1 to this bug report for Bjorn to evaluate.

Jeremy.
Comment 2 Jeremy Allison 2011-08-19 23:12:50 UTC
Also, calling another_ldap_try() has a side effect in that it leaves a dangling alarm pending, which isn't always canceled when an LDAP operation called inside an:

        while (another_ldap_try(ldap_state, &rc, &attempts, endtime)) {
               .... do stuff....
        }

loop succeeds. The function can exit with a alarm still active - all functions using the code:

        while (another_ldap_try(ldap_state, &rc, &attempts, endtime)) {
               .... do stuff....
        }

should terminate the alarm on exit. I'll fix this in master.

Jeremy.
Comment 3 Jeremy Allison 2011-08-20 20:50:17 UTC
Created attachment 6793 [details]
git-am fix for 3.6.1

Bjorn - here is a patchset containing the complete set of changes (yours and mine) that have gone into master, back-ported for 3.6.1. Please test against your LDAP setup and let me know if this works for you.

Thanks,

Jeremy.
Comment 4 Karolin Seeger 2011-08-23 18:26:24 UTC
Re-assigning to Björn.
Comment 5 Björn Jacke 2011-08-26 12:43:26 UTC
Comment on attachment 6793 [details]
git-am fix for 3.6.1

it works for me but it's hard to test corner and failure cases. The changes are so huge, that I'd propose to have another developer review it before getting this to 3.6. Volker, can you have another look at this?
Comment 6 Volker Lendecke 2011-08-26 12:50:51 UTC
Putting this on my long todo list. Those changes are really intrusive. I am not certain that I like that for a stable release stream. I might review it later when I find time.
Comment 7 Jeremy Allison 2011-08-26 16:12:57 UTC
The total effect looks intrusive, but if you look carefully it's broken into individual micro-commits that are easily understood.

They fix a long-standing problem we have had with leaving dangling alarm signals (which we *probably* ignore, but who can tell, right ? :-).

Jeremy.
Comment 8 Volker Lendecke 2011-08-30 19:27:51 UTC
Comment on attachment 6793 [details]
git-am fix for 3.6.1

Not giving an ack for this one. It contains far too many not strictly required refactorings for my taste. Also not blocking it, please find someone else to ack it.
Comment 9 Björn Jacke 2011-09-21 08:56:17 UTC
I though it would be a good idea to tripple check this, maybe this isn't needed. Jeremy I leave the decision up to you, whether or not we should wait for another review for this.
Comment 10 Jeremy Allison 2011-09-21 18:43:00 UTC
Comment on attachment 6793 [details]
git-am fix for 3.6.1

Metze - please review for 3.6.1.
Thanks !
Jeremy.