Bug 3068 - internal error in winbindd
Summary: internal error in winbindd
Status: RESOLVED FIXED
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: winbind
Version: 3.0.20a
Hardware: Sparc Solaris
Importance: P3 normal
Target Milestone: none
Assignee: Samba Bugzilla Account
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-09-08 03:01 UTC by Alexander Leidinger
Modified: 2005-10-03 10:33 UTC

Attachments
Possible fix (479 bytes, patch)
2005-10-03 03:23 UTC, Volker Lendecke

Description Alexander Leidinger 2005-09-08 03:01:50 UTC
Hi,

sometimes we get an internal error from a winbindd child. The parent is still 
running.

The versions of the dependencies are:
 - libiconv (1.8)
 - MIT kerberos v5 (1.4.2)
 - openssl (0.9.7g)
 - openldap (2.2.28)
 - samba (3.0.20)

Everything else (sasl for openldap) is the default version as it comes with 
Solaris 10.

Those explicit dependencies were linked statically (for various reasons we
didn't generate shared libs for them).

The global section of the smb.conf is:
---snip---
[global]
        show add printer wizard = no
        server string = schiller
        workgroup = publications
        encrypt passwords = yes
        load printers = no
        password server = <ip>
        add user script = /usr/sbin/useradd -g smbusers -s /bin/false %u
        local master = no
        dns proxy = no
        realm = publications.win
        log level = 2
        wins server = <ip, same as above>
        log file = /opt/OPsamba/var/log.%m
        security = ads
        preferred master = false
        netbios name = schiller
        domain master = false
        idmap uid = 30000 - 40000
---snip---

log.winbindd:
---snip---
[2005/09/08 11:21:41, 1] nsswitch/winbindd.c:main(935)
  winbindd version 3.0.20 started.
  Copyright The Samba Team 2000-2004
[2005/09/08 11:21:41, 2] param/loadparm.c:do_section(3559)
  Processing section "[alpha]"
[2005/09/08 11:21:41, 2] param/loadparm.c:do_section(3559)
  Processing section "[test0815]"
[2005/09/08 11:21:41, 2] param/loadparm.c:do_section(3559)
  Processing section "[truc]"
[2005/09/08 11:21:41, 2] param/loadparm.c:do_section(3559)
  Processing section "[test]"
[2005/09/08 11:21:41, 2] lib/interface.c:add_interface(81)
  added interface ip=<own ip> bcast=<bcast> nmask=<mask>
[2005/09/08 11:21:41, 2] lib/interface.c:add_interface(81)
  added interface ip=<own ip> bcast=<bcast> nmask=<mask>
[2005/09/08 11:21:41, 0] nsswitch/winbindd_util.c:winbindd_param_init(766)
  winbindd: idmap uid range missing or invalid
[2005/09/08 11:21:41, 0] nsswitch/winbindd_util.c:winbindd_param_init(767)
  winbindd: cannot continue, exiting.
[2005/09/08 11:21:41, 1] nsswitch/winbindd.c:main(968)
  Could not init idmap -- netlogon proxy only
[2005/09/08 11:21:41, 2] lib/tallocmsg.c:register_msg_pool_usage(56)
  Registered MSG_REQ_POOL_USAGE
[2005/09/08 11:21:41, 2] lib/dmallocmsg.c:register_dmalloc_msgs(71)
  Registered MSG_REQ_DMALLOC_MARK and LOG_CHANGED
[2005/09/08 11:21:41, 2] nsswitch/winbindd_util.c:add_trusted_domain(166)
  Added domain PUBLICATIONS PUBLICATIONS.WIN S-1-5-21-117609710-1229272821-839522115
[2005/09/08 11:21:41, 2] nsswitch/winbindd_util.c:add_trusted_domain(166)
  Added domain BUILTIN  S-1-5-32
[2005/09/08 11:21:41, 2] nsswitch/winbindd_util.c:add_trusted_domain(166)
  Added domain SCHILLER  S-1-5-21-308816121-94223975-3382285697
---snip---
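
The log above reports "idmap uid range missing or invalid" even though smb.conf sets "idmap uid = 30000 - 40000". One possible explanation, offered as an assumption rather than something this report confirms, is that the range is parsed with a pattern that does not tolerate spaces around the hyphen. The sketch below only demonstrates how standard sscanf treats such input; parse_range is a hypothetical name and is not Samba's parser.

```c
#include <stdio.h>

/* Hypothetical range parser (not Samba's code): returns 1 if s has the
 * form "low-high" with low <= high, 0 otherwise. A literal '-' in a
 * scanf format does not skip whitespace, so "30000 - 40000" stops
 * matching after the first number. */
static int parse_range(const char *s, unsigned int *low, unsigned int *high)
{
	if (sscanf(s, "%u-%u", low, high) != 2)
		return 0;
	return *low <= *high;
}
```

Under that assumption, writing the range as 30000-40000 would parse, while the spaced form would not.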

log.wb-PUBLICATIONS contains:
---snip---
[2005/09/08 11:21:41, 0] lib/fault.c:fault_report(37)
  INTERNAL ERROR: Signal 11 in pid 9260 (3.0.20)
  Please read the appendix Bugs of the Samba HOWTO collection
[2005/09/08 11:21:41, 0] lib/fault.c:fault_report(39)
  ===============================================================
[2005/09/08 11:21:41, 0] lib/util.c:smb_panic2(1548)
  PANIC: internal error
[2005/09/08 11:21:42, 2] libsmb/cliconnect.c:cli_session_setup_kerberos(532)
  Doing kerberos session setup
[2005/09/08 11:21:42, 0] lib/fault.c:fault_report(36)
  ===============================================================
[2005/09/08 11:21:42, 0] lib/fault.c:fault_report(37)
  INTERNAL ERROR: Signal 11 in pid 9262 (3.0.20)
  Please read the appendix Bugs of the Samba HOWTO collection
[2005/09/08 11:21:42, 0] lib/fault.c:fault_report(39)
  ===============================================================
[2005/09/08 11:21:42, 0] lib/util.c:smb_panic2(1548)
  PANIC: internal error
---snip---

The gdb backtrace is:
---snip---
(gdb) bt
#0  0xff13d5ec in setitimer () from /lib/libc.so.1
#1  0xff0dd88c in putspent () from /lib/libc.so.1
#2  0xff0bde40 in abort () from /lib/libc.so.1
#3  0x000cac9c in smb_panic2 (why=0x3c1980 "internal error", 
    decrement_pid_count=4506464) at lib/util.c:1614
#4  0x000caae4 in smb_panic (why=0x3c1980 "internal error") at lib/util.c:1500
#5  0x000b7314 in fault_report (sig=11) at lib/fault.c:41
#6  0x000b7378 in sig_fault (sig=11) at lib/fault.c:64
#7  0xff13c534 in __csigsetjmp () from /lib/libc.so.1
#8  0xff1319a0 in call_user_handler () from /lib/libc.so.1
#9  0xff1135ec in _ndoprnt () from /lib/libc.so.1
#10 0xff115c6c in vfprintf () from /lib/libc.so.1
#11 0x000d1010 in talloc_vasprintf (t=0x44c760, fmt=0x3b3688 "%s\\%s\\%s", 
    ap=0xffbfe884) at lib/talloc.c:953
#12 0x000d1078 in talloc_asprintf (t=0x44c760, fmt=0x3b3688 "%s\\%s\\%s")
    at lib/talloc.c:976
#13 0x0007266c in winbindd_dual_list_trusted_domains (domain=0x44dd18, 
    state=0xffbfe990) at nsswitch/winbindd_misc.c:133
#14 0x0007d68c in child_process_request (domain=0x44dd18, state=0xffbfe990)
    at nsswitch/winbindd_dual.c:361
#15 0x0007dbd8 in fork_domain_child (child=0x44e104)
    at nsswitch/winbindd_dual.c:490
#16 0x0007d2b4 in schedule_async_request (child=0x44e104)
    at nsswitch/winbindd_dual.c:198
#17 0x0007d878 in winbind_child_died (pid=9146) at nsswitch/winbindd_dual.c:416
#18 0x00061e08 in process_loop () at nsswitch/winbindd.c:860
#19 0x00062424 in main (argc=4387840, argv=0x42f400)
    at nsswitch/winbindd.c:1032
(gdb) up 11
#11 0x000d1010 in talloc_vasprintf (t=0x44c760, fmt=0x3b3688 "%s\\%s\\%s", 
    ap=0xffbfe894) at lib/talloc.c:953
953		len = vsnprintf(NULL, 0, fmt, ap2);
(gdb) print fmt
$1 = 0x3b3688 "%s\\%s\\%s"
(gdb) print ap2
No symbol "ap2" in current context.
(gdb) list
948		char *ret;
949		va_list ap2;
950		
951		VA_COPY(ap2, ap);
952	
953		len = vsnprintf(NULL, 0, fmt, ap2);
954	
955		ret = _talloc(t, len+1);
956		if (ret) {
957			VA_COPY(ap2, ap);
(gdb) print ap
$2 = 0xffbfe894
(gdb) print *ap
Attempt to dereference a generic pointer.
(gdb) print *(char *)ap
$3 = 0 '\0'
(gdb) up 1
#12 0x000d1078 in talloc_asprintf (t=0x44c760, fmt=0x3b3688 "%s\\%s\\%s")
    at lib/talloc.c:976
976		ret = talloc_vasprintf(t, fmt, ap);
(gdb) list
971	{
972		va_list ap;
973		char *ret;
974	
975		va_start(ap, fmt);
976		ret = talloc_vasprintf(t, fmt, ap);
977		va_end(ap);
978		return ret;
979	}
980	
(gdb) print t
$4 = (const void *) 0x44c760
(gdb) print fmt
$5 = 0x3b3688 "%s\\%s\\%s"
(gdb) up 1  
#13 0x0007266c in winbindd_dual_list_trusted_domains (domain=0x44dd08, 
    state=0xffbfe9a0) at nsswitch/winbindd_misc.c:133
133		extra_data = talloc_asprintf(state->mem_ctx, "%s\\%s\\%s",
(gdb) list
128							  &alt_names, &sids);
129	
130		extra_data = talloc_strdup(state->mem_ctx, "");
131	
132		if (num_domains > 0)
133		extra_data = talloc_asprintf(state->mem_ctx, "%s\\%s\\%s",
134						     names[0], alt_names[0],
135						     sid_string_static(&sids[0]));
136	
137		for (i=1; i<num_domains; i++)
(gdb) print state->mem_ctx
$6 = (TALLOC_CTX *) 0x44c760
(gdb) print names[0]
$7 = 0x44c3a0 "OPOCE.DOM"
(gdb) print alt_names[0]
$8 = 0x0
(gdb) print sids[0]
$9 = {sid_rev_num = 1 '\001', num_auths = 4 '\004', 
  id_auth = "\0\0\0\0\0\005", sub_auths = {21, 1692678069, 581711446, 
    178676651, 0 <repeats 11 times>}}
(gdb) print &sids[0]
$10 = (DOM_SID *) 0x44c270
---snip---
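
The session above shows alt_names[0] as a NULL pointer (0x0) being handed to a %s conversion in talloc_asprintf. Passing NULL for %s is undefined behaviour in C; some C libraries print "(null)", while the backtrace here shows the fault being raised inside _ndoprnt in Solaris libc. A minimal sketch of one way to guard against this follows. It is not the attached patch, whose contents are not shown in this report; safe_str and format_domain_entry are hypothetical names, and snprintf stands in for talloc_asprintf.

```c
#include <stdio.h>

/* Hypothetical helper: substitute an empty string for a NULL pointer so
 * that a %s conversion never receives NULL. */
static const char *safe_str(const char *s)
{
	return s ? s : "";
}

/* Hypothetical stand-in for the talloc_asprintf() call at
 * winbindd_misc.c:133; formats "name\alt_name\sid" into buf and
 * returns the snprintf() result. */
static int format_domain_entry(char *buf, size_t len, const char *name,
			       const char *alt_name, const char *sid)
{
	return snprintf(buf, len, "%s\\%s\\%s",
			safe_str(name), safe_str(alt_name), safe_str(sid));
}
```

With the values from the session (names[0] = "OPOCE.DOM", alt_names[0] = NULL), this would produce an entry with an empty middle field instead of dereferencing NULL.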

In case you want more debugging output or more information, just ask.

Bye,
Alexander.
Comment 1 Alexander Leidinger 2005-09-15 07:33:02 UTC
Hi,

I've recompiled everything with dynamic libs and without kerberos support in
openssl (samba is still compiled with MIT-kerberos). It still dumps core.

Bye,
Alexander.
Comment 2 Gerald (Jerry) Carter (dead mail address) 2005-09-27 12:14:25 UTC
Alexander.  Please retest the current SAMBA_3_0_RELEASE tree
(svn co svn://svnanon.samba.org/samba/branches/SAMBA_3_0_RELEASE samba-3.0.20a)
This should be fixed now.
Comment 3 Alexander Leidinger 2005-10-03 01:54:09 UTC
Hi,

I've tested with 3.0.20a, but unfortunately the bug isn't fixed.

Here's the part which differs from the backtrace with 3.0.20:
---snip---
#13 0x0004adc4 in winbindd_dual_list_trusted_domains (domain=0x1e2b60,
    state=0xffbfe7f8) at nsswitch/winbindd_misc.c:133
#14 0x00055e38 in child_process_request (domain=0x1e2b60, state=0xffbfe7f8)
    at nsswitch/winbindd_dual.c:353
#15 0x00056398 in fork_domain_child (child=0x1e2f4c)
    at nsswitch/winbindd_dual.c:483
#16 0x00055a60 in schedule_async_request (child=0x1e2f4c)
    at nsswitch/winbindd_dual.c:197
#17 0x00056024 in winbind_child_died (pid=15651)
    at nsswitch/winbindd_dual.c:408
#18 0x0003a504 in process_loop () at nsswitch/winbindd.c:860
#19 0x0003ab20 in main (argc=1867776, argv=0x1c8000)
    at nsswitch/winbindd.c:1032
---snip---

In case you need more debugging output, just ask.

Bye,
Alexander.
Comment 4 Volker Lendecke 2005-10-03 03:23:39 UTC
Created attachment 1467
Possible fix

Could you try the attached patch?

Thanks,

Volker
Comment 5 Alexander Leidinger 2005-10-03 04:41:30 UTC
Hi,

it seems to fix the problem. No immediate core dump and no core dump every 5
minutes so far (it has been running for ~1 hour now). Additionally, there's a new
trusted domain showing up in the logfile.

Thanks,
Alexander.
Comment 6 Jeremy Allison 2005-10-03 10:33:48 UTC
Applied patch to SVN.
Thanks,
Jeremy.