Bug 4491 - winbind PANIC for ALLOCATE_GID request on 3.0.25pre2
Summary: winbind PANIC for ALLOCATE_GID request on 3.0.25pre2
Status: RESOLVED FIXED
Alias: None
Product: Samba 3.0
Classification: Unclassified
Component: winbind
Version: 3.0.25
Hardware: IA64 Windows XP
Importance: P1 regression
Target Milestone: 3.0.25
Assignee: Gerald (Jerry) Carter (dead mail address)
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-04-06 17:23 UTC by Ying Li
Modified: 2007-04-12 12:54 UTC (History)
Description Ying Li 2007-04-06 17:23:23 UTC
We built samba-3.0.25pre2 on HP-UX 11.23 IPF and set up Samba as a Windows 2000 ADS domain member. wbinfo -u/-g/-n work well. Unfortunately, when a Windows XP SP2 client tried to map a share, winbindd hit an internal PANIC error.

[2007/04/06 14:44:33, 10] nsswitch/winbindd.c:process_request(314)
  process_request: request fn ALLOCATE_GID
[2007/04/06 14:44:33, 3] nsswitch/winbindd_sid.c:winbindd_allocate_gid(488)
  winbindd_allocate_gid state=40112900
[2007/04/06 14:44:33, 3] nsswitch/winbindd_sid.c:winbindd_allocate_gid(497)
  winbindd_allocate_gid preparing sentto_child
[2007/04/06 14:44:33, 3] nsswitch/winbindd_dual.c:sendto_child(334)
  sentto_child begin
[2007/04/06 14:44:33, 0] lib/fault.c:fault_report(41)
  ===============================================================
[2007/04/06 14:44:33, 0] lib/fault.c:fault_report(42)
  INTERNAL ERROR: Signal 11 in pid 20272 (3.0.25pre2)
  Please read the Trouble-Shooting section of the Samba3-HOWTO
[2007/04/06 14:44:33, 0] lib/fault.c:fault_report(44)

  From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
[2007/04/06 14:44:33, 0] lib/fault.c:fault_report(45)
  ===============================================================
[2007/04/06 14:44:33, 0] lib/util.c:smb_panic(1619)
  PANIC (pid 20272): internal error
[2007/04/06 14:44:33, 0] lib/fault.c:dump_core(180)
  dumping core in /opt/samba/var/cores/winbindd

The crash appears to occur at lines 125-134 of nsswitch/winbindd_dual.c:
   125          state->mem_ctx = mem_ctx;
   126          state->child = child;
   127          state->request = request;
   128          state->response = response;
   129          state->continuation = continuation;
   130          state->private_data = private_data;
   132          DLIST_ADD_END(child->requests, state, struct winbindd_async_request *);
   133
   134          schedule_async_request(child);

Here is the backtrace.

Core was generated by `winbindd'.
Program terminated with signal 6, Aborted.
#0  0x60000000c0339410:0 in kill+0x30 () from /usr/lib/hpux32/libc.so.1
(gdb) bt
#0  0x60000000c0339410:0 in kill+0x30 () from /usr/lib/hpux32/libc.so.1
#1  0x60000000c0230430:0 in raise+0x30 () from /usr/lib/hpux32/libc.so.1
#2  0x60000000c02f2370:0 in abort+0x190 () from /usr/lib/hpux32/libc.so.1
warning:
ERROR: Use the "objectdir" command to specify the search
path for objectfile fault.o.
If NOT specified will behave as a non -g compiled binary.

#3  0x436d870:0 in dump_core+0x490 ()
warning:
ERROR: Use the "objectdir" command to specify the search
path for objectfile util.o.
If NOT specified will behave as a non -g compiled binary.

#4  0x43d0510:0 in smb_panic+0x820 ()
#5  0x436c630:0 in fault_report+0x940 ()
#6  0x436c7b0:0 in sig_fault+0x30 ()
#7  <signal handler called>
warning:
ERROR: Use the "objectdir" command to specify the search
path for objectfile winbindd_dual.o.
If NOT specified will behave as a non -g compiled binary.

#8  0x4222eb0:1 in async_request+0x1471 ()
#9  0x4227b30:0 in sendto_child+0x230 ()
warning:
ERROR: Use the "objectdir" command to specify the search
path for objectfile winbindd_sid.o.
If NOT specified will behave as a non -g compiled binary.

#10 0x41e7140:0 in winbindd_allocate_gid+0x5b0 ()
warning:
ERROR: Use the "objectdir" command to specify the search
path for objectfile winbindd.o.
If NOT specified will behave as a non -g compiled binary.

#11 0x4181da0:0 in process_request+0x760 ()
#12 0x4185b10:0 in request_recv+0x280 ()
#13 0x4185320:0 in request_main_recv+0x2c0 ()
#14 0x4182ec0:0 in rw_callback+0x940 ()
#15 0x4187720:0 in process_loop+0xb70 ()
#16 0x4189e30:0 in main+0x19d0 ()

smb.conf
[global]
        workgroup = hpcif44_dom
        server string = Samba Server hpcfs53
        security = ADS
        realm = HPCIF44_DOM.CUP.HP.COM
        password server = hpcif44
        log level = 10
        log file = /var/opt/samba/log.%m
        max log size = 1000
        host msdfs = Yes
        read only = No
        create mask = 0764
        force security mode = 0660
        short preserve case = No
        dos filetime resolution = Yes

        idmap domains = hpcif44_dom
        idmap config hpcif44_dom: default = yes
        idmap config hpcif44_dom: backend = tdb
        idmap config hpcif44_dom: range = 10000-600000

[homes]
        comment = Home Directories
        browseable = No

[tmp]
        path = /tmp

If needed, I can attach the full winbindd log file. If I have misconfigured something, let me know.

Thanks
Comment 1 Ying Li 2007-04-09 15:01:39 UTC
Today I found a workaround that avoids the core dump. The cause is that the idmap uid/gid range is not picked up from the "idmap config domain_name:range = low-high" setting when using the default tdb backend. We got this error message:
    idmap uid range missing or invalid
    idmap will be unable to map foreign SIDs
If I add idmap uid = 10000-60000 and idmap gid = 10000-60000 to smb.conf, the problem disappears.

It seems to me that "idmap config domain_name:range = low-high" does not work for tdb, so the idmap uid/gid range must be specified in smb.conf. However, the documentation at http://us3.samba.org/samba/docs/man/manpages-3/idmap_tdb.8.html doesn't mention that the idmap uid/gid options must be configured in smb.conf for idmap tdb.

I don't know whether this is a bug or by design. Either way, when idmap uid/gid are missing from smb.conf, we should NOT get a core dump.
thanks.
Comment 2 Gerald (Jerry) Carter (dead mail address) 2007-04-09 15:04:59 UTC
Try removing the space between the : and the "range"

   idmap config hpcif44_dom:range = 10000-600000


Comment 3 Ying Li 2007-04-09 15:13:55 UTC
(In reply to comment #2)
> Try removing the space between the : and the "range"
>    idmap config hpcif44_dom:range = 10000-600000

Removing the space between the : and the range still doesn't work. Adding either idmap uid/gid = low-high or idmap alloc config: range = low-high does work.



Comment 4 Ying Li 2007-04-09 15:24:37 UTC
I have one more question: how do we specify different idmap ranges for two domains?
For example, assume we want 10000-600000 (range1) for hpcif44_dom and 600001-900000 (range2) for hpcif20_dom:
        idmap domains = hpcif44_dom, hpcif20_dom

        idmap config hpcif44_dom:default = yes
        idmap config hpcif44_dom:backend = tdb
        idmap config hpcif44_dom:range = 10000-600000

        idmap alloc backend = tdb
        idmap alloc config: range = 10000-600000

        idmap config hpcif20_dom:default = yes
        idmap config hpcif20_dom:backend = tdb
        idmap config hpcif20_dom:range = 600001-900000

If the option "idmap config domain_name:range = low-high" doesn't work, how do we specify the second range for a trusted domain? idmap uid/gid and idmap alloc config:range do not seem to be domain specific.

Thanks.
Comment 5 Simo Sorce 2007-04-10 09:39:43 UTC
(In reply to comment #4)
> I got one more question. How we specify different idmap range for two domains.
> For example: assume we want to have 10000-600000 (range1) for hpcif44_dom,
> 600001-900000 (range2) for hpcif20_dom.
>         idmap domains = hpcif44_dom, hpcif20_dom
> 
>         idmap config hpcif44_dom:default = yes
>         idmap config hpcif44_dom:backend = tdb
>         idmap config hpcif44_dom:range = 10000-600000
> 
>         idmap alloc backend = tdb
>         idmap alloc config: range = 10000-600000
> 
>         idmap config hpcif20_dom:default = yes
>         idmap config hpcif20_dom:backend = tdb
>         idmap config hpcif20_dom:range = 600001-900000
> 
> If the option "idmap config domain_name:range=low-high" won't work, How do we
> specify the second range for a trusted domain? idmap uid/gid and idmap alloc
> config:range seem not being domain specific.


For most modules, the domain-specific range should be used only as a filter.
It is useful with remote backends like LDAP or AD (so that you can be sure two users never get the same ID by mistake).

In your case you don't need ranges per domain, as you are allocating IDs.

We made the choice not to allow multiple alloc backends and to force every new ID to be allocated from a common pool.

Another backend that benefits directly from domain ranges is of course idmap_rid, but in that case you never allocate new IDs.
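
To illustrate how a per-domain range acts as a filter for a backend like idmap_rid, here is a minimal sketch in C. It assumes the mapping ID = RID - BASE_RID + LOW_RANGE_ID, which is my understanding of how idmap_rid is documented. The struct and function names are hypothetical and are not taken from the Samba source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-domain settings; not Samba's real data structures. */
struct rid_domain_cfg {
    uint32_t base_rid; /* value of the "base_rid" option */
    uint32_t low_id;   /* lower bound of the "range" option */
    uint32_t high_id;  /* upper bound of the "range" option */
};

/*
 * Compute ID = RID - BASE_RID + LOW_ID (assumed formula).
 * Returns false if the RID is below base_rid or if the resulting ID
 * falls above the configured range, so the range works as a filter.
 */
static bool rid_to_unixid(const struct rid_domain_cfg *cfg,
                          uint32_t rid, uint32_t *id_out)
{
    uint64_t id;

    if (rid < cfg->base_rid) {
        return false;
    }
    /* 64-bit arithmetic avoids overflow before the range check. */
    id = (uint64_t)rid - cfg->base_rid + cfg->low_id;
    if (id > cfg->high_id) {
        return false;
    }
    *id_out = (uint32_t)id;
    return true;
}
```

Under these assumptions, a domain with base_rid = 1000 and range = 10000-600000 would map RID 73028 to 82028 without allocating anything, while a RID whose result exceeds 600000 would be rejected.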
Comment 6 Ying Li 2007-04-10 15:01:48 UTC
The problem is that we have no SFU installed on ADS and no LDAP server involved. We expected to share the tdb backend across multiple domains, but the code disallows this with the error below.
[2007/04/10 12:34:54, 1] nsswitch/idmap.c:idmap_init(399)
  ERROR: Multiple domains defined as default!
[2007/04/10 12:34:54, 0] nsswitch/idmap.c:idmap_init(656)
  Aborting IDMAP Initialization ...


Secondly, we set up the rid backend for two domains like this:
        idmap config hpcif44_dom:backend = rid
        idmap config hpcif44_dom:base_rid = 1000
        idmap config hpcif44_dom:range = 10000-600000

        idmap config hpcif20dom:backend = rid
        idmap config hpcif20dom:base_rid = 1000
        idmap config hpcif20dom:range = 600001-900000

Both wbinfo -S domain1_sid and wbinfo -S domain2_sid failed. Even though idmap config had already assigned an idmap range, the code was still looking for idmap uid/gid. Look at this code in idmap.c:
               if (ab && (ab[0] != '\0')) {
                        alloc_backend = talloc_strdup(idmap_ctx, lp_idmap_alloc_backend());
                        DEBUG(1, ("idmap_init: lp_idmap_alloc_backend\n"));
                } else {
                        alloc_backend = talloc_strdup(idmap_ctx, "tdb");
                        DEBUG(1, ("idmap_init: tdb\n"));
                }
The code always assumes tdb if idmap alloc backend is not set. This is why idmap config domain_name:range did not work, and the following errors occurred:
[2007/04/10 12:37:10, 0] nsswitch/idmap_rid.c:idmap_rid_initialize(47)
  idmap_rid_initialize: begin hpcif44_dom
[2007/04/10 12:37:10, 0] nsswitch/idmap_rid.c:idmap_rid_initialize(70)
  idmap_rid_initialize: range=10000-600000
[2007/04/10 12:37:10, 0] nsswitch/idmap_rid.c:idmap_rid_initialize(83)
  idmap_rid_initialize: END
[2007/04/10 12:37:10, 3] nsswitch/idmap.c:idmap_init(457)
  idmap_init: 55
[2007/04/10 12:37:10, 3] nsswitch/idmap.c:idmap_init(466)
  idmap_init: 66
[2007/04/10 12:37:10, 10] nsswitch/idmap.c:idmap_init(471)
  Domain hpcif44_dom - Backend rid - not default - readonly
[2007/04/10 12:37:10, 3] nsswitch/idmap.c:idmap_init(391)
  idmap_init: 00
[2007/04/10 12:37:10, 3] nsswitch/idmap.c:idmap_init(410)
  idmap_init: 11
[2007/04/10 12:37:10, 3] nsswitch/idmap.c:idmap_init(420)
  idmap_init: parm_backend=rid
[2007/04/10 12:37:10, 3] nsswitch/idmap.c:get_methods(65)
  get_method: method=rid
[2007/04/10 12:37:10, 3] nsswitch/idmap.c:idmap_init(424)
  idmap_init: Preparing smb_probe_module rid
[2007/04/10 12:37:10, 3] nsswitch/idmap.c:idmap_init(432)
  idmap_init: 222
[2007/04/10 12:37:10, 3] nsswitch/idmap.c:idmap_init(439)
  idmap_init: 33
[2007/04/10 12:37:10, 3] nsswitch/idmap.c:idmap_init(448)
  idmap_init: 44
[2007/04/10 12:37:10, 0] nsswitch/idmap_rid.c:idmap_rid_initialize(47)
  idmap_rid_initialize: begin hpcif20dom
[2007/04/10 12:37:10, 0] nsswitch/idmap_rid.c:idmap_rid_initialize(70)
  idmap_rid_initialize: range=600001-900000
[2007/04/10 12:37:10, 10] nsswitch/idmap.c:idmap_init(471)
  Domain hpcif20dom - Backend rid - not default - readonly
...
[2007/04/10 12:37:10, 0] nsswitch/idmap_rid.c:idmap_rid_initialize(47)
  idmap_rid_initialize: begin hpcif20dom
[2007/04/10 12:37:10, 0] nsswitch/idmap_rid.c:idmap_rid_initialize(70)
  idmap_rid_initialize: range=600001-900000
...
[2007/04/10 12:37:10, 10] nsswitch/idmap_tdb.c:idmap_tdb_open_db(263)
  Opening tdbfile /var/opt/samba/locks/winbindd_idmap.tdb
[2007/04/10 12:37:10, 1] nsswitch/idmap_tdb.c:idmap_tdb_alloc_init(402)
  idmap uid range missing or invalid
  idmap will be unable to map foreign SIDs
[2007/04/10 12:37:10, 0] nsswitch/idmap.c:idmap_init(632)
  idmap_init: Initialization failed for alloc backend tdb
[2007/04/10 12:37:10, 0] nsswitch/idmap.c:idmap_init(656)
  Aborting IDMAP Initialization ...
[2007/04/10 12:37:10, 10] nsswitch/idmap_tdb.c:idmap_tdb_alloc_close(602)
  idmap_tdb_alloc_close idmap_tdb_tdb_close() OK
[2007/04/10 12:37:10, 1] nsswitch/idmap.c:idmap_sids_to_unixids(1253)
  idmap_sids_to_unixids: idmap_init() failed
[2007/04/10 12:37:10, 10] nsswitch/idmap_util.c:idmap_sid_to_uid(114)
  error mapping sid [S-1-5-21-438457957-3677876637-59957208-73028] to uid
Comment 7 Gerald (Jerry) Carter (dead mail address) 2007-04-10 17:36:52 UTC
I see some bugs here.  Will post a patch in the next 24 hours hopefully.
Comment 8 Gerald (Jerry) Carter (dead mail address) 2007-04-10 18:02:25 UTC
Ying, please try the patch attached to BUG 4501 and see where 
we stand.  It allows idmap_init() to continue even if no 
"idmap alloc backend" has been defined.  And it no longer 
includes tdb as the default idmap alloc backend.  I don't think
this bug is fixed quite yet but would like to reset after 
that patch and see where we stand.
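
The behavior described in this comment can be sketched roughly as follows. This is only an illustration of the idea (no implicit tdb fallback, and initialization continues without an alloc backend). It is not the patch from BUG 4501, and every name in it is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state kept after initialization; not Samba's real struct. */
struct idmap_state_sketch {
    const char *alloc_backend; /* NULL when none is configured */
    bool alloc_ready;          /* true only if an alloc backend exists */
};

/* Return the configured backend name, or NULL instead of defaulting to "tdb". */
static const char *choose_alloc_backend(const char *configured)
{
    if (configured != NULL && configured[0] != '\0') {
        return configured;
    }
    return NULL;
}

/*
 * Initialization succeeds even without an alloc backend. Callers that
 * need to allocate IDs must check alloc_ready first.
 */
static bool idmap_init_sketch(struct idmap_state_sketch *st,
                              const char *configured)
{
    st->alloc_backend = choose_alloc_backend(configured);
    st->alloc_ready = (st->alloc_backend != NULL);
    return true;
}
```

With this shape, read-only backends can still be initialized when no "idmap alloc backend" is set, but any code path that allocates IDs has to handle the case where no allocator is available.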
Comment 9 Ying Li 2007-04-10 18:35:38 UTC
Applied the patch of BUG 4501 to 3.0.25pre2. Still got a PANIC in log.winbindd for tdb backend. 
#4  0x435edb0:0 in smb_panic+0x820 ()
#5  0x4312f30:0 in fault_report+0x940 ()
#6  0x43130b0:0 in sig_fault+0x30 ()
#7  <signal handler called>
#8  0x4197fb0:1 in async_request+0x551 ()
#9  0x4208240:0 in sendto_child+0x100 ()
#10 0x4208420:0 in winbindd_allocate_gid+0x140 ()
#11 0x4196630:0 in process_request+0x3e0 ()
#12 0x4196160:0 in request_recv+0xe0 ()
#13 0x4195f10:0 in request_main_recv+0x3f0 ()
#14 0x4194bd0:0 in rw_callback+0x3f0 ()
#15 0x420fa10:0 in process_loop+0x4a0 ()
#16 0x41477d0:0 in main+0xa10 ()
Comment 10 Gerald (Jerry) Carter (dead mail address) 2007-04-10 18:47:51 UTC
ok.  I'll fix this tomorrow.  I'm pretty sure I know what's going on now.
Comment 11 Ying Li 2007-04-10 20:16:34 UTC
Thank you, Jerry. 

It seems idmap_init() did not initialize the alloc_methods variable properly, so the statement "return alloc_methods->allocate_id(id);" in the idmap_allocate_uid()/gid() functions crashes with a core dump. This is what I have found so far.
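
A minimal sketch of the kind of guard that would avoid this NULL dereference is shown below. Only the alloc_methods and allocate_id names come from the statement quoted above. The types, status codes, and the demo allocator are assumptions made for illustration, and this is not the actual fix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes and types; Samba uses its own. */
enum alloc_status { ALLOC_OK, ALLOC_NOT_SUPPORTED };

struct unixid {
    uint32_t id;
};

struct alloc_methods {
    enum alloc_status (*allocate_id)(struct unixid *id);
};

/* Stays NULL when no alloc backend was initialized. */
static struct alloc_methods *alloc_methods = NULL;

/* Check the pointer before calling through it instead of crashing. */
static enum alloc_status allocate_gid_guarded(struct unixid *id)
{
    if (alloc_methods == NULL || alloc_methods->allocate_id == NULL) {
        return ALLOC_NOT_SUPPORTED;
    }
    return alloc_methods->allocate_id(id);
}

/* Demo allocator used only to exercise the guarded call. */
static enum alloc_status demo_allocate(struct unixid *id)
{
    id->id = 10000;
    return ALLOC_OK;
}

static struct alloc_methods demo_methods = { demo_allocate };
```

With alloc_methods left as NULL the guarded call returns an error status to the caller. After pointing alloc_methods at demo_methods it forwards to the allocator as before.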
Comment 12 Gerald (Jerry) Carter (dead mail address) 2007-04-11 07:25:13 UTC
Ying, I've added a new patch to BUG 4501 which protects against 
the allocate_gid() crashes.  I think that this should resolve 
your issues.  Please reopen if you still get a crash.
Comment 13 Ying Li 2007-04-12 12:54:02 UTC
Hi Jerry, I tested the new patch. It resolves both the net idmap dump crash and the ALLOCATE_GID/UID crash. Thanks. :)