Bug 282 - winbind crashes (core dump) on 'getent group'
Status: CLOSED FIXED
Product: Samba 3.0
Classification: Unclassified
Component: winbind
Version: 3.0.0preX
Hardware: All Solaris
Importance: P1 major
Target Milestone: 3.0.0rc3
Assigned To: Tim Potter
Reported: 2003-08-08 12:12 UTC by Brian King
Modified: 2005-11-14 09:29 UTC (History)
Attachments
core dump (398.61 KB, application/octet-stream)
2003-08-08 12:17 UTC, Brian King
log.winbind at debug level 10. (17.24 KB, text/plain)
2003-08-11 11:35 UTC, Brian King
Backtrace of a non-stripped winbindd - more detail than other backtraces (8.33 KB, text/plain)
2003-08-12 10:57 UTC, Brian King
Check return code of ads_search_retry() (466 bytes, patch)
2003-08-12 19:19 UTC, Tim Potter
"smbd -id 10" from comment 8 (36.47 KB, text/plain)
2003-08-14 05:09 UTC, Brian King
"winbind -d10" : "groups SNB.CA:xbking" returns 1 of 6 groups (see comment 8) (26.92 KB, text/plain)
2003-08-14 05:24 UTC, Brian King
winbind coredump while doing "groups SNB.CA:xbking" (7.50 KB, text/plain)
2003-08-15 11:36 UTC, Brian King
log.winbind associated with previous core/back trace (14.56 KB, text/plain)
2003-08-15 11:40 UTC, Brian King
winbind -d10 for comment 25 (10.12 KB, text/plain)
2003-08-15 13:23 UTC, Brian King

Description Brian King 2003-08-08 12:12:27 UTC
bash-2.05# winbindd
bash-2.05# ps -ef | grep winbind
    root  3637     1  0 15:34:00 ?        0:00 winbindd
    root  3638  3637  0 15:34:00 ?        0:00 winbindd
bash-2.05# getent group
... lists all unix groups ...
bash-2.05# ps -ef | grep winbind
bash-2.05# ls -l core
-rw-------    1 root     other     2015120 Aug  8 15:49 core

============
getent passwd works fine.
wbinfo -g will list Active Directory groups without crashing winbind.
(A list of groups returned by wbinfo is below in case it's useful)

This also causes winbindd to core:
touch /tmp/test
chgrp 10001 /tmp/test
ls -l /tmp/test

I'd like to attach my core file, but I don't see any way to do that here?!?

Other potentially relevant info:
===smb.conf===
# Global parameters
[global]
        workgroup = MYGROUP
        realm = SNB.CA
        server string = Samba Server
        security = ADS
        password server = snb-fton-ad1.snb.ca snb-fton-ad3.snb.ca
        log file = /usr/local/samba/var/log.%m
        max log size = 50
        load printers = No
        preferred master = No
        local master = No
        domain master = No
        idmap uid = 10000-20000
        idmap gid = 10000-20000
        template homedir = /export/home/%D/%U
        template shell = /bin/bash
        winbind separator = :
        winbind cache time = 10

[homes]
        comment = Home Directories
        read only = No
        browseable = No

[public]
        comment = Public Stuff
        path = /home/samba
        write list = xjim, xbking
        read only = No
        guest ok = Yes

======
bash-2.05# uname -a
SunOS SNB-FTON-DEV8 5.9 Generic_112233-06 sun4u sparc SUNW,Sun-Blade-100 Solaris
======= Entries written to /usr/local/samba/var/log.winbindd after 'ls' causes core ====

[2003/08/08 16:06:29, 1] libads/ldap_utils.c:ads_do_search_retry(76)
  ads reopen failed after error Success
Assertion failed: entry != NULL, file getvalues.c, line 36

==== log.winbindd entries after wbinfo -g =====
...
[2003/08/08 16:03:44, 1] nsswitch/winbindd_ads.c:enum_dom_groups(269)
  No rid for Users !?
[2003/08/08 16:03:44, 1] nsswitch/winbindd_ads.c:enum_dom_groups(269)
  No rid for Guests !?
[2003/08/08 16:03:44, 1] nsswitch/winbindd_ads.c:enum_dom_groups(269)
  No rid for Account Operators !?
[2003/08/08 16:03:44, 1] nsswitch/winbindd_ads.c:enum_dom_groups(269)
  No rid for Server Operators !?
[2003/08/08 16:03:44, 1] nsswitch/winbindd_ads.c:enum_dom_groups(269)
  No rid for Print Operators !?
[2003/08/08 16:03:44, 1] nsswitch/winbindd_ads.c:enum_dom_groups(269)
  No rid for Backup Operators !?
[2003/08/08 16:03:44, 1] nsswitch/winbindd_ads.c:enum_dom_groups(269)
  No rid for Replicator !?
[2003/08/08 16:03:44, 1] nsswitch/winbindd_ads.c:enum_dom_groups(269)
  No rid for Pre-Windows 2000 Compatible Access !?
[2003/08/08 16:03:44, 1] nsswitch/winbindd_ads.c:enum_dom_groups(269)
  No rid for Administrators !?
...

===== I get these messages in several of the logs, and from some of the cmd line utils on STDOUT =====

[2003/08/08 13:34:46, 0] lib/module.c:smb_load_module(40)
  Error loading module '/usr/local/samba/lib/charset/CP850.so': ld.so.1: ./winbindd: fatal: /usr/local/samba/lib/charset/CP850.so: open failed: No such file or directory
[2003/08/08 13:34:46, 0] lib/charcnv.c:init_iconv(134)
  Conversion from UCS-2LE to CP850 not supported
[2003/08/08 13:34:46, 0] lib/module.c:smb_load_module(40)
  Error loading module '/usr/local/samba/lib/charset/CP850.so': ld.so.1: ./winbindd: fatal: /usr/local/samba/lib/charset/CP850.so: open failed: No such file or directory
[2003/08/08 13:34:46, 0] lib/charcnv.c:init_iconv(134)
  Conversion from UTF8 to CP850 not supported
[2003/08/08 13:34:46, 0] lib/module.c:smb_load_module(40)
  Error loading module '/usr/local/samba/lib/charset/CP850.so': ld.so.1: ./winbindd: fatal: /usr/local/samba/lib/charset/CP850.so: open failed: No such file or directory

===== wbinfo -g   (368 groups) =====
SNB:SNB-KIOSK-USERS-PLANET-G
SNB:xwave-Helpdesk-G
SNB:xwave-Oracle-DBA-G
SNB:xwave-Admin-G
SNB:SNB-Admin-G
SNB:(SNB) PAM Emai-1
SNB:(SNB) PAM Test Email
SNB:(SNB) PAM Test-1
SNB:(SNB) Reg. Ass't Man
SNB:(SNB) Corporate Fold
SNB:(SNB) HR-Payroll Sup
SNB:(SNB) HR-Payroll
SNB:(SNB) CSS Steering C
SNB:(SNB) CSS ECI Contro
SNB:(SNB) Chaleur Bath.
SNB:(SNB) Chaleur Team P
SNB:(SNB) CSS Team
SNB:(SNB) Fundy Assm't-L
SNB:(SNB) Administrative
SNB:(SNB) HQ Executive C
SNB:(SNB) HQ Legislative
SNB:(SNB) PLANET Mgmt Co
SNB:(SNB) Fundy Assm't-H
SNB:(SNB) HQ Corporate L
SNB:(SNB) Chaleur Assm't
SNB:(SNB) Planet Acnt Ch
SNB:(SNB) Chaleur Campbe
SNB:(SNB) Chaleur Assess
SNB:(SNB) Customer Servi
SNB:(SNB) Planet E-1
SNB:(SNB) Fundy Assm't-S
SNB:(SNB) xwave CSC Leve
SNB:(SNB) Service Qualit
SNB:(SNB) RPIIS Team
SNB:(SNB) BRS Support
SNB:(SNB) HQ Planning &
SNB:(SNB) BRS Test Tech
SNB:(SNB) BRS ECI Test S
SNB:(SNB) Planet Archive
SNB:(snb) xwave IISSQL T
SNB:(SNB) Valley Assess
SNB:(SNB) Valley Assessm
SNB:(SNB) Valley Support
SNB:(SNB) Fton Assessmen
SNB:(SNB) Valley Admin S
SNB:(SNB) Valley Assesso
SNB:(SNB) Valley View Ne
SNB:(SNB) ADAM Download
SNB:(SNB) Cust. Serv. Te
SNB:(SNB) Bouctouche SD
SNB:(SNB) Beau C.S.Team
SNB:(SNB) Beau Managemen
SNB:(SNB) Beau Moncton M
SNB:(SNB) Auto Dealer Su
SNB:(SNB) Beau Rbto R&M
SNB:(SNB) Beau Region De
SNB:(SNB) Beau Richibuct
SNB:(SNB) Registered Aut
SNB:(SNB) Beau Ric-1
SNB:(SNB) Beau Sackville
SNB:(SNB) Beau Store CS
SNB:(SNB) Beau Team (Ann
SNB:(SNB) Beau Technical
SNB:(SNB) Beau Rbto Assm
SNB:(SNB) Beau Shediac S
SNB:(SNB) Beau Team (Nan
SNB:(SNB) Beau Service C
SNB:(SNB) ECC United Way
SNB:(SNB) SE Directors U
SNB:(SNB) xwave Administ
SNB:(SNB) HQ Information
SNB:(SNB) Fundy R&M-St S
SNB:(SNB) Beau Mappers
SNB:(SNB) JDE Support Gr
SNB:(SNB) IRP EFT
SNB:PLANETHD
SNB:(SNB) Chaleur -1
SNB:(SNB) Chaleur -2
SNB:(SNB) Chaleur Mirami
SNB:(SNB) Fundy Drvr Exa
SNB:(SNB) CSS ECI Driver
SNB:(SNB) CTI 1.1 COGNOS
SNB:(SNB) Valley Managem
SNB:(SNB) CARD Reports U
SNB:(SNB) CARD Alerts-In
SNB:(SNB) CARD Alerts-Up
SNB:(SNB) Public Folder
SNB:(SNB) PLANET OCR
SNB:(SNB) Chaleur Driver
SNB:(SNB) Ass't H.I.T.F
SNB:(SNB) Beau Moncton D
SNB:(SNB) Chaleur Serv.
SNB:(SNB) CARS_alert
SNB:(SNB) ECI_alert
SNB:(SNB) Beau Driver Ex
SNB:(SNB) Regional Servi
SNB:(SNB) Valley Driver
SNB:(SNB) xwave Team
SNB:(SNB) xwave DBA's
SNB:(SNB) Chaleur CS Tea
SNB:(SNB) Chaleur Dashbo
SNB:(SNB) Ass't Team Lea
SNB:(SNB) Fundy As-1
SNB:(SNB) Fundy Assm't-M
SNB:(SNB) Chaleur Mapper
SNB:(SNB) Mappers
SNB:(SNB) ERP Users
SNB:(SNB) HQ Development
SNB:(SNB) Planet A-1
SNB:(SNB) HQ Marketing D
SNB:(SNB) HQ Devel-1
SNB:(SNB) Charleur Asses
SNB:(SNB) CARD Reports I
SNB:(SNB) HQ Corporate A
SNB:(SNB) CSS ECI TeleSe
SNB:(SNB) Chaleur -3
SNB:(SNB) PLANET Registr
SNB:(SNB) Boeckh Regiona
SNB:(SNB) Beau-Assm't (J
SNB:(SNB) Beau-Assm't (R
SNB:(SNB) Fredericton Ci
SNB:(SNB) TSAC Members
SNB:(SNB) Chaleur Restig
SNB:(SNB) Beau Assessmen
SNB:(SNB) Beau-Assm't (G
SNB:(SNB) A-cubed Team
SNB:(SNB) JDE Users
SNB:(SNB) ECI Assessment
SNB:(SNB) Network Outage
SNB:(SNB) Beau-Assm't (L
SNB:(SNB) Edge Consolida
SNB:(SNB) Fundy CS-St St
SNB:(SNB) Fundy Mappers
SNB:(SNB) Fundy R&M-Hamp
SNB:(SNB) Fundy All St S
SNB:(SNB) Fundy R&M-Sain
SNB:(SNB) Valley Mapping
SNB:(SNB) Valley Registr
SNB:(SNB) Fundy LeadsMgr
SNB:(SNB) Fundy CS Leads
SNB:(SNB) CSS ECI IRP Su
SNB:(SNB) BRS Technical
SNB:(SNB) CSS ECI Public
SNB:(SNB) Edge Con-1
SNB:(SNB) IT Technicians
SNB:(SNB) Security Respo
SNB:(SNB) HQ Corporate C
SNB:(SNB) Fundy CS-Susse
SNB:(SNB) Beau Assm't Ma
SNB:(SNB) Beau Team (Joa
SNB:(SNB) Valley Service
SNB:(SNB) Beau Semi-Tech
SNB:(SNB) CSS ECI Intran
SNB:(SNB) HQ Answering P
SNB:(SNB) Beau Health&Sa
SNB:(SNB) Valley Regiona
SNB:(SNB) Provincial Rea
SNB:(SNB) CARegistration
SNB:(SNB) Registry Conta
SNB:(SNB) Customer-1
SNB:(SNB) Fundy All Hamp
SNB:(SNB) Customer-2
SNB:(SNB) CARegist-1
SNB:(SNB) TeleServices A
SNB:(SNB) Chaleur Region
SNB:(SNB) Chaleur R&M St
SNB:(SNB) Chaleur -4
SNB:(SNB) ErrList-A3 JDE
SNB:(SNB) Registry Offic
SNB:(SNB) Fundy CS-Saint
SNB:(SNB) Online Annual
SNB:(SNB) Online A-1
SNB:(SNB) Finance Fax
SNB:(SNB) HQ College Hil
SNB:(SNB) Ass't Manageme
SNB:(SNB) HQ Electronic
SNB:(SNB) NBAAO
SNB:(SNB) Fundy Admin Te
SNB:(SNB) HQ Elect-1
SNB:(SNB) Beau Service R
SNB:(SNB) Chaleur Ass. A
SNB:(SNB) Fundy As-2
SNB:(SNB) Fundy Reg Asse
SNB:(SNB) Beau Mctn Assm
SNB:(SNB) Bathurst Emplo
SNB:(SNB) Beau Region St
SNB:(SNB) InfoSource Coo
SNB:(SNB) xwave Exchange
SNB:(SNB) HQ FIN and ADM
SNB:(SNB) Fundy All Sain
SNB:(SNB) Fundy Regional
SNB:(SNB) Beau Support S
SNB:(SNB) All Exchange U
SNB:(SNB) HQ Westmorland
SNB:(SNB) HQ Human Resou
SNB:(SNB) Chaleur Team L
SNB:(SNB) HQ Ass't Staff
SNB:(SNB) CSS ECI Paymen
SNB:(SNB) Field Services
SNB:(SNB) CSS ECI -1
SNB:(SNB) Chaleur Operat
SNB:(SNB) CSS ECI Suppor
SNB:(SNB) POS TEAM
SNB:(SNB) HQ Ass't Techn
SNB:(SNB) ECI Financial
SNB:(SNB) HQ Fin. Serv.
SNB:(SNB) HQ Fin. Serv.A
SNB:(SNB) HQ Fin. -1
SNB:(SNB) HQ Fin. -2
SNB:(SNB) HQ Fin. -3
SNB:(SNB) HQ Fin. -4
SNB:SNBINCOMINGFAX
SNB:(SNB) R4 Test
SNB:SNBPTINCOMINGFAX
SNB:(SNB) CSS Testers
SNB:(SNB) PAM ESD Suppor
SNB:(SNB) HQ Corporate S
SNB:(SNB) GIMAC group
SNB:(SNB) Planet Email F
SNB:(SNB) PLANET Communi
SNB:(SNB) Operations Man
SNB:(SNB) PAM Email Fax
SNB:MVInventory-CMTN-M
SNB:Assessment Forms-CMTN-M
SNB:Reports ST-R
SNB:MVInventory-CRQT-M
SNB:Reports ST-M
SNB:MVInventory-BRTH-M
SNB:MVInventory-CHMN-M
SNB:MVInventory-DLHS-M
SNB:MVInventory-Dealers-M
SNB:Assessors
SNB:MVInventory-DKTN-M
SNB:MVInventory-EDMN-M
SNB:Service Counter Lists-M
SNB:Service Counter Lists-R
SNB:MVInventory-FTON-M
SNB:MVInventory-FSS-M
SNB:MVInventory-GGTW-M
SNB:FS5-R
SNB:Building Permits-RBTO-M
SNB:MVInventory-GDFL-M
SNB:SMS2-Include-G
SNB:SNB-Test-A2GPOSRpt-W
SNB:MVInventory-GDMN-M
SNB:Dalhouse-Image-M
SNB:MVInventory-HMPN-M
SNB:Reports AT-A2G-SUSX-R
SNB:MVInventory-BRTN-M
SNB:DBA-M
SNB:MVInventory-HWCP-M
SNB:Reports AT-A2G-SUSX-M
SNB:Openview Users-R
SNB:SMSRemoteControlUsers
SNB:Infrastructure Info-M
SNB:Contracts-M
SNB:MVInventory-IRP-FTON-M
SNB:SNB-Software-Install-G
SNB:SNB-ITTech-G
SNB:MVInventory-KDWK-M
SNB:SNB-HelpDesk-Admin-G
SNB:SNB Exchange Admins
SNB:Crystal Info
SNB:SMSWebAccessUsers
SNB:MVInventory-MADM-M
SNB:JDE Administrators
SNB:MVInventory-MMCH-East-M
SNB:Quality Control
SNB:MVIRP-G
SNB:Education IVR-R
SNB:CSS
SNB:MVInventory-MMCH-West-M
SNB:CA Mailbox Send As
SNB:SNB Mailbox Send As
SNB:JDEAdmin
SNB:MVInventory-MCTN-M
SNB:DnsAdmins
SNB:SMS-SNBRemoteControlUsers
SNB:DnsUpdateProxy
SNB:Assdata-M
SNB:MVInventory-PRTH-M
SNB:MVInventory-PLRK-M
SNB:MVInventory-PTEL-M
SNB:SMSQueryCreators
SNB:MVInventory-RBTO-M
SNB:PurchaseOrder-M
SNB:MVInventory-SKVL-M
SNB:PurchaseOrder-R
SNB:MVInventory-STJH-M
SNB:MVInventory-STLN-M
SNB:MVInventory-SHDC-M
SNB:MVInventory-SHGN-M
SNB:MVInventory-STGR-M
SNB:Exchange Domain Servers
SNB:MVInventory-SQTN-M
SNB:Exchange Enterprise Servers
SNB:MVInventory-SSTN-M
SNB:MVInventory-SUSX-M
SNB:Notice-Avis Mailbox Send As
SNB:Corporate Info-M
SNB:MVInventory-TeleServices-M
SNB:MVInventory-TRCD-M
SNB:MVInventory-WDST-M
SNB:SNB-TRIM_ADMIN-G
SNB:SMSRemoteControlAdministrators
SNB:Call Centre-M
SNB:SNB-KioskPlanetUsers-g
SNB:Call Centre-R
SNB:Assess HO-M
SNB:CSS-EFORM-ADMIN-G
SNB:Assess HO-R
SNB:Assess Gerard-M
SNB:SMSInternalCliGrp
SNB:Assess Hal-M
SNB:Mobile Homes-RBTO-M
SNB:Mobile Homes-FTON-M
SNB:Mobile Homes-EDMN-M
SNB:Assessment Services Manual-M
SNB:SNB-TDTest-G
SNB:SNB-IEAK-G
SNB:SNB-TRIM_User-G
SNB:Reports-A2G-SusxSC-R
SNB:Reports-A2G-SusxSC-M
SNB:SNB-DR_Exam-G
SNB:SNB-QMVS_Purge-G
SNB:Daily Deposit-CMTN-M
SNB:Domain Users
SNB:Domain Guests
SNB:Inventory System-M
SNB:Exchange View Admins
SNB:Services Mailbox Send As
SNB:FINADMIN-M
SNB:Domain Admins
SNB:ADMINDOC-R
SNB:MVInventory-M
SNB:ADMINDOC-M
SNB:SMSRemoteControlServers
SNB:Pats Sales-M
SNB:Domain Computers
SNB:Domain Controllers
SNB:Cert Publishers
SNB:SNB-EDUC-STUDENTFS-DATA-TEST-G
SNB:RAS and IAS Servers
SNB:SNB-EDUC-STUDENTFS-DATA-G
SNB:Group Policy Creator Owners
SNB:VLTracking-M
SNB:Fixedaso-M
SNB:Reports-R
SNB:Reports-M
SNB:Reports-A2G-R
SNB:xwave Tasks-M
SNB:Reports-A2G-M
SNB:Reports-A2G-OPS-R
SNB:Reports-Dev-R
SNB:Reports-A2G-OPS-M
SNB:MVInventory-R
SNB:Reports AT-R
SNB:Reports AT-M
SNB:Reports AT-A2G-R
SNB:Reports AT-A2G-M
SNB:MVInventory-BTCH-M
SNB:GPO-R
SNB:DHCP Users
SNB:DHCP Administrators
SNB:Reports AT-A2G-OPS-R
SNB:MVInventory-BRST-M
SNB:Reports AT-A2G-OPS-M
SNB:Reports DEV-M
Comment 1 Brian King 2003-08-08 12:17:16 UTC
Created attachment 65
core dump

This is the core created by:

touch /tmp/test
chgrp 10001 /tmp/test
ls -l /tmp/test
Comment 2 Brian King 2003-08-08 13:58:17 UTC
I installed gdb and grabbed a backtrace in case that helps. The 3 backtraces 
below are from 3 different incidents: 1 caused by the 'ls -l' described earlier, 
and the other 2 by 'getent group'. They all look the same.

(gdb) bt
#0  0xff01eda0 in _libc_kill () from /usr/lib/libc.so.1
#1  0xfefb5c68 in abort () from /usr/lib/libc.so.1
#2  0xfefb5f08 in _assert () from /usr/lib/libc.so.1
#3  0xff096f90 in ldap_get_values (ld=0x28a1f8, entry=0x0,
    target=0x161b98 "userPrincipalName") at getvalues.c:63
#4  0x00118dd0 in ads_pull_string ()
#5  0x00119240 in ads_pull_username ()
#6  0x00046bc0 in get_ldap_sequence_number ()
#7  0x00048010 in get_ldap_sequence_number ()
#8  0x0003e5fc in centry_start ()
#9  0x000366e0 in winbindd_list_users ()
#10 0x0003835c in winbindd_getgrent ()
#11 0x000336ac in unbecome_root ()
#12 0x000339b4 in winbind_process_packet ()
#13 0x000342c8 in winbind_client_read ()
#14 0x00034914 in main ()


(gdb) bt
#0  0xff01eda0 in _libc_kill () from /usr/lib/libc.so.1
#1  0xfefb5c68 in abort () from /usr/lib/libc.so.1
#2  0xfefb5f08 in _assert () from /usr/lib/libc.so.1
#3  0xff096f90 in ldap_get_values (ld=0x214f20, entry=0x0,
    target=0x161b98 "userPrincipalName") at getvalues.c:63
#4  0x00118dd0 in ads_pull_string ()
#5  0x00119240 in ads_pull_username ()
#6  0x00046bc0 in get_ldap_sequence_number ()
#7  0x00048010 in get_ldap_sequence_number ()
#8  0x0003e5fc in centry_start ()
#9  0x000366e0 in winbindd_list_users ()
#10 0x0003763c in winbindd_getgrgid ()
#11 0x000336ac in unbecome_root ()
#12 0x000339b4 in winbind_process_packet ()
#13 0x000342c8 in winbind_client_read ()
#14 0x00034914 in main ()

(gdb) bt
#0  0xff01eda0 in _libc_kill () from /usr/lib/libc.so.1
#1  0xfefb5c68 in abort () from /usr/lib/libc.so.1
#2  0xfefb5f08 in _assert () from /usr/lib/libc.so.1
#3  0xff096f90 in ldap_get_values (ld=0x26b208, entry=0x0,
    target=0x161b98 "userPrincipalName") at getvalues.c:63
#4  0x00118dd0 in ads_pull_string ()
#5  0x00119240 in ads_pull_username ()
#6  0x00046bc0 in get_ldap_sequence_number ()
#7  0x00048010 in get_ldap_sequence_number ()
#8  0x0003e5fc in centry_start ()
#9  0x000366e0 in winbindd_list_users ()
#10 0x0003763c in winbindd_getgrgid ()
#11 0x000336ac in unbecome_root ()
#12 0x000339b4 in winbind_process_packet ()
#13 0x000342c8 in winbind_client_read ()
#14 0x00034914 in main ()



Comment 3 Brian King 2003-08-11 11:35:51 UTC
Created attachment 67
log.winbind at debug level 10.

This is 'winbindd -d10' output to log.winbindd. The log only contains the
events from starting of winbindd, immediately followed by 'getent group', which
causes winbindd to core.
Comment 4 Brian King 2003-08-12 10:57:21 UTC
Created attachment 69
Backtrace of a non-stripped winbindd - more detail than other backtraces

The previous backtrace from gdb was created with a "stripped" version of
winbind. I've recreated the problem with a non stripped version so you can see
more detail.
Comment 5 Tim Potter 2003-08-12 19:19:36 UTC
Created attachment 70
Check return code of ads_search_retry()

Here's a patch that could fix the problem.  From the full backtrace (thanks -
very helpful!) it seems that ads_search_retry() is returning OK but with a
NULL result.
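
For readers following along, the failure mode can be sketched in isolation. This is not the attached patch and it does not use Samba's real types: search_status, search_result, do_search() and lookup_upn() are all hypothetical names invented for the illustration. The only point is that a caller has to test the result pointer as well as the status code before dereferencing it.

```c
#include <stddef.h>

/* Hypothetical stand-ins for this sketch; these are NOT Samba's real types. */
typedef struct { int rc; } search_status;        /* rc == 0 means success */
typedef struct { const char *upn; } search_result;

/* Hypothetical search that can report success yet hand back no result,
 * which is the situation described in the comment above. */
static search_status do_search(int found, search_result **res)
{
    static search_result entry = { "user@EXAMPLE.COM" };
    search_status st = { 0 };

    *res = found ? &entry : NULL;
    return st;
}

/* Returns the principal name, or NULL if the search failed OR came back
 * empty. Checking the status alone is not enough. */
const char *lookup_upn(int found)
{
    search_result *res = NULL;
    search_status st = do_search(found, &res);

    if (st.rc != 0 || res == NULL) {
        return NULL;    /* treat "OK but no result" as a failure */
    }
    return res->upn;
}
```

With only the status check, lookup_upn(0) would go on to dereference a NULL pointer, which is roughly what the assertion inside ldap_get_values() was catching in the backtrace.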
Comment 6 Tim Potter 2003-08-12 19:20:01 UTC
Jerry, can you take a look and see if this makes sense?
Comment 7 Brian King 2003-08-13 05:28:27 UTC
That fixed my particular problem.

Side note, that same check occurs after ads_search_retry() in several other 
places within that same file (winbindd_ads.c). Could there be other similar 
problems that just haven't been triggered in my environment?

Line#   TEXT
126:    if (!ADS_ERR_OK(rc)) {
227:    if (!ADS_ERR_OK(rc)) {
439:    if (!ADS_ERR_OK(rc)) {
514:    if (!ADS_ERR_OK(rc)) {
612:    if (!ADS_ERR_OK(rc)) {
627:    if (!ADS_ERR_OK(rc)) {
715:    if (!ADS_ERR_OK(rc)) {
788:    if (!ADS_ERR_OK(rc)) {
841:    if (!ADS_ERR_OK(rc)) {

Thanks for the quick response!
Comment 8 Brian King 2003-08-13 05:49:29 UTC
Something still doesn't work quite right (or at least not the way I'd expect), 
but maybe this is a different bug?

This user is in 6 AD groups.
bash-2.05# wbinfo -r SNB.CA:xbking
10013
10048
10027
10023
10009
10005

bash-2.05# for gid in `wbinfo -r SNB.CA:xbking` ; do
> wbinfo -s `wbinfo -G $gid`
> done
SNB:Domain Users 2
Could not lookup sid S-1-5-32-545
SNB:MVIRP-G 2
SNB:SNB-Admin-G 2
SNB:SNB-EDUC-STUDENTFS-DATA-G 2
SNB:SNB-EDUC-STUDENTFS-DATA-TEST-G 2

Now I have a share defined as:


[public]
   comment = Public Stuff
   path = /home/samba
   public = yes
   writable = yes
   printable = no
   valid users = @SNB.CA:SNB-EDUC-STUDENTFS-DATA-TEST-G

It will not let the user SNB.CA:xbking connect. It doesn't appear to check for 
group permission at all?!?

=======
bash-2.05# /usr/local/samba/bin/smbd -id 3 -s/usr/local/samba/lib/smb.conf
get_current_groups: user is in 11 groups: 1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 12
smbd version 3.0.0beta3 started.
Copyright Andrew Tridgell and the Samba Team 1992-2003
uid=0 gid=0 euid=0 egid=0
lp_load: refreshing parameters
Initialising global parameters
params.c:pm_process() - Processing configuration file "/usr/local/samba/lib/smb.conf"
Processing section "[global]"
Processing section "[homes]"
Processing section "[public]"
Processing section "[certmu]"
Processing section "[inuse]"
Processing section "[spad]"
Processing section "[PlanetPATSData]"
adding IPC service
adding IPC service
Failed to load /usr/local/samba/lib/valid.dat - No such file or directory
creating default valid table
added interface ip=10.7.7.5 bcast=10.7.7.255 nmask=255.255.255.0
added interface ip=142.139.95.8 bcast=142.139.95.255 nmask=255.255.252.0
loaded services
Registered MSG_REQ_POOL_USAGE
Registered MSG_REQ_DMALLOC_MARK and LOG_CHANGED
waiting for a connection
open_oplock_ipc: opening loopback UDP socket.
open_oplock ipc: pid = 1187, global_oplock_port = 41572
Transaction 0 of length 183
switch message SMBnegprot (pid 1187)
setting sec ctx (0, 0) - sec_ctx_stack_ndx = 0
Requested protocol [PC NETWORK PROGRAM 1.0]
Requested protocol [MICROSOFT NETWORKS 1.03]
Requested protocol [MICROSOFT NETWORKS 3.0]
Requested protocol [LANMAN1.0]
Requested protocol [LM1.2X002]
Requested protocol [DOS LANMAN2.1]
Requested protocol [Samba]
using SPNEGO
Selected protocol NT LANMAN 1.0
Transaction 1 of length 1414
switch message SMBsesssetupX (pid 1187)
setting sec ctx (0, 0) - sec_ctx_stack_ndx = 0
wct=12 flg2=0xc801
Doing spnego session setup
NativeOS=[Unix] NativeLanMan=[Samba]
Got OID 1 2 840 48018 1 2 2
Got OID 1 3 6 1 4 1 311 2 2 10
Got secblob of size 1273
Ticket name is [xbking@SNB.CA]
push_sec_ctx(0, 0) : sec_ctx_stack_ndx = 1
push_conn_ctx(0) : conn_ctx_stack_ndx = 0
setting sec ctx (0, 0) - sec_ctx_stack_ndx = 1
pop_sec_ctx (0, 0) - sec_ctx_stack_ndx = 0
fetch sid from gid cache 10013 -> S-1-5-21-1416833156-1238969774-10498456-513
User name: SNB.CA:xbking        Real name: Brian King (SNB/xwave)
UNIX uid 10010 is UNIX user SNB.CA:xbking, and will be vuid 100
Adding/updating homes service for user 'SNB.CA:xbking' using home directory: '/export/home/SNB.CA/xbking'
adding home's share [xbking] for user 'SNB.CA:xbking' at '/export/home/SNB.CA/xbking'
Transaction 2 of length 100
switch message SMBtconX (pid 1187)
setting sec ctx (0, 0) - sec_ctx_stack_ndx = 0
user 'SNB.CA:xbking' (from session setup) not permitted to access this share (public)
error string = Bad file number
error packet at smbd/reply.c(268) cmd=117 (SMBtconX) NT_STATUS_ACCESS_DENIED
Transaction 3 of length 45
switch message SMBclose (pid 1187)
error packet at smbd/process.c(719) cmd=4 (SMBclose) NT_STATUS_NETWORK_NAME_DELETED
end of file from client
setting sec ctx (0, 0) - sec_ctx_stack_ndx = 0
Closing connections
Yielding connection to
Server exit (normal exit)

===
I expected groups to work this way because users do work this way.

   valid users = @SNB.CA:xbking

Does work for access.

====
This might be related. The "groups" command does not pick up all of the AD 
user's groups:

bash-2.05# groups SNB.CA:xbking
SNB:Domain Users

Comment 9 Brian King 2003-08-13 07:56:33 UTC
Correction to the last comment.

   valid users = SNB.CA:xbking

works for access to the share. xbking is a user, not a group.
Comment 10 Tim Potter 2003-08-13 12:52:14 UTC
The groups command not picking up all the groups is probably causing the access
check to fail.  Would you be able to post:

  - smbd debug level 10 (instead of level 3) from comment 8
  - winbindd debug level 10 when doing the groups command that doesn't return
the full group membership

Thanks!
Comment 11 Tim Potter 2003-08-13 12:53:30 UTC
Jerry dude, what do you think about applying the NULL check patch to the result
of all searches in winbindd_ads.c?
Comment 12 Brian King 2003-08-14 05:09:34 UTC
Created attachment 77
"smbd -id 10" from comment 8

I am probably reading the output in the attachment wrong, but does the line:

NT user token of user S-1-5-21-1298324328-75492828-1082003079-21020

mean that the sid of SNB.CA:xbking should be that sid? SNB.CA:xbking is:

bash-2.05# wbinfo -n SNB.CA:xbking
S-1-5-21-1416833156-1238969774-10498456-5909 1
Comment 13 Brian King 2003-08-14 05:24:52 UTC
Created attachment 78
"winbind -d10" : "groups SNB.CA:xbking" returns 1 of 6 groups (see comment 8)
Comment 14 Gerald (Jerry) Carter 2003-08-14 16:13:31 UTC
I think that checking the return codes is about the best 
we can do.  Looking at the backtrace, I wonder whether a user 
has been deleted from the directory but not removed from a group.
However, whenever I remove a user, the group membership is cleaned
up as well.
Comment 15 Gerald (Jerry) Carter 2003-08-14 19:02:59 UTC
Are you working off the latest SAMBA_3_0 CVS code?  I fixed a problem 
with secondary groups last week.
Comment 16 Brian King 2003-08-15 05:25:24 UTC
No, working off of the 3.0b3 code, plus the patch attached to this bug. Should 
I be switching?
Comment 17 Gerald (Jerry) Carter 2003-08-15 08:09:30 UTC
If you wouldn't mind testing the latest CVS for the missing 
supplementary groups problem, that would be good.
Comment 18 Brian King 2003-08-15 11:02:21 UTC
I haven't used CVS before, so I could have done this wrong, but I think I 
downloaded the latest source, and now nothing seems to be working.
- I can't connect with "smbclient -k" like I used to be able to.
- "wbinfo -g" doesn't return anything
- getent passwd shows the local users and hangs
- Stopping samba, winbindd could not be killed without 'kill -9'
- "ls -l" on a file owned by group 10001 no longer shows the AD group, but 
doesn't kill winbindd or hang like it did with 3.0b3

smbclient can still be used to connect to windows based shares. So the network, 
kerberos, and authentication are still working from the client side.

This is what I am seeing in the log.winbindd (no debug - I'll send debug 
attachment in a while):

  winbindd version 3.0.0rc1 started.
  Copyright The Samba Team 2000-2003
[2003/08/15 14:57:27, 0] nsswitch/winbindd.c:process_loop(722)
  process_loop: Invalid request size from pid 22428: 1312 bytes sent, should be 1568
[2003/08/15 14:57:35, 0] nsswitch/winbindd.c:process_loop(722)
  process_loop: Invalid request size from pid 22428: 1312 bytes sent, should be 1568


==
That PID corresponds to:
bash-2.05# ps -fp 22428
     UID   PID  PPID  C    STIME TTY      TIME CMD
    root 22428     1  0   Aug 08 ?        0:22 /usr/sbin/nscd

===
Comment 19 Brian King 2003-08-15 11:20:18 UTC
Interesting... instead of stopping the processes this time, I tried to change 
the debug level on the fly with smbcontrol.

I had a "wbinfo -g" running and appearing hung.
Within a second or so of sending "./smbcontrol winbindd debug 10", the 
wbinfo -g returned with a LONG list of groups.

Where I used to only get groups in my "container" (SNB.CA), I now suddenly get 
groups belonging to what I believe is the PARENT container for SNB.CA (GNB.CA), 
and also groups belonging to "MYGROUP" which I believe is a child of the SNB.CA 
container that my server is registered in.

I believe the "hang" might have just been a long delay because of the vast 
increase in the number of objects needing to be returned. Used to be ~370 
groups, now there are ~7100 groups. Users grew from __ to ~8000 a while ago, to 
~18000 at last try (based on "wbinfo -u|wc -l").

I'll do a little more experimenting and report back. If there is anything you'd 
like me to try/change/check, let me know.
Comment 20 Brian King 2003-08-15 11:36:54 UTC
Created attachment 80
winbind coredump while doing "groups SNB.CA:xbking"

Tried:
bash-2.05# groups SNB.CA:xbking
SNB.CA:xbking : SNB:Domain Users

While groups was running,after ~30 seconds in another window I did:
./smbcontrol winbindd debug 10

It returned my groups (well 1 anyway), but winbind cored.
Comment 21 Brian King 2003-08-15 11:40:45 UTC
Created attachment 81
log.winbind associated with previous core/back trace
Comment 22 Brian King 2003-08-15 12:24:56 UTC
More info. I started "wbinfo -g" and it's been running 10 minutes now. This 
time winbindd was running at debug level 10 from the start and I can see the 
cause of the delays.

It's trying to contact every AD server listed in the "trusted" group for our AD 
domain/realm. Many of them are firewalled so that my UNIX Samba server can't 
reach them. It's taking ~225 seconds per IP address to timeout:

...
 remove_duplicate_addrs2: looking for duplicate address/port pairs
[2003/08/15 15:42:20, 5] libsmb/namecache.c:namecache_store(131)
  namecache_store: storing 11 addresses for nbed.nb.ca#1c: 142.139.201.68:389,204.81.12.61:389,204.81.8.128:389,204.81.12.87:389,204.82.79.51:389,204.82.78.122:389,204.82.79.54:389,204.82.78.249:389,204.81.189.104:389,204.81.0.121:389,204.81.0.215:389
[2003/08/15 15:42:20, 10] lib/gencache.c:gencache_set(126)
  Adding cache entry with key = NBT/NBED.NB.CA#1C; value = 142.139.201.68:389,204.81.12.61:389,204.81.8.128:389,204.81.12.87:389,204.82.79.51:389,204.82.78.122:389,204.82.79.54:389,204.82.78.249:389,204.81.189.104:389,204.81.0.121:389,204.81.0.215:389 and timeout = Fri Aug 15 15:53:20 2003
   (660 seconds ahead)
[2003/08/15 15:42:20, 10] libsmb/namequery.c:internal_resolve_name(1099)
  internal_resolve_name: returning 11 addresses: 142.139.201.68:389 204.81.12.61:389 204.81.8.128:389 204.81.12.87:389 204.82.79.51:389 204.82.78.122:389 204.82.79.54:389 204.82.78.249:389 204.81.189.104:389 204.81.0.121:389 204.81.0.215:389
[2003/08/15 15:42:20, 5] libads/ldap.c:ads_try_connect(56)
  ads_try_connect: trying ldap server '142.139.201.68' port 389
[2003/08/15 15:46:05, 10] libsmb/conncache.c:add_failed_connection_entry(132)
  add_failed_connection_entry: added domain nbed.nb.ca (142.139.201.68) to failed conn cache
[2003/08/15 15:46:05, 5] libads/ldap.c:ads_try_connect(56)
  ads_try_connect: trying ldap server '204.81.12.61' port 389
[2003/08/15 15:46:05, 10] libsmb/conncache.c:add_failed_connection_entry(132)
  add_failed_connection_entry: added domain nbed.nb.ca (204.81.12.61) to failed conn cache
[2003/08/15 15:46:05, 5] libads/ldap.c:ads_try_connect(56)
  ads_try_connect: trying ldap server '204.81.8.128' port 389
[2003/08/15 15:46:05, 10] libsmb/conncache.c:add_failed_connection_entry(132)
  add_failed_connection_entry: added domain nbed.nb.ca (204.81.8.128) to failed conn cache
[2003/08/15 15:46:05, 5] libads/ldap.c:ads_try_connect(56)
  ads_try_connect: trying ldap server '204.81.12.87' port 389

[2003/08/15 15:49:49, 10] libsmb/conncache.c:add_failed_connection_entry(132)
  add_failed_connection_entry: added domain nbed.nb.ca (204.81.12.87) to failed conn cache
[2003/08/15 15:49:49, 5] libads/ldap.c:ads_try_connect(56)
  ads_try_connect: trying ldap server '204.82.79.51' port 389
[2003/08/15 15:53:34, 10] libsmb/conncache.c:add_failed_connection_entry(132)
  add_failed_connection_entry: added domain nbed.nb.ca (204.82.79.51) to failed conn cache
[2003/08/15 15:53:34, 5] libads/ldap.c:ads_try_connect(56)
  ads_try_connect: trying ldap server '204.82.78.122' port 389
....

These other trusts were set up so that people in our domain could use resources 
in the other domains, but not the reverse. As such, we can't contact most of 
the other AD servers. I'm guessing there must be a config option that I missed 
to limit me to only my container, and  not look at the trusts?

The core dump is probably still an issue for you though since smbcontrol run at 
certain times seems to be able to take out winbindd.
Comment 23 Tim Potter 2003-08-15 12:32:59 UTC
The invalid request size message means that you either haven't installed a new
version of libnss_winbind.so, or there is a running process that has a mmapped()
copy of the old library.  The process will need to be restarted.
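
As a rough illustration of why a stale client library produces that message: when client and daemon exchange a fixed-size request structure, a client built against an older layout sends fewer bytes than the daemon expects. The struct layouts and the function name below are made up for this sketch and are not winbindd's real request structure; the message text is only loosely modelled on the log line quoted in comment 18.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fixed-size request layouts; the real winbindd structures
 * are different. Only the size mismatch matters for this sketch. */
struct request_v1 { int cmd; char data[60]; };
struct request_v2 { int cmd; char data[60]; char extra[32]; };

/* The daemon is built against one layout (v2 here) and expects exactly
 * that many bytes per request. Returns 1 if the length matches;
 * otherwise returns 0 and writes a diagnostic into msg. */
int check_request_size(size_t received, char *msg, size_t msg_len)
{
    size_t expected = sizeof(struct request_v2);

    if (received == expected) {
        return 1;
    }
    snprintf(msg, msg_len,
             "Invalid request size: %zu bytes sent, should be %zu",
             received, expected);
    return 0;
}
```

A long-running process that loaded the old library before the upgrade keeps sending the v1-sized request until it is restarted, which fits the nscd PID seen in comment 18.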
Comment 24 Tim Potter 2003-08-15 12:41:09 UTC
Did you apply the patch from comment 5 to the new source tree?  We haven't
committed this patch to CVS yet.
Comment 25 Brian King 2003-08-15 13:15:34 UTC
OK, I found the parameter "allow trusted domains = no" and it has stopped the 
extra long delays from AD servers that are unreachable.

The new CVS code plus the patch above (wasn't applied in CVS), does not fix my 
original problem either.

"getent group" does seem to return some of the proper entries, but they don't 
agree with the groups command, or the wbinfo -r command:


bash-2.05# for gid in `wbinfo -r SNB.CA\\\\xnoel` ; do wbinfo -s `wbinfo -G $gid` ; done
SNB\Domain Users 2
Could not lookup sid S-1-5-32-544
Could not lookup sid S-1-5-32-545
Could not lookup sid S-1-5-32-550
SNB\SMSRemoteControlServers 2
SNB\Domain Admins 2

bash-2.05# groups SNB.CA\\xnoel
SNB:Domain Users

bash-2.05# getent group | grep xnoel
SNB\Domain Admins:x:10001:SNB\xbert,SNB\xnoel,SNB\SMSService,SNB\sms,SNB\Admin,SNB\arcserve,SNB\oper
SNB\SMSRemoteControlServers:x:10340:SNB\xbert,SNB\xnoel



* I restarted nscd and that got rid of the invalid request size problem

Comment 26 Brian King 2003-08-15 13:23:27 UTC
Created attachment 82
winbind -d10 for comment 25

It looks like winbind is rotating the log file and I may not have captured
everything here. If you need more I'll try that again. I do see a "Bad search
filter" message in there several times, that might be related? Also a confusing
message "ads reopen failed after error Success" sounds like it thinks "Success"
is a bad thing?!?
Comment 27 Brian King 2003-08-18 06:56:48 UTC
Good news...
The original functionality I was looking for (the ability to use AD groups on 
the "valid users" line in smb.conf) is actually working now. The "groups" 
command still doesn't return what I'd expect, but it has no apparent effect on 
functionality.

The "Bad search filter" errors in comment 26 (attachment 82) seem to just be 
a matter of escaping the brackets, e.g.

---- log.winbindd (-d10) -----
[2003/08/18 09:58:09, 3] libads/ldap.c:ads_do_paged_search(451)
  ldap_search_ext_s((distinguishedName=CN=Stairs\, George (SNB),OU=SNB UsersG,DC=snb,DC=ca)) -> Bad search filter
[2003/08/18 09:58:09, 3] libads/ldap_utils.c:ads_do_search_retry(60)
  Reopening ads connection to realm 'SNB.CA' after error Bad search filter
------------------------

If I escape the brackets around "SNB", it returns the record it should have.
bash# net ads search '(distinguishedName=CN=Stairs\\, George \(SNB\),OU=SNB UsersG,DC=snb,DC=ca)'

* Note, the double backslash before the comma is just because I was executing 
in a shell, it shouldn't normally be necessary
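
For illustration, here is a minimal C sketch of escaping a value before it is placed in an LDAP search filter, following the hex form described in RFC 4515 ('(' becomes \28, ')' becomes \29, '*' becomes \2a and '\' becomes \5c). The helper name ldap_filter_escape() is hypothetical and this is not Samba's own escaping routine; it only shows the kind of escaping the failing filter above appears to be missing.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Samba): escape a value for use inside
 * an LDAP search filter using the \XX hex form from RFC 4515.
 * The characters '*', '(', ')' and '\' are replaced by a backslash
 * followed by two hex digits. Returns a malloc()ed string that the
 * caller must free(), or NULL if memory could not be allocated.
 */
char *ldap_filter_escape(const char *value)
{
    static const char hex[] = "0123456789abcdef";
    size_t len = strlen(value);
    /* Worst case: every byte expands to three bytes, plus the terminator. */
    char *out = malloc(len * 3 + 1);
    char *p = out;

    if (out == NULL) {
        return NULL;
    }
    for (; *value != '\0'; value++) {
        unsigned char c = (unsigned char)*value;

        if (c == '*' || c == '(' || c == ')' || c == '\\') {
            *p++ = '\\';
            *p++ = hex[c >> 4];
            *p++ = hex[c & 0x0f];
        } else {
            *p++ = (char)c;
        }
    }
    *p = '\0';
    return out;
}
```

With this sketch, the DN from the log would be passed through ldap_filter_escape() before being wrapped in "(distinguishedName=...)". Note that the backslash in front of the comma is itself a special character in a filter, so it is escaped as well.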

===

This also works now:

bash-2.05# touch /tmp/test
bash-2.05# chown SNB\\xbking:SNB\\Domain\ Users /tmp/test
bash-2.05# ls -l /tmp/test
-rw-r--r--    1 SNB\xbking SNB\Domain Users        0 Aug  8 15:57 /tmp/test

====
The only outstanding problems would be:

- the ldapsearch error in the winbind log, which doesn't cause any noticeable 
problems YET
- winbindd dumping core with "allow trusted domains = Yes" when multiple 
"smbcontrol winbindd debug 10" messages are sent during a long running "groups 
<ADUser>" (shouldn't have any major impact)

Would it be easier for you to track if we close this bug report and I open two 
others for those two items?
===
Thanks for all your quick responses and the patch!
Comment 28 Gerald (Jerry) Carter 2003-08-18 09:03:29 UTC
It would be better to open 2 new bugs:

  * ldapsearch errors
  * core dump when "allow trusted domains = yes"

The smbcontrol behavior is expected since Samba
daemons handle messages (and signal processing) in 
band.  winbindd won't do anything about the message 
from smbcontrol until it gets done with the current 
operation.  
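
As a rough illustration of that in-band pattern (a generic sketch, not Samba's actual messaging code), a signal handler can simply set a flag, and the main loop looks at the flag only once the current request has finished. All names below are hypothetical, and SIGUSR1 is used purely as an example signal.

```c
#include <signal.h>

/* Set by the signal handler; only the main loop acts on it. */
static volatile sig_atomic_t message_pending = 0;

/* Number of messages the loop has actually processed (for illustration). */
int messages_processed = 0;

static void on_message_signal(int signo)
{
    (void)signo;
    message_pending = 1;    /* record the event, do no real work here */
}

/*
 * Stand-in for a long-running request such as a group lookup. A message
 * arrives half way through; the return value is the number of messages
 * that had been processed by the time the operation finished.
 */
static int long_operation(void)
{
    raise(SIGUSR1);         /* simulate a message arriving mid-operation */
    return messages_processed;
}

/* Runs between requests: this is the only place messages are handled. */
static void dispatch_pending_messages(void)
{
    if (message_pending) {
        message_pending = 0;
        messages_processed++;
    }
}

/* One pass of the daemon loop. Returns what long_operation() observed. */
int run_one_request(void)
{
    int seen_during_operation;

    signal(SIGUSR1, on_message_signal);
    seen_during_operation = long_operation();
    dispatch_pending_messages();
    return seen_during_operation;
}
```

In this sketch the message that arrives during long_operation() is not counted as processed until dispatch_pending_messages() runs afterwards, which mirrors why a debug-level change sent during a long "groups" lookup only takes effect once that lookup is done.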

I'm marking this issue as fixed.  Please open up 
new bugs for the above issues.  And thanks for your 
help in tracking down these problems.
Comment 29 Brian King 2003-08-28 11:34:04 UTC
Just a reminder, this patch does not seem to have made it into the CVS codebase 
yet (see comment 34). I thought it might have gotten overlooked since the issue 
was marked resolved.

===
A diff between my winbindd_ads.c and one I just downloaded from CVS:

bash-2.05# diff -u winbindd_ads.c winbindd_ads.c.bk
--- winbindd_ads.c      Thu Aug 28 15:28:01 2003
+++ winbindd_ads.c.bk   Fri Aug 15 09:54:29 2003
@@ -378,7 +378,7 @@
        SAFE_FREE(ldap_exp);
        SAFE_FREE(escaped_dn);

-       if (!ADS_ERR_OK(rc)) {
+       if (!ADS_ERR_OK(rc) || !res) {
                goto failed;
        }
Comment 30 Tim Potter 2003-09-04 22:56:35 UTC
Checked in.

Has anyone opened bugs for the side-issues discovered when fixing this one?
Comment 31 Brian King 2003-09-05 05:02:58 UTC
I opened a bug for the ldapsearch problem (bug 316)

I didn't open one for the 'allow trusted domains = yes'. winbindd probably 
shouldn't have dumped core, but at least part of the problem was the 
environment. We have 1 way trusts, and firewalls don't allow communication the 
other way, so I _think_ this is an abnormal situation. I found I needed 'allow 
trusted domains = no' anyway because of the direction of the 1 way trusts.

I don't know if comment 7 has been addressed or not. It doesn't belong as a 
separate bug (I don't think anyway?).
Comment 32 Gerald (Jerry) Carter 2003-09-05 12:53:33 UTC
Tim, there were actually several pointer checks
that we thought were needed (code that had been copied 
from one section to another).  Do we still need them?
Comment 33 Gerald (Jerry) Carter 2003-09-06 12:59:43 UTC
Extra checks added.
Comment 34 Tim Potter 2003-09-10 15:41:58 UTC
I had a brief look around for places for any extra checks but couldn't find any
in that file.

I am wondering whether we should just fix it inside the search function rather
than doing the extra check thing?
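
To make the idea concrete, here is a minimal sketch of what fixing it inside the search function could look like: a wrapper that turns "success but no result" into an error, so callers need only one check instead of the extra "|| !res" at each call site. The types and names below (search_status, search_result, checked_search and the two example backends) are hypothetical stand-ins, not the libads API.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not the libads types. */
typedef enum { SEARCH_OK = 0, SEARCH_FAILED = 1 } search_status;
typedef struct { int count; } search_result;

/* Signature of an underlying search routine. */
typedef search_status (*search_fn)(const char *filter, search_result **res);

/*
 * Wrapper that normalises the awkward case: if the backend reports
 * success but hands back no result, report a failure instead. Callers
 * then only have to test the returned status.
 */
search_status checked_search(search_fn backend, const char *filter,
                             search_result **res)
{
    search_status rc;

    *res = NULL;
    rc = backend(filter, res);
    if (rc == SEARCH_OK && *res == NULL) {
        return SEARCH_FAILED;
    }
    return rc;
}

/* Example backend: claims success but returns no result. */
search_status backend_empty_success(const char *filter, search_result **res)
{
    (void)filter;
    *res = NULL;
    return SEARCH_OK;
}

/* Example backend: succeeds and returns one entry. */
search_status backend_one_entry(const char *filter, search_result **res)
{
    static search_result one_entry = { 1 };

    (void)filter;
    *res = &one_entry;
    return SEARCH_OK;
}
```

The trade-off is the usual one: a check in the search function covers every caller at once, while per-call checks keep the function's behaviour unchanged but are easy to miss when code is copied from one section to another.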
Comment 35 Gerald (Jerry) Carter 2003-09-11 05:50:32 UTC
do what you think needs to be done.  We've closed 
this particular bug but if you think we need more done, 
open a new one.
Comment 36 Gerald (Jerry) Carter 2005-02-07 08:41:22 UTC
originally reported against 3.0.0beta3.  Cleaning out 
non-production release versions.
Comment 37 Gerald (Jerry) Carter 2005-08-24 10:18:38 UTC
sorry for the spam, cleaning up the database to prevent unnecessary reopens of bugs.
Comment 38 Gerald (Jerry) Carter 2005-11-14 09:29:00 UTC
database cleanup