In a relatively recent git build, samba throws a lot of "Unable to convert first SID ..." errors, and users are unable to connect to any samba shares (and encounter lots of other errors). All clients are Windows 8.1 machines & this seems to occur intermittently (50% of the time?). The error was introduced some time between: 9c19532 "s3-shadow-copy2: Protect against already converted names" <---- WORKING and 085c7a7 "smbd: do_lock_cancel does not need "blr" anymore" <---- ERROR (90% sure) It's also broken in the recent: 776c7ed "samlogon_cache: avoid overwriting info3->base.full_name.string." <---- ERROR The error is: [2014/07/16 10:01:31.089670, 0] ../source4/auth/unix_token.c:79(security_token_to_unix_token) Unable to convert first SID (S-1-5-21-1516742906-2559259289-1948011451-1114) in user token to a UID. Conversion was returned as type 2, full token: [2014/07/16 10:01:31.089759, 0] ../libcli/security/security_token.c:63(security_token_debug) Security token SIDs (7): SID[ 0]: S-1-5-21-1516742906-2559259289-1948011451-1114 SID[ 1]: S-1-5-21-1516742906-2559259289-1948011451-513 SID[ 2]: S-1-1-0 SID[ 3]: S-1-5-2 SID[ 4]: S-1-5-11 SID[ 5]: S-1-5-32-545 SID[ 6]: S-1-5-32-554 Privileges (0x 800000): Privilege[ 0]: SeChangeNotifyPrivilege Rights (0x 400): Right[ 0]: SeRemoteInteractiveLogonRight [2014/07/16 10:01:31.126032, 0] ../source4/auth/unix_token.c:79(security_token_to_unix_token) Unable to convert first SID (S-1-5-21-1516742906-2559259289-1948011451-1114) in user token to a UID. Conversion was returned as type 2, full token: [2014/07/16 10:01:31.126104, 0] ../libcli/security/security_token.c:63(security_token_debug) Security token SIDs (7): SID[ 0]: S-1-5-21-1516742906-2559259289-1948011451-1114 SID[ 1]: S-1-5-21-1516742906-2559259289-1948011451-513 SID[ 2]: S-1-1-0 SID[ 3]: S-1-5-2 SID[ 4]: S-1-5-11 SID[ 5]: S-1-5-32-545 SID[ 6]: S-1-5-32-554 Privileges (0x 800000): Privilege[ 0]: SeChangeNotifyPrivilege Rights (0x 400): Right[ 0]: SeRemoteInteractiveLogonRight [2014/07/16 10:01:31.226184, 0] ../source4/auth/unix_token.c:79(security_token_to_unix_token) Unable to convert first SID (S-1-5-21-1516742906-2559259289-1948011451-1114) in user token to a UID. Conversion was returned as type 2, full token: [2014/07/16 10:01:31.226261, 0] ../libcli/security/security_token.c:63(security_token_debug) Security token SIDs (7): SID[ 0]: S-1-5-21-1516742906-2559259289-1948011451-1114 SID[ 1]: S-1-5-21-1516742906-2559259289-1948011451-513 SID[ 2]: S-1-1-0 SID[ 3]: S-1-5-2 SID[ 4]: S-1-5-11 SID[ 5]: S-1-5-32-545 SID[ 6]: S-1-5-32-554 Privileges (0x 800000): Privilege[ 0]: SeChangeNotifyPrivilege Rights (0x 400): Right[ 0]: SeRemoteInteractiveLogonRight [2014/07/16 10:01:53.038474, 2] ../source3/smbd/server.c:457(remove_child_pid) Could not find child 27639 -- ignoring
Slowly working through a bisect here -- this makes samba >085c7a7 completely unusable for us. $ git bisect start # good: [9c195321345846195e465158f88dad5fd7699389] s3-shadow-copy2: Protect against already converted names git bisect good 9c195321345846195e465158f88dad5fd7699389 # bad: [085c7a71fde37cb2a67a4c67673cd22f8acf0861] smbd: do_lock_cancel does not need "blr" anymore git bisect bad 085c7a71fde37cb2a67a4c67673cd22f8acf0861 # good: [fbd418d2f666c3e2af31d34667b63c03fdca6ca6] make: Add gdbtestenv target. git bisect good fbd418d2f666c3e2af31d34667b63c03fdca6ca6 # good: [95a55df021b3f112a18c64a5f5897182ae8b7df8] winbindd: Allow the AD-DC to call getdcname git bisect good 95a55df021b3f112a18c64a5f5897182ae8b7df8 ...
(In reply to comment #1) > Slowly working through a bisect here -- this makes samba >085c7a7 completely > unusable for us. > > > $ git bisect start > # good: [9c195321345846195e465158f88dad5fd7699389] s3-shadow-copy2: Protect > against already converted names > git bisect good 9c195321345846195e465158f88dad5fd7699389 > # bad: [085c7a71fde37cb2a67a4c67673cd22f8acf0861] smbd: do_lock_cancel does not > need "blr" anymore > git bisect bad 085c7a71fde37cb2a67a4c67673cd22f8acf0861 > # good: [fbd418d2f666c3e2af31d34667b63c03fdca6ca6] make: Add gdbtestenv target. > git bisect good fbd418d2f666c3e2af31d34667b63c03fdca6ca6 > # good: [95a55df021b3f112a18c64a5f5897182ae8b7df8] winbindd: Allow the AD-DC to > call getdcname > git bisect good 95a55df021b3f112a18c64a5f5897182ae8b7df8 > ... It is really 085c7a7???? Wow, I am really stunned that this commit breaks login. Can you send a debug level 10 log and your smb.conf file?
> > $ git bisect start > > # good: [9c195321345846195e465158f88dad5fd7699389] s3-shadow-copy2: Protect > > against already converted names > > git bisect good 9c195321345846195e465158f88dad5fd7699389 > > # bad: [085c7a71fde37cb2a67a4c67673cd22f8acf0861] smbd: do_lock_cancel does not > > need "blr" anymore > > git bisect bad 085c7a71fde37cb2a67a4c67673cd22f8acf0861 > > # good: [fbd418d2f666c3e2af31d34667b63c03fdca6ca6] make: Add gdbtestenv target. > > git bisect good fbd418d2f666c3e2af31d34667b63c03fdca6ca6 > > # good: [95a55df021b3f112a18c64a5f5897182ae8b7df8] winbindd: Allow the AD-DC to > > call getdcname > > git bisect good 95a55df021b3f112a18c64a5f5897182ae8b7df8 > > ... > > It is really 085c7a7???? Wow, I am really stunned that this commit breaks > login. > > Can you send a debug level 10 log and your smb.conf file? Still bisecting -- so I believe it's somewhere between: 95a55df021b3f112a18c64a5f5897182ae8b7df8 winbindd: Allow the AD-DC to <--- GOOD 085c7a71fde37cb2a67a4c67673cd22f8acf0861 smbd: do_lock_cancel does not <--- BAD
Created attachment 10119 [details] smb.conf
Any news? Do you have a full debug level 10 log of that failure?
(In reply to comment #5) > Any news? Do you have a full debug level 10 log of that failure? Still bisecting. A bit of a slow process as machine is used during much of the week. (Currently running/bisecting @ 27ef395)
(In reply to comment #6) > (In reply to comment #5) > > Any news? Do you have a full debug level 10 log of that failure? > > Still bisecting. A bit of a slow process as machine is used during much of the > week. > > (Currently running/bisecting @ 27ef395) Bisected more, I think I have a level 10 log now (uploading soon). The error is *much* more intermittent than I thought, which makes this a bit tricky. I *think* the commit is now between: [GOOD] 5d069a04 selftest: Make the wbinfo userinfo tests work properly... and [ BAD] 27ef3957 smbd: Remove an outdated comment There are four commits in that range: https://git.samba.org/?p=samba.git;a=commitdiff;h=f3710320c https://git.samba.org/?p=samba.git;a=commitdiff;h=af7f88721 https://git.samba.org/?p=samba.git;a=commitdiff;h=da3a798 https://git.samba.org/?p=samba.git;a=commitdiff;h=0b77cd96
Created attachment 10137 [details] 50k lines, "log level=10", machine AIO15, domain=CORP
(In reply to comment #8) > Created attachment 10137 [details] > 50k lines, "log level=10", machine AIO15, domain=CORP So this is very likely due to the change to winbindd. For some reason the SID -1133 is being seen as converted to a group ID instead of a user ID. Sorry to send you into another round, but for this we would need to see all the winbind log files (log.w*) with debug level 10. Also, "net cache list | grep 1133" could reveal some valuable information. Thanks a lot! Volker
> So this is very likely due to the change to winbindd. For some reason the SID > -1133 is being seen as converted to a group ID instead of a user ID. Sorry to > send you into another round, but for this we would need to see all the winbind > log files (log.w*) with debug level 10. Also, "net cache list | grep 1133" > could reveal some valuable information. > > Thanks a lot! > > Volker The *current* `net cache list` (running on the *GOOD* 5d069a0): Key: IDMAP/SID2XID/S-1-5-21-1516742906-2559259289-1948011451-1133 Timeout: Mon Jul 28 14:05:39 2014 Value: 3000053:B Key: IDMAP/UID2SID/3000053 Timeout: Mon Jul 28 14:05:39 2014 Value: S-1-5-21-1516742906-2559259289-1948011451-1133 Key: IDMAP/GID2SID/3000053 Timeout: Mon Jul 28 14:05:39 2014 Value: S-1-5-21-1516742906-2559259289-1948011451-1133 I'll attach the log.w* files (they don't show much) and the "log." file.
Created attachment 10145 [details] log. & log.w*
(In reply to comment #11) > Created attachment 10145 [details] > log. & log.w* There isn't really much, right. If you get into this situation, is it repeatable at all? This to me seems a bit like a blocker.
> There isn't really much, right. If you get into this situation, is it > repeatable at all? This to me seems a bit like a blocker. I haven't had a chance to debug more this week -- but OTOH: - This is definitely intermittent - When a machine is affected, it seems to be persistently broken (users can't connect to the AD controller or any shares), though users can logon (presumably with cached credentials) - Affected machines remain broken across multiple reboots - Affected machines sometimes recover after I revert Samba versions -- and then aren't affected later on. e.g. Computer1 breaks on HEAD -> revert to 085c7a7 & Computer1 works -> move back to HEAD and Computer1 still works fine. (This network has 20 ~identical fully patched Win 8.1 machines.) Let me know if there's any additional data I can provide.
This is not a client issue, it is definitely a server issue. If you have a client in such a situation and at that point the problem is repeatable, we need to capture the full call chain. smbd with debug level 10 log, net cache list, winbind with all debug level 10 logs. That's our only chance, sorry.
(In reply to comment #14) > This is not a client issue, it is definitely a server issue. If you have a > client in such a situation and at that point the problem is repeatable, we need > to capture the full call chain. smbd with debug level 10 log, net cache list, > winbind with all debug level 10 logs. That's our only chance, sorry. So this would be `net cache list` plus all the log files w/ a smb.conf of: [global] log level = 10 log file = /ramcache/log.%U Sound about right? Need anything else?
(In reply to comment #15) > (In reply to comment #14) > > This is not a client issue, it is definitely a server issue. If you have a > > client in such a situation and at that point the problem is repeatable, we need > > to capture the full call chain. smbd with debug level 10 log, net cache list, > > winbind with all debug level 10 logs. That's our only chance, sorry. > > So this would be `net cache list` plus all the log files w/ a smb.conf of: > > [global] > log level = 10 > log file = /ramcache/log.%U > > Sound about right? Need anything else? I'm not entirely sure where winbind in your configuration puts its log files, but having all those is essential too. You can bump up the debug level of a running process with "smbcontrol <pid> debug 10". That might be essential here because a winbind restart might fix the problem, not sure.
Finally had a chance to get back to this. This is still occurring on the latest HEAD at: 7bc2e2d0d4a6706769a436798aa29d18a8f7c221 messaging_dgm: Don't expose the messaging_dgm_context master author Volker Lendecke <vl@samba.org> Wed, 10 Sep 2014 14:13:18 +0000 (16:13 +0200) Attaching level 10 logs from 7bc2e2d0d4a + net cache list
Logs are very large at level 10. Full .tar.gz at: http://web.mit.edu/semenko/Public/bug-10720.tar.gz Client machine: aio4 Username: lisa Domain controller: runway
I'm still seeing this as of a git head from yesterday (6c05cd3e89) and it makes samba entirely unusable, with a percentage of clients completely failing to access any domain services ("Security ID structure invalid") and the same samba logs as below. (We remain stuck right before f3710320cef | https://git.samba.org/?p=samba.git;a=commit;h=f3710320cef475ebac561882c8fdaf8e51c8b7c3) I've added some additional logs of this (with varying log levels 5-10, since some are a bit cleaner to browse) here: http://web.mit.edu/semenko/Public/varying-log-levels-bug-10720.tar.gz In addition to the "Unable to convert first SID" error, I also see some attempts to map SID S-0-0, which results in "Could not convert sid S-0-0: NT_STATUS_NONE_MAPPED" while clients are trying map a shared/mapped network drive called "DROPBOX". Happy to debug in any other way, test patches, etc.
Andrew, the commit Nick points at is around the winbind3 work you did. Please take a look.
Hi, I have encountered the same problem too. Is this problem only happen to the latest samba 4 by updating to the latest version through GIT update? Anyone try to downgrade the version to official release version and solve this issues? This issues randomly has users and groups security tab corrupted. Is there any work around? Regards, CQ
Created attachment 10337 [details] Gencache dump I've managed to coincidentally run into what seems to be the same issue. For me it's not the first SID though, and it's for enterprise admins. It seems to be an issue with the cache. My configuration was Samba as the AD-DC and attached is the dump of the cache and to follow are secrets.tdb and idmap.tdb. The suspects in question are the uid to sid functions.
Created attachment 10338 [details] Secrets.tdb dump
Created attachment 10339 [details] idmap.ldb dump Line 42 contains the offending SID for enterprise admins. It's marked here as ID_TYPE_BOTH.
This remains an issue testing with yesterday's head: > b8fce2c65b libcli/smb: use lib/crypto SHA512 functions, do not depend on heimdal. author Günther Deschner <gd@samba.org> Mon, 13 Oct 2014 18:01:05 +0000 (20:01 +0200) committer Günther Deschner <gd@samba.org> Tue, 14 Oct 2014 11:13:08 +0000 (13:13 +0200)
Created attachment 10353 [details] Temporary fix for caching issue This is just a patch which currently disables the use of the idmap cache in some cases. This isn't really ideal generally, but if Nick wouldn't mind seeing if this fixes the issue, it would be much appreciated. Some basic testing on my own machine indicates that the issue no longer occurs with this patch in place. Nick, you'll have to remove gencache.tdb and gencache_notrans.tdb before starting up Samba and checking it works.
Patch applied cleanly to head. Tentatively appears to be working -- will know for sure tomorrow. Thanks Garming!
Looks like this is stable, applied atop > b8fce2c65b.
*** Bug 10957 has been marked as a duplicate of this bug. ***
To clear the invalid cache entry (with or without the patch), use: net cache flush You can see the invalid cache entry with: net cache list (look for SIDs that belong to groups, with type U). We expect to have an improved patch shortly, that avoids us squashing ID_TYPE_BOTH mappings to only UID or GID mappings.
Is this a showstopper for 4.2.0?
This is a 4.2.0 blocker. Garming, Andrew: should we just do with the temporary fix and clean it up later? Are the changes in source3/winbindd/idmap_util.c needed? idmap_backends_unixid_to_sid() may reset map.xid.type = ID_TYPE_BOTH. I think the pdb_interface.c changes are all we need.
fixed in v4-2-test
*** Bug 11062 has been marked as a duplicate of this bug. ***