Bug 10604 - Share mount fails on first attempt only during NT token creation
Summary: Share mount fails on first attempt only during NT token creation
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.1.7
Hardware: All All
: P5 normal (vote)
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
URL:
Keywords:
: 10608 (view as bug list)
Depends on:
Blocks:
 
Reported: 2014-05-12 07:10 UTC by David Anderson
Modified: 2021-06-28 17:30 UTC (History)
13 users (show)

See Also:


Attachments
log.smbd at log level=10 of a failed mount attempt (464.60 KB, application/octet-stream)
2014-05-12 07:10 UTC, David Anderson
no flags Details
log.smbd at log level=10 of a successful mount attempt (668.06 KB, application/octet-stream)
2014-05-12 07:12 UTC, David Anderson
no flags Details
Patch to handle -1 UID from AD (376 bytes, patch)
2014-09-09 16:25 UTC, Julien Wallior
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description David Anderson 2014-05-12 07:10:54 UTC
Created attachment 9936 [details]
log.smbd at log level=10 of a failed mount attempt

I have a Samba 4.1.7 server on FreeBSD 10.0-RELEASE-p2, with the following config:

[global]
workgroup = home
server role = standalone
security = user
guest account = nobody
log level = 10

[Tmp]
comment = Go away
path = /data/tmp
read only = no
force user = nobody
force group = nobody
valid users = ingress


I'm trying to mount this from two Linux clients:
 - Ubuntu 12.04, x86_64, kernel 3.2.0-59-generic #90-Ubuntu
 - Arch Linux, x86_64, kernel 3.14.1-1-ARCH #1

On both these clients, I'm mounting with:

mount -t cifs //192.168.209.2/Movies movies/ -o uid=nobody,forceuid,gid=nogroup,forcegid,username=ingress,password=SECRET,noperm

And on both these clients, the *first* run of this command fails with:

mount error(5): Input/output error
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)

Attempting the mount again immediately after that failure will correctly mount the share. From that point on, there seems to be some caching in play, because I can continue to correctly mount/unmount the share for a few minutes. After that, I get the mount failure again, followed by no problems for a minute or two, and so on.

On the server, with log level = 10, log.smbd reveals the following during the *failed* attempt (truncated to show interesting entries, full log attached) :

[2014/05/11 23:39:08.962654,  2, pid=2356, effective(0, 0), real(0, 0), class=auth] ../source3/auth/auth.c:278(auth_check_ntlm_password)
  check_ntlm_password:  authentication for user [ingress] -> [ingress] -> [ingress] succeeded
[2014/05/11 23:39:08.962670, 10, pid=2356, effective(0, 0), real(0, 0)] ../source3/auth/token_util.c:224(create_local_nt_token_from_info3)
  Create local NT token for ingress
[2014/05/11 23:39:08.962699, 10, pid=2356, effective(0, 0), real(0, 0), class=tdb] ../source3/lib/gencache.c:296(gencache_set_data_blob)
  Adding cache entry with key=[IDMAP/SID2XID/S-1-5-21-250094908-2656553239-839613848-1001] and timeout=[Wed Dec 31 16:00:00 1969 PST] (-1399876748 seconds in the past)
[2014/05/11 23:39:08.965253, 10, pid=2356, effective(0, 0), real(0, 0)] ../source3/passdb/lookup_sid.c:1388(sid_to_uid)
  sid S-1-5-21-250094908-2656553239-839613848-1001 -> uid 4294967295
[2014/05/11 23:39:08.965347,  1, pid=2356, effective(0, 0), real(0, 0)] ../source3/auth/token_util.c:430(add_local_groups)
  SID S-1-5-21-250094908-2656553239-839613848-1001 -> getpwuid(4294967295) failed
[2014/05/11 23:39:08.965363,  3, pid=2356, effective(0, 0), real(0, 0)] ../source3/auth/token_util.c:316(create_local_nt_token_from_info3)
  Failed to finalize nt token
[2014/05/11 23:39:08.965374, 10, pid=2356, effective(0, 0), real(0, 0)] ../source3/auth/auth_ntlmssp.c:45(auth3_generate_session_info)
  create_local_token failed: NT_STATUS_UNSUCCESSFUL


The failure seems to happen in add_local_groups, because it's trying to look up UID 4294967295, which doesn't exist. Note that 4294967295 is -1, if the bit pattern is interpreted as an unsigned 32-bit value.

The same section in the successful mount log shows a SID cache hit, and add_local_groups never logs (no idea whether it's never called or just has nothing to say).

At this point, my lack of knowledge of what *should* be happening with that SID to UID map is blocking me from debugging further. What information can I provide you to help?
Comment 1 David Anderson 2014-05-12 07:12:52 UTC
Created attachment 9937 [details]
log.smbd at log level=10 of a successful mount attempt

Here's a log of a successful mount. Between the failed and the successful mount, I rebooted smbd to get a clean log capture, which is why it logs all the startup stuff again. It's also somewhat interesting in that whatever lets subsequent mount attempts succeed persists across smbd restarts.
Comment 2 Moritz Bunkus 2014-05-29 15:53:58 UTC
A heads up that this happens on Linux as well. For me it presents similarly, but I cannot access the server at all (neither from Windows, nor with »smbclient -U username //servername/share«. I'm running Arch Linux 64bit with a freshly upgraded Samba 4.1.7. My log files look pretty much the same except that the UID looked up is not max int32 but with 3000007 as in the following (log level 3):

[2014/05/29 13:29:13.785722,  3] ../source3/auth/auth.c:177(auth_check_ntlm_password)
  check_ntlm_password:  Checking password for unmapped user [VOIDJUMPER]\[mosu]@[SWEET-CHILI] with the new password interface
[2014/05/29 13:29:13.785772,  3] ../source3/auth/auth.c:180(auth_check_ntlm_password)
  check_ntlm_password:  mapped user is: [SWEET-CHILI]\[mosu]@[SWEET-CHILI]
[2014/05/29 13:29:13.789646,  3] ../source3/auth/auth.c:226(auth_check_ntlm_password)
  check_ntlm_password: sam authentication for user [mosu] succeeded
[2014/05/29 13:29:13.789742,  2] ../source3/auth/auth.c:278(auth_check_ntlm_password)
  check_ntlm_password:  authentication for user [mosu] -> [mosu] -> [mosu] succeeded
[2014/05/29 13:29:13.789814,  3] ../auth/ntlmssp/ntlmssp_sign.c:547(ntlmssp_sign_init)
  NTLMSSP Sign/Seal - Initialising with flags:
[2014/05/29 13:29:13.789856,  3] ../auth/ntlmssp/ntlmssp_util.c:34(debug_ntlmssp_flags)
  Got NTLMSSP neg_flags=0x60088215
[2014/05/29 13:29:13.790424,  1] ../source3/auth/token_util.c:430(add_local_groups)
  SID S-1-5-21-4227480935-2939380808-3766029725-1101 -> getpwuid(3000007) failed
[2014/05/29 13:29:13.790495,  3] ../source3/auth/token_util.c:316(create_local_nt_token_from_info3)
  Failed to finalize nt token
[2014/05/29 13:29:13.790540,  1] ../source3/smbd/sesssetup.c:276(reply_sesssetup_and_X_spnego)
  Failed to generate session_info (user and group token) for session setup: NT_STATUS_UNSUCCESSFUL

If I downgrade the very same installation with the very same configuration file back to 4.1.6 on this system then access starts working immediately – both with smbclient and from Windows.

I can provide log files at higher log levels if necessary, just let me know.
Comment 3 Jeremy Allison 2014-05-31 00:24:58 UTC
It's this code that is failing:

static NTSTATUS add_local_groups(struct security_token *result,
                                 bool is_guest)
{
        gid_t *gids = NULL;
        uint32_t getgroups_num_group_sids = 0;
        struct passwd *pass = NULL;
        TALLOC_CTX *tmp_ctx = talloc_stackframe();
        int i;

        if (is_guest) {
                /*
                 * Guest is a special case. It's always
                 * a user that can be looked up, but
                 * result->sids[0] is set to DOMAIN\Guest.
                 * Lookup by account name instead.
                 */
                pass = Get_Pwnam_alloc(tmp_ctx, lp_guestaccount());
        } else {
                uid_t uid;

                /* For non-guest result->sids[0] is always the user sid. */
                if (!sid_to_uid(&result->sids[0], &uid)) {
                        /*
                         * Non-mappable SID like SYSTEM.
                         * Can't be in any /etc/group groups.
                         */
                        TALLOC_FREE(tmp_ctx);
                        return NT_STATUS_OK;
                }

                pass = getpwuid_alloc(tmp_ctx, uid);
                if (pass == NULL) {
                        DEBUG(1, ("SID %s -> getpwuid(%u) failed\n",
                                sid_string_dbg(&result->sids[0]),
                                (unsigned int)uid));
                }
        }

        if (!pass) {
                TALLOC_FREE(tmp_ctx);
                return NT_STATUS_UNSUCCESSFUL;
        }
...

Note that the problem you're seeing is sid_to_uid(&result->sids[0], &uid) returning a strange uid value for that sid.

There is the failure, now to understand why that is happening..
Comment 4 Jeremy Allison 2014-05-31 00:29:35 UTC
In the failed log case we're getting (inside sid_to_uid()):

[2014/05/12 00:09:43.676002, 10, pid=2563, effective(0, 0), real(0, 0), class=tdb] ../source3/lib/gencache.c:296(gencache_set_data_blob)
  Adding cache entry with key=[IDMAP/SID2XID/S-1-5-21-250094908-2656553239-839613848-1001] and timeout=[Wed Dec 31 16:00:00 1969 PST] (-1399878583 seconds in the past)
[2014/05/12 00:09:43.680604, 10, pid=2563, effective(0, 0), real(0, 0)] ../source3/passdb/lookup_sid.c:1388(sid_to_uid)
  sid S-1-5-21-250094908-2656553239-839613848-1001 -> uid 4294967295

Looks like your user 'ingress' is being mapped to 'nobody' (uid -1).
Comment 5 Jeremy Allison 2014-05-31 00:31:41 UTC
Are you able to rebuild Samba and add debug log messages ? I'd love to know how user 'ingress' ends up with uid -1, and following the code path through that code (sid_to_uid) would help.

The current debug statements don't have a fine enough granularity for me to tell what is going on.
Comment 6 Moritz Bunkus 2014-05-31 08:25:53 UTC
I know you're asking the original poster, but I'm definitely willing to re-compile, too, if that helps. Please add your debug diff and I'll get on it.

In my case the SID that's displayed is the correct SID (checked with »pdbedit -L -v«, it's the SID of the user I'm trying to connect with). That user does have a passwd entry, of course, but there's no UID 3000007 on my system.

If you'd rather I don't continue working on this results from two different people make matters more confusing then that's perfectly alright; just say so and I'll stay out of it.
Comment 7 Andrea Venturoli 2014-05-31 17:46:00 UTC
(In reply to comment #0)
> Created attachment 9936 [details]
> log.smbd at log level=10 of a failed mount attempt
> 
> I have a Samba 4.1.7 server on FreeBSD 10.0-RELEASE-p2, with the following
> config:
> ...

Hello David.

I think I have a similar setup (4.1.8 on FreeBSD 9.2/amd64) and I'm seeing similar problems.

Are you running winbindd? Do you need it? Could you try killing it?
After killing it, I wasn't able to reproduce this problem (although its intermittent nature doesn't allow for certitude).
Unfortunately I can't get rid of it, since I'm authenticating other services to Samba through PAM.

Downgrading to 4.1.7 solves too (but then again, I have "force user" on a share and that won't work anymore).




Back to the original problem, I'm seeing the same error about getpwuid(4294967295).
I can provide logs if it helps.
Comment 8 Andrea Venturoli 2014-05-31 18:13:22 UTC
(In reply to comment #7)

Also, diffing 4.1.6 and 4.1.7 sources, I came up with another workaround: lines 495-498 of source3/auth/token_util.c didn't exist in the older version and commenting them solved every needs of mine (or, at least, I still have to found any downside).

Can't tell the consequences for other setups, though.
A comment from someone with more insight would be very welcome.
Comment 9 David Anderson 2014-05-31 18:38:10 UTC
(In reply to comment #5)
> Are you able to rebuild Samba and add debug log messages ? I'd love to know how
> user 'ingress' ends up with uid -1, and following the code path through that
> code (sid_to_uid) would help.
> 
> The current debug statements don't have a fine enough granularity for me to
> tell what is going on.

I've never built Samba from source on FreeBSD, but I'm willing to give it a try. Did you have a diagnostic patch in mind for me to try, or should I just talk amongst myself and come back with some tracing information for sid_to_uid?
Comment 10 Jeremy Allison 2014-05-31 22:23:45 UTC
(In reply to comment #8)
> (In reply to comment #7)
> 
> Also, diffing 4.1.6 and 4.1.7 sources, I came up with another workaround: lines
> 495-498 of source3/auth/token_util.c didn't exist in the older version and
> commenting them solved every needs of mine (or, at least, I still have to found
> any downside).

Paste the lines you removed here please ! Can't tell exactly what you did without that. Thanks.

Jeremy.
Comment 11 Andrea Venturoli 2014-06-01 09:40:19 UTC
(In reply to comment #10)


> Paste the lines you removed here please ! Can't tell exactly what you did
> without that. Thanks.
> 
> Jeremy.



Here they are (the #if/#endif are mine):


static NTSTATUS finalize_local_nt_token(struct security_token *result,
                                        bool is_guest)
{
        struct dom_sid dom_sid;
        gid_t gid;
        NTSTATUS status;
#if 0
        /* Add any local groups. */

        status = add_local_groups(result, is_guest);
        if (!NT_STATUS_IS_OK(status)) {
                return status;
        }
#endif
        /* Add in BUILTIN sids */

        status = add_sid_to_array(result, &global_sid_World,
                                  &result->sids, &result->num_sids);
        if (!NT_STATUS_IS_OK(status)) {
                return status;
        }
Comment 12 Jeremy Allison 2014-06-02 19:12:16 UTC
(In reply to comment #11)

> Here they are (the #if/#endif are mine):
> 
> 
> static NTSTATUS finalize_local_nt_token(struct security_token *result,
>                                         bool is_guest)
> {
>         struct dom_sid dom_sid;
>         gid_t gid;
>         NTSTATUS status;
> #if 0
>         /* Add any local groups. */
> 
>         status = add_local_groups(result, is_guest);
>         if (!NT_STATUS_IS_OK(status)) {
>                 return status;
>         }
> #endif
>         /* Add in BUILTIN sids */
> 
>         status = add_sid_to_array(result, &global_sid_World,
>                                   &result->sids, &result->num_sids);
>         if (!NT_STATUS_IS_OK(status)) {
>                 return status;
>         }

Ah, thought that might be the case. Unfortunately that's not the solution. You're bypassing the following code, which was explicitly added:

         * Now we must get any groups this user has been
         * added to in /etc/group and merge them in.
         * This has to be done in every code path
         * that creates an NT token, as remote users
         * may have been added to the local /etc/group
         * database.

in order to ensure the user has any local groups they are added to in /etc/group added to their token list. The real issue we need to solve is whu sid_to_uid is coming back with a strange/unmappable uid.
Comment 13 Jeremy Allison 2014-06-02 19:13:23 UTC
(In reply to comment #9)
> (In reply to comment #5)
> > Are you able to rebuild Samba and add debug log messages ? I'd love to know how
> > user 'ingress' ends up with uid -1, and following the code path through that
> > code (sid_to_uid) would help.
> > 
> > The current debug statements don't have a fine enough granularity for me to
> > tell what is going on.
> 
> I've never built Samba from source on FreeBSD, but I'm willing to give it a
> try. Did you have a diagnostic patch in mind for me to try, or should I just
> talk amongst myself and come back with some tracing information for sid_to_uid?

It would help greatly if you could track this down (how sid_to_uid is ending up with -1 for that user) yourself. You'd probably learn a lot too ! :-).

Let me know if you need help with this, but it might take me a while..
Comment 14 Andrea Venturoli 2014-06-03 13:45:11 UTC
(In reply to comment #12)

> You're bypassing the following code, which was explicitly added:
> 
>          * Now we must get any groups this user has been
>          * added to in /etc/group and merge them in.
>          * This has to be done in every code path
>          * that creates an NT token, as remote users
>          * may have been added to the local /etc/group
>          * database.
> 
> in order to ensure the user has any local groups they are added to in
> /etc/group added to their token list. The real issue we need to solve is whu
> sid_to_uid is coming back with a strange/unmappable uid.

I'm aware this works around the real problem; it just let my production server work, but, as I said, I don't expect this one to be a viable solution for everyone.

If you can suggest where to look, what to do, which tests to perform, etc... I'd be happy to help in diagnosing this.
This is a production server, but I can usually mess with it outside work hours (which are luckily quite limited).
Comment 15 Georg Weickelt 2014-06-13 07:13:21 UTC
*** Bug 10608 has been marked as a duplicate of this bug. ***
Comment 16 JonathanWilson 2014-07-26 08:50:05 UTC
(In reply to comment #7)
> (In reply to comment #0)
> > Created attachment 9936 [details] [details]
> > log.smbd at log level=10 of a failed mount attempt
> > 
> > I have a Samba 4.1.7 server on FreeBSD 10.0-RELEASE-p2, with the following
> > config:
> > ...
> 
> Hello David.
> 
> I think I have a similar setup (4.1.8 on FreeBSD 9.2/amd64) and I'm seeing
> similar problems.
> 
> Are you running winbindd? Do you need it? Could you try killing it?
> After killing it, I wasn't able to reproduce this problem (although its
> intermittent nature doesn't allow for certitude).
> Unfortunately I can't get rid of it, since I'm authenticating other services to
> Samba through PAM.

I can confirm that killing winbindd does in fact resolve the problem.

My samba is set up as standalone, version 4.1.9-debian.

Initially I had no problems with the setup, then it seemed to start producing this fault... I can't clarify what I did to cause it or if winbind was running all the time (both no fault and faluting) or if something I did caused winbind to be auto started where it was not initially. I am also fairly sure I at no time did an apt-upgrade (my nvidia driver won't attach to dkms, so no upgrade as it would cause problems)

The problem is replicatable, and when it doesn't happen is also.

In simple terms, issuing a mount to the server after a period of time (couple of mins) causes the problem, a second mount shortly after allows the mount.

The first mount doesn't need to be valid, using an invalid user or using an invalid password cause different errors on the server (NT invalid user, NT invalid password) but a second mount with correct details allows the mount to succeed. A first mount with the correct details fails, a second mount with the correct details works.

If a mount is performed which fails the sid-to-uid and then a longer duration (over about 2 mins) is tried this also fails with the same sid-to-pid error... a third mount within the 2 min window then succeeds.

After looking around, I found reference to wbinfo... I don't know if samba uses the same code (directly or copy, or at compile/runtime) as wbinfo to perform its tests, but...

wbinfo name-to-sid produces a sid.
wbinfo sid-to-name produces a name.
wbinfo sid-to-uid returns -1
wbinfo sid-to-unix-ids failed.

wbinfo --allocate-uid fails to find dc (I think its dc, or something similar, my system is working due to killing winbind, I am loathed to restart it causing network problems again.)
I vaguely recall wbinfo sid-to-fullname also failed (but cannot confirm)

idmap.ldb does not exist on my system.

It should be noted that I only used the original way of adding users. I did not use samba-tool but rather smbpasswd, and my config file also has:-
server role = standalone
security = user
encrypt passwords = true
passdb backend = tdbsam
obey pam restrictions = yes
unix password sync = yes.

My smb.conf was copied from a failed debian wheezy (apt-get autoremove removed my desktop, typical "dumb" end user mistake) over to a clean debian jessie and the only change I made was the inculsion of server role after the problems started, in my hunt for a solution (as previously stated no problems where found initially, but after "something" the double login issue started). I also added a log level = 3; hence knowing about the sid-to-uid fail.

I can't really help further as my system is "working" but thought it might help to provide the details above.
Comment 17 Julien Wallior 2014-09-09 16:25:28 UTC
Created attachment 10272 [details]
Patch to handle -1 UID from AD
Comment 18 Julien Wallior 2014-09-09 16:31:17 UTC
Comment on attachment 10272 [details]
Patch to handle -1 UID from AD

We've had the same issue in our environment. Since we don't have the UID defined in Active Directory, when winbind queries sid-to-uid, AD returns -1. 

We fixed it by handling the -1 return value in the winbind client. I picked WBC_ERR_INVALID_RESPONSE, not sure it's the best value. This patch fixes wbinfo --sid-to-uid and fixes the original error of: 
  SID S-1-5-21-250094908-2656553239-839613848-1001 -> getpwuid(4294967295) failed
Comment 19 hn 2014-09-29 15:26:58 UTC
I see the same behaviour with my Samba 4.1.11 installation.
I am using a Samba group that maps to a linux local group. The samba users (which are local users too) are members of that group. Share security is defined by the group membership.

First attempt to mount (Linux or Android) or browse (Windows, Mac) the shares gives the following client error:

mount error(5): Input/output error
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)

An immediate retry works and keeps working for a minute or so.

A fix for the winbind Client (as suggested) is no solution to the failure that happens on the server side, as Windows and Mac clients show the same behaviour as Linux winbind clients.

The same exact configuration works with Samba 3.6 in Debian Wheezy, so I will stay on that version until this gets sorted out.

I can post my smb.conf later when I'm home.
Comment 20 hn 2014-09-30 12:45:43 UTC
#======================= Global Settings =======================

[global]
        log file = /var/log/samba/log.%m
        passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* .
        obey pam restrictions = yes
        map to guest = bad user
        encrypt passwords = true
        passdb backend = tdbsam
        passwd program = /usr/bin/passwd %u
        netbios name = HAUS-SERVER
        unix password sync = yes
        workgroup = SCHWABSTRASSE
        os level = 20
        auto services = global
        security = user
        syslog = 0
        create mode = 775
        preferred master = yes
        usershare allow guests = yes
        panic action = /usr/share/samba/panic-action %d
        max log size = 1000
        directory mode = 775
        pam password change = yes
        #server role = standalone server


#======================= Share Definitions =======================

[homes]
        create mask = 0700
        directory mask = 0700
        browseable = no
        comment = Home Directories
        valid users = %S
        write list = %S

[printers]
   comment = All Printers
   browseable = no
   path = /var/spool/samba
   printable = yes
   guest ok = no
   read only = yes
   create mask = 0700

# Windows clients look for this share name as a source of downloadable
# printer drivers
[print$]
   comment = Printer Drivers
   path = /var/lib/samba/printers
   browseable = yes
   read only = yes
   guest ok = no

# Uncomment to allow remote administration of Windows print drivers.
# You may need to replace 'lpadmin' with the name of the group your
# admin users are members of.
# Please note that you also need to set appropriate Unix permissions
# to the drivers directory for these users to have write rights in it
;   write list = root, @lpadmin


[bilder]
        force user = www-data
        comment = Bilder
        path = /storage/bilder
        write list = @users
        force group = users
        available = yes

and many more shares like this one…
Comment 21 Julien Wallior 2014-10-01 22:20:24 UTC
(In reply to comment #19)
> A fix for the winbind Client (as suggested) is no solution to the failure that
> happens on the server side, as Windows and Mac clients show the same behaviour
> as Linux winbind clients.
Did you try the patch? When I say winbind client, I mean the client code that does sid-to-uid mapping for samba (ie runs on the samba server).
Comment 22 hn 2014-10-02 07:12:59 UTC
(In reply to comment #21)
> (In reply to comment #19)
> > A fix for the winbind Client (as suggested) is no solution to the failure that
> > happens on the server side, as Windows and Mac clients show the same behaviour
> > as Linux winbind clients.
> Did you try the patch? When I say winbind client, I mean the client code that
> does sid-to-uid mapping for samba (ie runs on the samba server).

I see, it looks like we had a misunderstanding. I can try and compile it against 4.1.11 and rebuild the Debian package.
Comment 23 hn 2014-10-02 07:43:56 UTC
(In reply to comment #22)
> (In reply to comment #21)
> > (In reply to comment #19)
> > > A fix for the winbind Client (as suggested) is no solution to the failure that
> > > happens on the server side, as Windows and Mac clients show the same behaviour
> > > as Linux winbind clients.
> > Did you try the patch? When I say winbind client, I mean the client code that
> > does sid-to-uid mapping for samba (ie runs on the samba server).
> 
> I see, it looks like we had a misunderstanding. I can try and compile it
> against 4.1.11 and rebuild the Debian package.

I rebuilt the Debian package and can definitely say, that the proposed patch works. Result are good, no more I/O error messages when mounting a share. Tested against Debian Linux, Windows 8.1, OS X 10.9.5 and an Android Tablet.

I'm sorry for claiming that your patch is no solution, it indeed is! Mea culpa.
Comment 24 zhang.zujian 2014-10-10 07:16:58 UTC
Thanks a lot. The patch works.
Comment 25 Jeremy Allison 2015-01-09 00:47:22 UTC
(In reply to Julien Wallior from comment #18)

I don't think handling this in the client
libraries is the right thing to do.

In the bug report you say:

"We've had the same issue in our environment. Since we don't
have the UID defined in Active Directory, when winbind queries
sid-to-uid, AD returns -1."

What I'd like to see is the debug path when querying
sid-to-uid, and what calls are going on the wire to the
DC, and what is coming back.

Jeremy.
Comment 26 Julien Wallior 2015-01-09 01:48:49 UTC
(In reply to Jeremy Allison from comment #25)
Thanks for the reply. I tried to run winbindd in debug mode, I get this:

accepted socket 18
[18244]: request interface version
[18244]: request location of privileged pipe
sid to uid S-1-5-21-40587175-1795232745-1959552931-41423
child daemon request 59
rpc_api_pipe: host ***
rpc_write_send: data_to_write: 160
rpc_read_send: data_to_read: 216
../librpc/rpc/dcerpc_util.c:140: auth_pad_length 0
Finished processing child request 59

If I try to look at the wire, the traffic to the DC is encrypted. I'm not aware of a way to decrypt it.

If the UID is not set in AD, what other behavior would you expect?
Comment 27 Roger Howard 2015-05-08 16:14:51 UTC
I recently upgraded a machine to Debian 8, which provides Samba version 4.1.17 and I am having exactly the same problem as described in this bug report. It occurs attempting to connect from a Windows 7 machine and also on attempts to connect from the machine itself using smbclient. The file serving problem goes away if I stop the winbindd daemon, but it still is not possible to access a printer share from the Windows machine.

I note that the problem appears to be a call to getpwuid with user -1; that is assuming a 4 byte word, and user nobody has a short integer -1 uid in the passwd file. Would it be possible to use smbusers to make nobody equivalent to root?
Comment 28 Dion Kant 2015-06-09 16:22:14 UTC
Dear all,

It seems, I also have this issue. Recently we moved from Samba 3 to the 4.1.17+dfsg-2 running on Debian 8.

My configuration comprises a Samba server with

server role = classic primary domain controller

and a member server with 

server role = member server

Both servers are configured with ldapsam. I have about 500 users and the "-> getpwuid(1219) failed" messages are found for two situations:

1. If a new user is created this error occurs, this one is repeatable;
2. Randomly, when some existing user tries to login, the account seems to be trapped in this error state.

While an account is in this error state, it can do an authenticated list (smbclient -L) against the PDC but not against the member server. 

Restarting winbind on the member server seems to get the trapped accounts out of the error state.

You guys are even talking about stopping winbind. I find the winbind documentation hard to understand. Winbind does something since I see it creates log entries. 

I would like to fix this issue soon. However, this happens in a production environment and therefor I am thinking of:

1. adding another member server for testing purposes;
2. see if I can repeat the issue on that server;
3. willing to go down the path suggested by Jeremey.
Comment 29 Dmitry 2015-06-16 21:57:47 UTC
I experience the same problem as described. However I have slightly different effect: first login attempt always failed, and second worked fine. Log:

[2015/06/16 23:19:32.246862,  4] ../source3/auth/pampass.c:586(smb_pam_account)
  smb_pam_account: PAM: Account OK for User: dmitry
[2015/06/16 23:19:32.247288,  4] ../source3/auth/pampass.c:465(smb_pam_end)
  smb_pam_end: PAM: PAM_END OK.
[2015/06/16 23:19:32.247369,  2] ../source3/auth/auth.c:278(auth_check_ntlm_password)
  check_ntlm_password:  authentication for user [dmitry] -> [dmitry] -> [dmitry] succeeded
[2015/06/16 23:19:32.700272,  1] ../source3/auth/token_util.c:430(add_local_groups)
  SID S-1-5-21-1901777980-1711874441-2626026913-3000 -> getpwuid(4294967295) failed
[2015/06/16 23:19:32.700451,  3] ../source3/auth/token_util.c:316(create_local_nt_token_from_info3)
  Failed to finalize nt token
[2015/06/16 23:19:32.700507,  1] ../source3/smbd/sesssetup.c:276(reply_sesssetup_and_X_spnego)
  Failed to generate session_info (user and group token) for session setup: NT_STATUS_UNSUCCESSFUL
[2015/06/16 23:19:32.700714,  3] ../source3/smbd/error.c:82(error_packet_set)
  NT error packet at ../source3/smbd/sesssetup.c(279) cmd=115 (SMBsesssetupX) NT_STATUS_UNSUCCESSFUL

The solution was (as suggested above) to stop winbind.

Samba v4.1.17 on Debian 8.
Comment 30 mcoutinho 2015-06-17 09:03:07 UTC
Hi,

I'm using Version 4.1.18-SerNet-RedHat-11.el6.

the command: "wbinfo -i user" returned something like this:
"user:*:4294967295:4294967295:FirstName LastName:/home/user:/bin/false"

Note the IDs... 4294967295:4294967295 same as "getpwuid(4294967295) failed" 

I solved the problem when i add the following to my smb.conf:

  idmap config * : range = 1000-1999999

Just in case i restarted the winbind daemon and after a while the problem disapeard.
 
the command "wbinfo -i user" returned:
"user:*:10001:10001:FirstName LastName:/home/user:/bin/false"

And wbinfo sid_to_uid is NOT ending up with -1, but with the correct uid

Thanks alot, this bug report saved my life...
M.Coutinho
Comment 31 hn 2015-06-17 10:35:14 UTC
(In reply to mcoutinho from comment #30)

Thanks for this observation, but... there is always a but ;-)
Does this also apply for non AD installations?
Comment 32 mcoutinho 2015-06-17 10:56:30 UTC
I'm using Samba as a Windows NT PDC and BDC, so yes.
Comment 33 Frank Rysanek 2015-11-10 12:11:51 UTC
(In reply to mcoutinho from comment #30)

Same symptoms observed by me :-) Relevant versions:

Debian 8 Jessie - about 8.2 level, recently upgraded from Squeeze via Wheezy. 
Samba and winbindd 4.1.17+dfsg-2 . Two such servers actually, with somewhat different upgrade histories and client groups. (On one of the servers, parts of the Samba config file possibly date back to Fedora 5 or some such.)
The whole "windows network" runs in "workgroup mode" here = no domain, no AD.
(I'm starting to wrap my mind around potential tighter centralization.)

The "failed first logon attempt" symptom was especially prominent in DOS and XP.
In DOS it would break logon scripts.

Problem solved by adding the "idmap config * ..." line in the middle of smb.conf, just above the share definitions - followed by a restart of Samba and Winbindd. Voila, problem gone. Flawless operation from DOS through XP / 7 / 8 / 2012 R2. Network drive mapping and printing both work just fine (LPRng as a backend - had to add :jobs_done=0: to printcap entries to avoid some marginal errors). Seems to me that re-attaching network shares after waking up the client from sleep is faster now - but this may be down to an upgraded base OS and down to a migration of the underlying disk volumes to EXT4 and younger drives (= also defragmented).

Thanks everybody for your reporting and debugging.
Comment 34 Neil 2015-12-07 15:42:33 UTC
+1 Please fix.
Comment 35 krichter 2016-01-12 22:25:58 UTC
This seems to be fixed in ldb-1.1.25-258-g9790abd when installed with `./configure --with-systemd --prefix=/usr --enable-fhs --sysconfdir=/etc --localstatedir=/var && make && sudo make install` and default settings on the Ubuntu 15.10 amd64 server side (consider using `checkinstall` when testing this).
Comment 36 Mihai 2016-03-17 17:15:32 UTC
Same problem, on CentOS 7 after i upgraded yesterday.
Stopping winbind worked for me ! :)

 Finding user mcindea
[2016/03/17 13:09:48.382984,  5, pid=5150, effective(0, 0), real(0, 0)] ../source3/lib/username.c:120(Get_Pwnam_internals)
  Trying _Get_Pwnam(), username as lowercase is mcindea
[2016/03/17 13:09:48.382991,  5, pid=5150, effective(0, 0), real(0, 0)] ../source3/lib/username.c:159(Get_Pwnam_internals)
  Get_Pwnam_internals did find user [mcindea]!
[2016/03/17 13:09:48.383159, 10, pid=5150, effective(0, 0), real(0, 0)] ../source3/auth/token_util.c:224(create_local_nt_token_from_info3)
  Create local NT token for mcindea
[2016/03/17 13:09:48.383201, 10, pid=5150, effective(0, 0), real(0, 0)] ../source3/lib/idmap_cache.c:56(idmap_cache_find_sid2unixid)
  Parsing value for key [IDMAP/SID2XID/S-1-5-21-2053067395-845162621-1245804459-421523]: value=[2000:U]
[2016/03/17 13:09:48.383211, 10, pid=5150, effective(0, 0), real(0, 0)] ../source3/lib/idmap_cache.c:75(idmap_cache_find_sid2unixid)
  Parsing value for key [IDMAP/SID2XID/S-1-5-21-2053067395-845162621-1245804459-421523]: id=[2000], endptr=[:U]
[2016/03/17 13:09:48.383218, 10, pid=5150, effective(0, 0), real(0, 0)] ../source3/passdb/lookup_sid.c:1425(sid_to_uid)
  sid S-1-5-21-2053067395-845162621-1245804459-421523 -> uid 2000
[2016/03/17 13:09:48.383288,  1, pid=5150, effective(0, 0), real(0, 0)] ../source3/auth/token_util.c:430(add_local_groups)
  SID S-1-5-21-2053067395-845162621-1245804459-421523 -> getpwuid(2000) failed
[2016/03/17 13:09:48.383317,  3, pid=5150, effective(0, 0), real(0, 0)] ../source3/auth/token_util.c:316(create_local_nt_token_from_info3)
  Failed to finalize nt token
[2016/03/17 13:09:48.383326, 10, pid=5150, effective(0, 0), real(0, 0), class=auth] ../source3/auth/user_krb5.c:274(make_session_info_krb5)
  failed to create local token: NT_STATUS_UNSUCCESSFUL
[2016/03/17 13:09:48.383333,  1, pid=5150, effective(0, 0), real(0, 0)] ../source3/auth/auth_generic.c:127(auth3_generate_session_info_pac)
  Failed to map kerberos pac to server info (NT_STATUS_UNSUCCESSFUL)
[2016/03/17 13:09:48.383355, 10, pid=5150, effective(0, 0), real(0, 0)] ../source3/smbd/smb2_server.c:2696(smbd_smb2_request_error_ex)
  smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_ACCESS_DENIED] || at ../source3/smbd/smb2_sesssetup.c:130
Comment 37 EN 2016-03-22 02:23:12 UTC
Same problem, on debian 3.16.0-4-amd64 running Samba 4.1.17-Debian.
Added ipmap config ... entry (as per above), restarted samba and winbind.
Problem solved.

Thanks for the interim fix.  Really appreciated !!

Sure would be great to have this corrected, it has cost us about a man-day to get a Xerox copier to communicate to a Samba share, which normally would take 5 mins.
Comment 38 Louis 2016-08-09 14:20:50 UTC
in addtion. 

i notice the following. 


>> guest account = nobody 
>> 4294967295, which doesn't exist.


4294967295 in 32bit = user nobody. 

so there is an authentication done here. 

mount -t cifs //192.168.209.2/Movies movies/ -o uid=nobody,forceuid,gid=nogroup,forcegid,username=ingress,password=SECRET,noperm

but with user "nobody" as uid and with username..?? 

Looks to me an incorrect configuration.
Comment 39 Tamás 2016-08-29 09:06:03 UTC
Hi,

I'm running Samba version 4.3.9-Ubuntu, config:

[global]
domain master = yes
local master = yes
preferred master = yes
os level = 65

[munka]
   force user = nagyt
   valid users = nagyt
   comment = munka
   path = /mnt/md0/munka
   browsable = yes
   writable = yes
   printable = no
   create mode = 0600
   force create mode = 0600
   directory mode = 0700
   force directory mode = 0700
   oplocks = no
   level2 oplocks = no
   kernel oplocks = yes


Clients are windows only and only some of them have the same problem when saving large files (typically large files opened in photo manupulation sw and saved here). When problem happens the visible problem is that the application on client is freezeing when save is clicked, so nothing happens.

I checked logs and found this:
[2016/08/29 09:11:13.598281,  2] ../source3/auth/auth.c:305(auth_check_ntlm_password)
  check_ntlm_password:  authentication for user [nagyt] -> [nagyt] -> [nagyt] succeeded
[2016/08/29 09:11:13.598809,  1] ../source3/auth/token_util.c:430(add_local_groups)
  SID S-1-5-21-3930324136-894914035-1992958734-1001 -> getpwuid(4294967295) failed
[2016/08/29 09:11:13.598821,  3] ../source3/auth/token_util.c:316(create_local_nt_token_from_info3)
  Failed to finalize nt token
[2016/08/29 09:11:13.598833,  3] ../auth/ntlmssp/ntlmssp_sign.c:509(ntlmssp_sign_reset)
  NTLMSSP Sign/Seal - Initialising with flags:
[2016/08/29 09:11:13.598837,  3] ../auth/ntlmssp/ntlmssp_util.c:69(debug_ntlmssp_flags)
  Got NTLMSSP neg_flags=0xe2088215
[2016/08/29 09:11:13.598855,  3] ../auth/ntlmssp/ntlmssp_sign.c:509(ntlmssp_sign_reset)
  NTLMSSP Sign/Seal - Initialising with flags:
[2016/08/29 09:11:13.598863,  3] ../auth/ntlmssp/ntlmssp_util.c:69(debug_ntlmssp_flags)
  Got NTLMSSP neg_flags=0xe2088215
[2016/08/29 09:11:13.598977,  3] ../source3/auth/token_util.c:547(finalize_local_nt_token)
  Failed to fetch domain sid for WORKGROUP
[2016/08/29 09:11:13.598989,  3] ../source3/auth/token_util.c:579(finalize_local_nt_token)
  Failed to fetch domain sid for WORKGROUP
[2016/08/29 09:11:13.599990,  3] ../source3/smbd/password.c:144(register_homes_share)
  Adding homes service for user 'nagyt' using home directory: '/home/nagyt'


This happens at 9:11 and the problem was at 9:40 when file saving was tried. It happens randomly.

Question #1: is it possible user can open the file at 9:11 while the auth seems to be unsuccesful?
Question #2: how can I fix it?
Comment 40 Tamás 2016-08-29 11:28:25 UTC
One more thing: it seems the problem happens only with windows 10 clients, for which clients the protocol version is:
Unknown (0x0311)

(See bug https://bugzilla.samba.org/show_bug.cgi?id=11472 which describes protocol names printed incorrectly by smbstatus)

Regards,
/Tamás
Comment 41 Michael Adam 2017-02-22 00:04:27 UTC
From comment #30 and the fact that the proposed patch seems to fix the issue, this looks like winbindd is not reacting fully appropriately to a non-configured id-mapping.

It seems the issue has been fixed differently in master as a side-effect of

https://git.samba.org/?p=samba.git;a=commitdiff;h=fbbe01782096c8a3c47e1d5f4f5e01de047bf9e2

and the corresponding GID-patch

https://git.samba.org/?p=samba.git;a=commitdiff;h=122b1a3650fe9fd6d7cfd8f5497ccb09c6c99406
Comment 42 Julien Wallior 2017-02-27 20:25:35 UTC
(In reply to Michael Adam from comment #41)
I checked and the issue I was running into was fixed by: https://git.samba.org/?p=samba.git;a=commit;h=4d5680e9ae531c6dc4d0a6687abe6293b5d4f4f2

My patch is not required any more.
Comment 43 Hemanth 2017-05-04 23:17:19 UTC
Hello All,

we are using samba 4.3.11 stack. We have started seeing the issue with trusted domain users. Unlike other case mentioned here, IDMAP(autorid) is able to produce a UID within the configured range. But the User lookup seems to be failing.

[2017/05/04 16:05:43.421159, 10, pid=103418, effective(0, 0), real(0, 0), class=winbind] ../source3/winbindd/winbindd.c:731(process_request)
  process_request: Handling async request 155850:GETPWUID
[2017/05/04 16:05:43.421187,  3, pid=103418, effective(0, 0), real(0, 0), class=winbind] ../source3/winbindd/winbindd_getpwuid.c:47(winbindd_getpwuid_send)
  getpwuid 6101174
[2017/05/04 16:05:43.421212, 10, pid=103418, effective(0, 0), real(0, 0), class=winbind] ../source3/winbindd/wb_uid2sid.c:52(wb_uid2sid_send)
  idmap_cache_find_uid2sid found 6101174
[2017/05/04 16:05:43.421234,  1, pid=103418, effective(0, 0), real(0, 0)] ../librpc/ndr/ndr.c:439(ndr_print_function_debug)
       wbint_QueryUser: struct wbint_QueryUser
          in: struct wbint_QueryUser
              sid                      : *
                  sid                      : S-1-5-21-2654446183-594484512-649701735-1174
[2017/05/04 16:05:43.421280,  1, pid=103418, effective(0, 0), real(0, 0)] ../librpc/ndr/ndr.c:439(ndr_print_function_debug)
       wbint_QueryUser: struct wbint_QueryUser
          out: struct wbint_QueryUser
              info                     : *
                  info: struct wbint_userinfo
                      acct_name                : NULL
                      full_name                : NULL
                      homedir                  : NULL
                      shell                    : NULL
                      primary_gid              : 0x0000000000000000 (0)
                      user_sid                 : S-0-0
                      group_sid                : S-0-0
              result                   : NT_STATUS_UNSUCCESSFUL


I haven't any of the listed patches here yet. Since the UID mapping is fine, not sure if these patches are appropriate to fix the issue. I also need to understand the code around the User lookup part.
Comment 44 Björn Jacke 2021-06-28 17:30:58 UTC
resolving as fixed accoring to comment 41 and comment 42