Bug 8872 - Concurrent connections to sysvol fail in s3fs Configuration
Summary: Concurrent connections to sysvol fail in s3fs Configuration
Status: RESOLVED FIXED
Alias: None
Product: Samba 4.0
Classification: Unclassified
Component: DCE-RPCs and pipes
Version: 4.0 alpha 18
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assignee: Andrew Bartlett
QA Contact: samba4-qa@samba.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-04-17 17:09 UTC by Arvid Requate
Modified: 2012-05-10 07:48 UTC

See Also:


Attachments
smb.conf, level 10 logs and the test script (750.45 KB, application/x-gzip)
2012-04-17 17:11 UTC, Arvid Requate
Test script for testenv (2.13 KB, text/plain)
2012-04-21 11:49 UTC, Andrew Bartlett
Patch to use generate_random() not random() for unique ID values (1.04 KB, patch)
2012-04-25 08:14 UTC, Andrew Bartlett

Description Arvid Requate 2012-04-17 17:09:28 UTC
Samba4 in the s3fs configuration currently has issues handling concurrent login attempts, e.g. against the sysvol share. A shell script that runs 10 backgrounded smbclient processes usually shows at least two NT_STATUS_IO_TIMEOUT failures. Adding the "-k no" switch increases the failure rate significantly: only the first smbclient connection succeeds in that case.

root@dc:/home/sernet# ./client-login.sh 
OK
OK
OK
OK
OK
OK
OK
OK
Connection to \\dc.samba.private\sysvol failed - NT_STATUS_IO_TIMEOUT
Connection to \\dc.samba.private\sysvol failed - NT_STATUS_IO_TIMEOUT
Again without trying kerberos:
root@dc:/home/sernet# OK
Connection to \\dc.samba.private\sysvol failed - NT_STATUS_IO_TIMEOUT
Connection to \\dc.samba.private\sysvol failed - NT_STATUS_IO_TIMEOUT
Connection to \\dc.samba.private\sysvol failed - NT_STATUS_IO_TIMEOUT
Connection to \\dc.samba.private\sysvol failed - NT_STATUS_IO_TIMEOUT
Connection to \\dc.samba.private\sysvol failed - NT_STATUS_IO_TIMEOUT
Connection to \\dc.samba.private\sysvol failed - NT_STATUS_IO_TIMEOUT
Connection to \\dc.samba.private\sysvol failed - NT_STATUS_IO_TIMEOUT
Connection to \\dc.samba.private\sysvol failed - NT_STATUS_IO_TIMEOUT
Connection to \\dc.samba.private\sysvol failed - NT_STATUS_IO_TIMEOUT
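
The attached test script is not reproduced in this report. As a rough sketch of the same idea (start several clients at once and count how many fail), something like the following could be used. The function name run_concurrently is made up for this illustration, and the command to run is left to the caller, so no smbclient options are assumed here.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Samba): start `count` copies of the
 * command in argv at the same time, wait for all of them, and return how
 * many did not exit with status 0. Copies that could not be started are
 * counted as failures too.
 */
int run_concurrently(int count, char *const argv[])
{
    int started = 0;
    int failures = 0;

    assert(argv != NULL && argv[0] != NULL);

    for (int i = 0; i < count; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            break;
        }
        if (pid == 0) {
            /* Child: become the client command. */
            execvp(argv[0], argv);
            _exit(127); /* only reached if exec failed */
        }
        started++;
    }

    /* Parent: reap every child that was started and inspect its status. */
    for (int i = 0; i < started; i++) {
        int status = 0;
        if (wait(&status) < 0 || !WIFEXITED(status) ||
            WEXITSTATUS(status) != 0) {
            failures++;
        }
    }

    return failures + (count - started);
}
```

Called with count set to 10 and argv holding the smbclient command line used in the test script, the return value would correspond to the number of "failed" lines in the output above.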
Comment 1 Arvid Requate 2012-04-17 17:11:55 UTC
Created attachment 7460 [details]
smb.conf, level 10 logs and the test script

It looks a bit like winbind is having a problem answering concurrent wbc_sids_to_xids calls:

[2012/04/17 18:54:05.892046,  5, pid=2385, effective(0, 0), real(0, 0)] ../source4/libcli/wbclient/wbclient.c:72(wbc_sids_to_xids_send)
  wbc_sids_to_xids called
[...]
[2012/04/17 18:54:15.899285, 10, pid=2385, effective(0, 0), real(0, 0)] ../source3/lib/events.c:216(run_events_poll)
  Running timed event "tevent_req_timedout" 0x8660490
[2012/04/17 18:54:15.899431,  5, pid=2385, effective(0, 0), real(0, 0)] ../source4/libcli/wbclient/wbclient.c:118(wbc_sids_to_xids_recv)
  wbc_sids_to_xids_recv called
[2012/04/17 18:54:15.899483,  1, pid=2385, effective(0, 0), real(0, 0)] ../source3/smbd/sesssetup.c:264(reply_sesssetup_and_X_spnego)
  Failed to generate session_info (user and group token) for session setup: NT_STATUS_IO_TIMEOUT
Comment 2 Andrew Bartlett 2012-04-21 11:49:01 UTC
Created attachment 7472 [details]
Test script for testenv

This reproduces trivially with

SELFTEST_TESTENV=plugin_s4_dc make testenv

and the attached (modified) test script.
Comment 3 Andrew Bartlett 2012-04-25 08:14:42 UTC
Created attachment 7501 [details]
Patch to use generate_random() not random() for unique ID values

This fixes the issue, and is in autobuild.
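
The report does not spell out why random() produced clashing unique IDs. One plausible mechanism, stated here as an assumption, is that processes created with fork() inherit an identical libc generator state, so each of them gets the same "random" value. The sketch below shows that effect in isolation. id_from_urandom is a hypothetical stand-in for generate_random() that reads /dev/urandom; it is not Samba's implementation.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* ID from the libc PRNG; its state is copied into every fork()ed child. */
uint32_t id_from_random(void)
{
    return (uint32_t)random();
}

/* Hypothetical stand-in for generate_random(): ask the kernel for bytes. */
uint32_t id_from_urandom(void)
{
    uint32_t v = 0;
    int fd = open("/dev/urandom", O_RDONLY);
    assert(fd >= 0);
    ssize_t n = read(fd, &v, sizeof(v));
    assert(n == (ssize_t)sizeof(v));
    (void)n;
    close(fd);
    return v;
}

/*
 * Fork n children, let each produce one ID with gen(), and collect the
 * IDs in out[] through a pipe. Returns 0 if all n IDs arrived, else -1.
 */
int collect_child_ids(uint32_t (*gen)(void), uint32_t out[], int n)
{
    int fds[2];
    int got = 0;

    if (pipe(fds) != 0) {
        return -1;
    }
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            break;
        }
        if (pid == 0) {
            uint32_t id = gen();
            close(fds[0]);
            ssize_t w = write(fds[1], &id, sizeof(id));
            _exit(w == (ssize_t)sizeof(id) ? 0 : 1);
        }
    }
    close(fds[1]);

    /* Each child writes exactly 4 bytes, so read one ID at a time. */
    while (got < n &&
           read(fds[0], &out[got], sizeof(out[got])) ==
               (ssize_t)sizeof(out[got])) {
        got++;
    }
    close(fds[0]);
    while (wait(NULL) > 0) {
        /* reap all children */
    }
    return got == n ? 0 : -1;
}
```

With id_from_random every child reports the same ID, because the parent never advances the generator between forks. With id_from_urandom the IDs differ from child to child, which is the property a unique ID needs.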
Comment 4 Stefan Metzmacher 2012-04-25 09:36:57 UTC
(In reply to comment #3)
> Created attachment 7501 [details]
> Patch to use generate_random() not random() for unique ID values
> 
> This fixes the issue, and is in autobuild.

I think we should always fill in

id.pid = getpid();
id.task_id = generate_random();
id.vnn = NONCLUSTER_VNN;
id.unique_id = generate_random();
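
As a self-contained illustration of this suggestion, the sketch below fills every field in one place. The struct layout, the NONCLUSTER_VNN value and the fill_random helper are assumptions made for this example only; Samba's own struct server_id and generate_random() are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Assumption: marker for "not in a cluster"; value chosen for this sketch. */
#define NONCLUSTER_VNN 0xFFFFFFFFu

/* Hypothetical, simplified record using the field names from the comment. */
struct example_server_id {
    uint64_t pid;
    uint32_t task_id;
    uint32_t vnn;
    uint64_t unique_id;
};

/* Stand-in for generate_random(): fill buf with bytes from /dev/urandom. */
static void fill_random(void *buf, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    assert(f != NULL);
    size_t got = fread(buf, 1, len, f);
    assert(got == len);
    (void)got;
    fclose(f);
}

/* Build an ID with every field set, so no caller can forget one. */
struct example_server_id make_example_server_id(void)
{
    struct example_server_id id;

    id.pid = (uint64_t)getpid();
    fill_random(&id.task_id, sizeof(id.task_id));
    id.vnn = NONCLUSTER_VNN;
    fill_random(&id.unique_id, sizeof(id.unique_id));
    return id;
}
```

Keeping the assignments in a single constructor-style function is one way to make sure all four fields are always filled in together.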
Comment 5 Matthias Dieter Wallnöfer 2012-05-10 07:48:06 UTC
f10c63810077a6759a9df4e9c653066f9f355d96 seems to have fixed this.