Bug 10804 - Problem with aio read size and fcntl_getlock
Summary: Problem with aio read size and fcntl_getlock
Status: NEW
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: 4.1.11
Hardware: x64 Linux
: P5 major (vote)
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-09-09 10:39 UTC by maurer
Modified: 2017-02-21 21:04 UTC (History)
3 users (show)

See Also:


Attachments
strace of smbd (872.47 KB, application/octet-stream)
2014-09-09 12:07 UTC, maurer
no flags Details
wireshark capture of host (53 bytes, text/plain)
2014-09-09 12:23 UTC, maurer
no flags Details
smb.conf (with no domaininfo) (1.40 KB, text/plain)
2014-09-09 12:55 UTC, maurer
no flags Details
strace with sendfile off and aio read size = 1 (450.56 KB, application/x-gzip)
2014-09-09 13:15 UTC, maurer
no flags Details
diff for 4.1.x (1.93 KB, patch)
2014-09-09 17:02 UTC, Jeremy Allison
no flags Details
wireshark capture of host (73 bytes, text/plain)
2014-09-10 08:03 UTC, maurer
no flags Details
strace with sendfile off and aio read size = 1 (74 bytes, text/plain)
2014-09-10 08:03 UTC, maurer
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description maurer 2014-09-09 10:39:58 UTC
Hi

we had a strange problem with  aio read size and fcntl_getlock.
If a Win7 Clients reads a big HDF files, the client
says "Error using hdf5lib2. The HDF5 library encountered an error during execution of the "H5Dread" function, "can't read data".

I took a network trace and and the communication completly stalls for about 30 seconds before the error above.

I tried to capture a debug 10 trace but with debug 10  enabled it works

With debug level 5 I was able to get the following trace and it seems that the process is waiting for an
fcntl_getlock

Skipping user change - already user
[2014/09/09 11:13:31.121911,  3] ../source3/lib/util.c:1177(fcntl_getlock)
  fcntl_getlock: fd 40 is returned info 2 pid 0
[2014/09/09 11:13:31.121957,  4] ../source3/smbd/uid.c:384(change_to_user)
  Skipping user change - already user
[2014/09/09 11:13:31.121997,  3] ../source3/lib/util.c:1177(fcntl_getlock)
  fcntl_getlock: fd 40 is returned info 2 pid 0
[2014/09/09 11:13:31.122042,  4] ../source3/smbd/uid.c:384(change_to_user)
  Skipping user change - already user
[2014/09/09 11:13:31.122081,  3] ../source3/lib/util.c:1177(fcntl_getlock)
  fcntl_getlock: fd 40 is returned info 2 pid 0
[2014/09/09 11:13:31.122127,  4] ../source3/smbd/uid.c:384(change_to_user)
  Skipping user change - already user
[2014/09/09 11:13:31.122166,  3] ../source3/lib/util.c:1177(fcntl_getlock)
  fcntl_getlock: fd 40 is returned info 2 pid 0
[2014/09/09 11:13:31.122217,  4] ../source3/smbd/uid.c:384(change_to_user)
  Skipping user change - already user
[2014/09/09 11:13:31.122256,  3] ../source3/lib/util.c:1177(fcntl_getlock)
  fcntl_getlock: fd 40 is returned info 2 pid 0

[2014/09/09 11:13:50.257918,  5] ../source3/smbd/process.c:2691(housekeeping_fn)
  housekeeping
[2014/09/09 11:13:50.258019,  4] ../source3/smbd/sec_ctx.c:316(set_sec_ctx)
  setting sec ctx (0, 0) - sec_ctx_stack_ndx = 0
[2014/09/09 11:13:50.258044,  5] ../libcli/security/security_token.c:53(security_token_debug)
  Security token: (NULL)

It turns out, that the read succeeds, if I deactivate
aio read size
and it fails, if i set
aio read size = 16384

Regards

Hansjörg
Comment 1 Volker Lendecke 2014-09-09 10:42:32 UTC
Can you provide a network trace and an strace -ttT -f of the affected smbd? Alternatively, a debug level 10 log might also provide insights.

Thanks
Comment 2 maurer 2014-09-09 12:06:55 UTC
Hi

here you can find the wireshark and strace files
The error occours at about packet 38587 (wireshark) and line 66047 (strace)

I tried to get a debug 10, but the error did not occur with high debug levels
(timing issue?)
The highest debug level we could reproduce this error was level 5.
Might activating a certain debug class help?

Regards

hansjörg
Comment 3 maurer 2014-09-09 12:07:45 UTC
Created attachment 10268 [details]
strace of smbd
Comment 4 Volker Lendecke 2014-09-09 12:19:50 UTC
(In reply to comment #3)
> Created attachment 10268 [details]
> strace of smbd

No wireshark trace yet. But I see sendfile calls. I'm not sure sendfile and aio play 100% well together (they should!). Can you try disabling sendfile and see if it helps?
Comment 5 maurer 2014-09-09 12:23:18 UTC
Created attachment 10269 [details]
wireshark capture of host

file will be available for one week.
If you want me to provide it again, let me know
Comment 6 maurer 2014-09-09 12:41:34 UTC
Hi

you are right

disabling "use sendfile" and enabling "aio read size" works.

#        use sendfile = yes
        use sendfile = no
        aio read size = 16384

If I reenable  "use sendfile" the read fails.
        use sendfile = yes
#        use sendfile = no
        aio read size = 16384

Regrads

Hansjörg
Comment 7 Volker Lendecke 2014-09-09 12:49:17 UTC
(In reply to comment #6)
> Hi
> 
> you are right
> 
> disabling "use sendfile" and enabling "aio read size" works.
> 
> #        use sendfile = yes
>         use sendfile = no
>         aio read size = 16384
> 
> If I reenable  "use sendfile" the read fails.
>         use sendfile = yes
> #        use sendfile = no
>         aio read size = 16384
> 
> Regrads
> 
> Hansjörg

Can you post the rest of your smb.conf? What client is this? To me it seems that we fill the TCP buffer pretty quickly. Also, I would recommend "aio read size = 1". Going out for 1 byte to disk should not be any different from going out for 16k overhead-wise. I'd also be interested in another strace with aio read size = 1 and without sendfile. I see some artifacts I would like to understand. First, it seems we only have one aio thread. Then, we try writing to the socket although poll did not tell us the outgoing socket is writable.
Comment 8 maurer 2014-09-09 12:55:25 UTC
Created attachment 10270 [details]
smb.conf (with no domaininfo)
Comment 9 maurer 2014-09-09 12:56:11 UTC
it is an Win7 64 Bit Client
Comment 10 maurer 2014-09-09 12:58:19 UTC
some sysctl Settings of the host

net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.ipv4.tcp_rmem = 4096 262144 8388608
net.ipv4.tcp_wmem = 4096 262144 8388608
net.core.netdev_max_backlog=2500
vm.min_free_kbytes = 524288
net.ipv4.tcp_tw_recycle = 1
sunrpc.tcp_slot_table_entries=64
sunrpc.max_resvport=1023
net.ipv4.tcp_sack=1
net.ipv6.conf.all.disable_ipv6 = 1
Comment 11 maurer 2014-09-09 13:00:41 UTC
the server is connected with 10 Gig, the Client 1 Gig
Comment 12 Volker Lendecke 2014-09-09 13:03:27 UTC
(In reply to comment #8)
> Created attachment 10270 [details]
> smb.conf (with no domaininfo)

Ok, no spooky SNDBUF=8192 or so, good :-)
Comment 13 Volker Lendecke 2014-09-09 13:05:58 UTC
(In reply to comment #11)
> the server is connected with 10 Gig, the Client 1 Gig

Without further investigation I'd say that Samba is just sending faster than 1Gig can handle. We'd need to sniff both sides of that connection. But this does not really matter here. The strace points out some inefficiencies though for this case that we might want to fix. Are you happy with disabling aio? Just asking whether we can drop the priority a bit on this :-)
Comment 14 Volker Lendecke 2014-09-09 13:06:15 UTC
(In reply to comment #13)
> (In reply to comment #11)
> > the server is connected with 10 Gig, the Client 1 Gig
> 
> Without further investigation I'd say that Samba is just sending faster than
> 1Gig can handle. We'd need to sniff both sides of that connection. But this
> does not really matter here. The strace points out some inefficiencies though
> for this case that we might want to fix. Are you happy with disabling aio? Just
> asking whether we can drop the priority a bit on this :-)

Ooops, I meant disabling sendfile of course
Comment 15 maurer 2014-09-09 13:15:51 UTC
Created attachment 10271 [details]
strace with sendfile off and aio read size = 1
Comment 16 maurer 2014-09-09 13:18:25 UTC
 with sendfile off and aio read size = 1 
the read fails to

If we disable aio read size it works
We can life without "aio read size" :-)

Regards

Hansjörg
Comment 17 maurer 2014-09-09 13:24:54 UTC
i think at about line 18736 in strace1.out.gz the read stalls
Comment 18 Volker Lendecke 2014-09-09 13:27:21 UTC
(In reply to comment #16)
>  with sendfile off and aio read size = 1 
> the read fails to
> 
> If we disable aio read size it works
> We can life without "aio read size" :-)
> 
> Regards
> 
> Hansjörg

Aio should be rock solid. Investigating.
Comment 19 Volker Lendecke 2014-09-09 13:50:10 UTC
The special thing here is that the client really overwhelms us with read requests, such that we pile up more than 100 of them in the outgoing socket. I can just hope that we don't drop any requests. Trying to reproduce, but this might take a bit.
Comment 20 Volker Lendecke 2014-09-09 14:03:33 UTC
If you are in the position to recompile and play around a bit...

source3/smbd/globals.c has

int aio_pending_size = 100;     /* tevent supports 100 signals SA_SIGINFO */

This limit is no longer relevant. You could increment this to some silly high value such as 1000000 to never run into this limit.
Comment 21 Jeremy Allison 2014-09-09 15:53:18 UTC
Inside vfswrap_init_asys_ctx() we could make the call to asys_context_init() take a max_parallel value of zero, which then allows unlimited underlying threads.

Then all we need to do is remove the if (outstanding_aio_calls >= aio_pending_size) checks inside source3/smbd/aio.c and the aio limits should be removed.

I can hack up a quick patch for master to test ? :-).
Comment 22 maurer 2014-09-09 16:15:27 UTC
Hi

will change io_pending_size  and recompile tomorrow.

Regards

Hansjörg
Comment 23 Jeremy Allison 2014-09-09 17:02:00 UTC
Created attachment 10273 [details]
diff for 4.1.x

You could also try this patch for 4.1.x (and master). It removes all limits on async io driven via pthreads - so smbd will drive the system as hard as the client requests.

Interested in the results :-).

Cheers,

Jeremy.
Comment 24 maurer 2014-09-10 07:38:59 UTC
Hi

we did the test with the patch from above and the follwoing setting
use sendfile = no
aio read size = 1

Unfortunatly the read still fails (not in every test, but in most tests)
I had a attached a wireshark trace
 samba_aio1_sendfile_off.pcap.gz
where the traffic stalls at about packet 6262 
and an strace of the affected smbd
strace_aio1_sendfile_off1.out.gz
where the break occurs at about line 14721

Strange thing is, that in the not working case the smbd, to which the client
connects, seems to be killed.

Regards

Hansjörg
Comment 25 maurer 2014-09-10 08:03:13 UTC
Created attachment 10276 [details]
wireshark capture of host
Comment 26 maurer 2014-09-10 08:03:40 UTC
Created attachment 10277 [details]
strace with sendfile off and aio read size = 1