Bug 14095 - Use sendfile = yes broken completely as of bug 13537
Summary: Use sendfile = yes broken completely as of bug 13537
Status: NEW
Alias: None
Product: Samba 4.1 and newer
Classification: Unclassified
Component: File services (show other bugs)
Version: unspecified
Hardware: All All
: P5 regression (vote)
Target Milestone: ---
Assignee: Samba QA Contact
QA Contact: Samba QA Contact
Depends on:
Reported: 2019-08-19 21:10 UTC by Jesse Miller
Modified: 2020-11-30 22:04 UTC (History)
5 users (show)

See Also:

FreeBSD sys_sendfile() fix - patch (2.34 KB, patch)
2019-08-19 21:10 UTC, Jesse Miller
metze: review-

Note You need to log in before you can comment on or make changes to this bug.
Description Jesse Miller 2019-08-19 21:10:04 UTC
Created attachment 15400 [details]
FreeBSD sys_sendfile() fix - patch

Previous to the patch landed in bug 13537, sendfile would cause spinning on EAGAIN/EWOULDBLOCK resulting in excessive CPU usage and poor performance.

The patch in bug 13537 attempted to address this, but it broke the sendfile code almost completely.

On all platforms when EAGAIN or EWOULDBLOCK is encountered, the loop is continue; 'd skipping any accounting of data already sent, resulting in whatever data sent before encountering a blocking situation be re-sent, clients error out on this and the transfer fails.

I will attach a patch, that I am using now on FreeBSD, that works. I can move forward and apply this pattern to the other platforms, but would be limited to testing on FreeBSD and Linux (which appears to be more testing that what was landed in bug 13537).

It is important to note when testing 'use sendfile', you must disable aio as it takes precedence over the sendfile path.

aio read size = 0
aio write size = 0
use sendfile = yes
Comment 1 Jeremy Allison 2019-08-19 21:20:42 UTC
Sorry about the breakage for bug 13537. Testing this is hard as you need to trigger EAGAIN/EWOULDBLOCK.
Comment 2 Jesse Miller 2020-11-25 19:05:25 UTC
Is there is anything I can do to move this forward? I can produce a Linux patch also but do not have access to other systems for testing.
Comment 3 Stefan Metzmacher 2020-11-26 07:22:48 UTC
Comment on attachment 15400 [details]
FreeBSD sys_sendfile() fix - patch

> 		if (nwritten == 0) {
>@@ -457,6 +449,8 @@ ssize_t sys_sendfile(int tofd, int fromfd,
> 			break;
> 		}
>+		twritten += nwritten;
> 		if (sf_header.hdr_cnt) {
> 			if (io_header.iov_len <= nwritten) {
> 				/* Entire header was sent. */
>@@ -474,27 +468,11 @@ ssize_t sys_sendfile(int tofd, int fromfd,
> 		offset += nwritten;
> 		count -= nwritten;
>+		ret = twritten;
> 	}
>-	ret = nwritten;

I think the two logic changed should be separated:
- the nwritten vs. twritten fix
- the set_blocking change

I think the twritten logic is not completely correct.

I guess it should be:

 		offset += nwritten;
 		count -= nwritten;
+               twritten += nwritten;
-	ret = nwritten;
+	ret = twritten;

With that submission in chunks should be fixed, and
maybe it's possible to drop the set_blocking change?
Comment 4 Stefan Metzmacher 2020-11-26 07:30:56 UTC
Is sendfile() the only zero copy method available on FreeBSD?

If possible I'd like to get rid of sendfile support, its api
doesn't really fit into the async non-blocking infrastructure we have.

For Linux we'll move to io_uring with IORING_OP_SPLICE in future,
which makes it possible to decouple the layer violation of the
sendfile path and makes it possible to process the next incoming
pdus while large read responses are transmitted to the client.
Comment 5 Jesse Miller 2020-11-30 22:04:51 UTC
(In reply to Stefan Metzmacher from comment #3)

I would be able to fix the patch in bug 13537 if that works better. It would be the addition of twritten (total) and a fix in the logic around EAGAIN/EWOULDBLOCK accounting.

From my perspective I went with always setting the socket to blocking for the sys_sendfile() duration as it made for net less lines of code and complexity.


On FreeBSD at least I am hesitant to want to remove sendfile() , when sendfile() is working I am getting much more efficient CPU impact.

I don't really like the layer acrobatics getting it to work, but I don't know the history or design around it. Originally after spotting that sendfile has been broken (it either failed or spun on tiny nonblocking sendfile()'s) since the socket change to non-blocking sockets I had made a failed attempt at reworking sendfile across the layers so it could handle partial sends.