Bug 7835 - vfs_fill_sparse() doesn't use posix_fallocate when strict allocate is on
vfs_fill_sparse() doesn't use posix_fallocate when strict allocate is on
Status: RESOLVED FIXED
Product: Samba 3.5
Classification: Unclassified
Component: File services
unspecified
Other Linux
: P3 regression
: ---
Assigned To: Karolin Seeger
Samba QA Contact
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2010-12-01 16:28 UTC by Jeremy Allison
Modified: 2010-12-06 10:53 UTC (History)
0 users

See Also:


Attachments
Fix for 3.5.7. (2.03 KB, patch)
2010-12-02 14:33 UTC, Jeremy Allison
no flags Details
Updated fix for 3.5.7 - copes with stream files. (2.13 KB, patch)
2010-12-02 17:29 UTC, Jeremy Allison
vl: review+
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jeremy Allison 2010-12-01 16:28:12 UTC
It turns out that on modern Linux kernels and glibc's, with filesystems such as ext4 and xfs, posix_fallocate() doesn't do linear writing of data when extending a file, but does a much more efficient block allocation algorithm directly in the filesystem code.

Unfortunately the vfs_fill_sparse() code that is called from real_write_file() when strict allocate = true doesn't use posix_fallocate like the vfswrap_ftruncate() does, but falls back to using pwrites of 32k. This is much less efficient. Moving this to posix_fallocate() doubled the performance on the Intel NASPT "file copy" benchmark.

Patch to follow.

I'm marking this as a blocker, as this has such a major performance impact on our major OEM platform in the most common configuration.

Jeremy.
Comment 1 Jeremy Allison 2010-12-02 14:33:23 UTC
Created attachment 6100 [details]
Fix for 3.5.7.

Volker please review for 3.5.7. I'll get confirmation from the reporter.

Jeremy.
Comment 2 Jeremy Allison 2010-12-02 17:29:18 UTC
Created attachment 6101 [details]
Updated fix for 3.5.7 - copes with stream files.

Update the patch. Ensure it doesn't call posix_fallocate directly for stream handles.
Jeremy.
Comment 3 Volker Lendecke 2010-12-03 02:53:51 UTC
Comment on attachment 6101 [details]
Updated fix for 3.5.7 - copes with stream files.

Looks good. I think it mixes two aspects (the FIFO hunk and the rest), but if the customer acknowledges this improves performance, it should go in.

Volker
Comment 4 Jeremy Allison 2010-12-03 13:27:07 UTC
Thanks for the review. Do you want me to split it into 2 patches (the FIFO change and the other) or is it good to go ?

Jeremy.
Comment 5 Volker Lendecke 2010-12-03 14:17:34 UTC
I'm a bit pedantic at times, but not *that* paranoid. I just wanted to drop this message. Have you tried "git add -p"? This makes splitting up changes before git commit really a piece of cake.

Volker
Comment 6 Jeremy Allison 2010-12-03 15:03:17 UTC
No, I haven't tried that - I will take a look. Thanks !
Jeremy.
Comment 7 Simo Sorce 2010-12-03 17:00:10 UTC
Can we also change the default of strict allocate to be true in 3.6 ?
It would be nice to have this optimization enabled by default.
Comment 8 Jeremy Allison 2010-12-03 17:35:27 UTC
I agree (on making "strict allocate" the default for 3.6.0. FYI. One minor issue is that this also bypasses the VFS for an on-disk operation - but this is exactly the same issue that using "struct allocate" with ftruncate hits, so it's no worse than we already have. I have fixed this issue correctly for 3.6.0 by moving posix_fallocate into the VFS.
Jeremy.
Comment 9 Jeremy Allison 2010-12-03 17:36:28 UTC
In the above comment s/struct allocate/strict allocate/ :-).

Comment 10 Karolin Seeger 2010-12-05 12:26:22 UTC
(In reply to comment #2)
> Created an attachment (id=6101) [details]
> Updated fix for 3.5.7 - copes with stream files.
> 
> Update the patch. Ensure it doesn't call posix_fallocate directly for stream
> handles.
> Jeremy.
> 

Pushed to v3-5-test.
Closing out bug report.

Thanks!
Comment 11 Karolin Seeger 2010-12-05 12:27:39 UTC
(In reply to comment #7)
> Can we also change the default of strict allocate to be true in 3.6 ?
> It would be nice to have this optimization enabled by default.
> 

Do we need to discuss that on the mailing list first?
Comment 12 Björn Jacke 2010-12-05 17:00:07 UTC
a change of strict allocate was discussed for 3.5 already. Jeremy: you were against it and you were definetely right there. In all setups without fast file allocation methods (the majority of setups) timeouts will occur when big files are created and performance will be degredated generally.
Comment 13 Jeremy Allison 2010-12-06 10:53:30 UTC
That was before I discovered the 2x write speedup :-). I'm going to suggest it on the list to default to "true" for 3.6.0, but I need to do a write-up of how it helps first.
Jeremy.