It turns out that on modern Linux kernels and glibc versions, with filesystems such as ext4 and xfs, posix_fallocate() doesn't write data linearly when extending a file, but uses a much more efficient block allocation algorithm directly in the filesystem code. Unfortunately the vfs_fill_sparse() code that is called from real_write_file() when strict allocate = true doesn't use posix_fallocate() the way vfswrap_ftruncate() does, but falls back to pwrite() calls of 32k each. This is much less efficient. Moving this to posix_fallocate() doubled the performance on the Intel NASPT "file copy" benchmark. Patch to follow. I'm marking this as a blocker, as this has such a major performance impact on our major OEM platform in the most common configuration. Jeremy.
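To illustrate the difference between the two approaches described above, here is a minimal standalone sketch. It is not the Samba patch: the names extend_with_pwrite(), extend_file() and FILL_CHUNK are hypothetical, and the only real APIs used are the POSIX calls posix_fallocate(), pwrite() and fstat(). The choice of which error codes trigger the fallback is an assumption made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define FILL_CHUNK (32 * 1024) /* hypothetical name; 32k as in the report */

/* Slow path: grow the file from old_size to new_size by writing zeros
 * in 32k chunks. Every byte of the new range goes through write I/O. */
static int extend_with_pwrite(int fd, off_t old_size, off_t new_size)
{
    static const char zeros[FILL_CHUNK];
    off_t pos = old_size;

    assert(old_size >= 0);
    while (pos < new_size) {
        off_t left = new_size - pos;
        size_t want = (size_t)(left < FILL_CHUNK ? left : FILL_CHUNK);
        ssize_t n = pwrite(fd, zeros, want, pos);
        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted, retry the same chunk */
            }
            return -1;
        }
        if (n == 0) {
            errno = EIO; /* no progress; avoid looping forever */
            return -1;
        }
        pos += n;
    }
    return 0;
}

/* Fast path first: ask for the whole new range in one call, and only
 * fall back to the write loop if the request is not supported.
 * (Assumption for this sketch: EINVAL/EOPNOTSUPP mean "not supported".) */
static int extend_file(int fd, off_t new_size)
{
    struct stat st;
    int err;

    if (fstat(fd, &st) != 0) {
        return -1;
    }
    if (new_size <= st.st_size) {
        return 0; /* nothing to extend */
    }
    /* posix_fallocate() returns the error number; it does not set errno. */
    err = posix_fallocate(fd, st.st_size, new_size - st.st_size);
    if (err == 0) {
        return 0;
    }
    if (err == EINVAL || err == EOPNOTSUPP) {
        return extend_with_pwrite(fd, st.st_size, new_size);
    }
    errno = err;
    return -1;
}
```

A caller would use extend_file(fd, new_size) before writing past the current end of file. On filesystems with native preallocation support the first branch returns without writing any data blocks from user space, which is where the speedup reported above would come from.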
Created attachment 6100: Fix for 3.5.7. Volker, please review for 3.5.7. I'll get confirmation from the reporter. Jeremy.
Created attachment 6101: Updated fix for 3.5.7 - copes with stream files. Updated the patch to ensure it doesn't call posix_fallocate directly for stream handles. Jeremy.
Comment on attachment 6101: Updated fix for 3.5.7 - copes with stream files. Looks good. I think it mixes two aspects (the FIFO hunk and the rest), but if the customer acknowledges this improves performance, it should go in. Volker
Thanks for the review. Do you want me to split it into 2 patches (the FIFO change and the other) or is it good to go ? Jeremy.
I'm a bit pedantic at times, but not *that* paranoid. I just wanted to drop this message. Have you tried "git add -p"? This makes splitting up changes before git commit really a piece of cake. Volker
No, I haven't tried that - I will take a look. Thanks ! Jeremy.
Can we also change the default of strict allocate to be true in 3.6 ? It would be nice to have this optimization enabled by default.
I agree (on making "strict allocate" the default for 3.6.0). FYI: one minor issue is that this also bypasses the VFS for an on-disk operation - but this is exactly the same issue that using "struct allocate" with ftruncate hits, so it's no worse than what we already have. I have fixed this issue correctly for 3.6.0 by moving posix_fallocate into the VFS. Jeremy.
In the above comment s/struct allocate/strict allocate/ :-).
(In reply to comment #2)
> Created an attachment (id=6101)
> Updated fix for 3.5.7 - copes with stream files.
>
> Update the patch. Ensure it doesn't call posix_fallocate directly for stream
> handles.
> Jeremy.
Pushed to v3-5-test. Closing out bug report. Thanks!
(In reply to comment #7)
> Can we also change the default of strict allocate to be true in 3.6 ?
> It would be nice to have this optimization enabled by default.
Do we need to discuss that on the mailing list first?
A change of the strict allocate default was already discussed for 3.5. Jeremy: you were against it, and you were definitely right there. In all setups without fast file allocation methods (the majority of setups), timeouts will occur when big files are created and performance will generally be degraded.
That was before I discovered the 2x write speedup :-). I'm going to suggest it on the list to default to "true" for 3.6.0, but I need to do a write-up of how it helps first. Jeremy.