The pending diff adds pread/pwrite operations to the VFS layer and uses them in the I/O path. The profiling stats are updated to match. On top of spinlocks the performance gain is small (a consistent 2% to 7% increase in throughput), but there is a significant benefit to using p{read,write} with fcntl tdb locking (5% to 30% increase in throughput). The largest improvements are in high packet rate workloads (i.e., small block sizes and metadata workloads), as you might expect. I can provide detailed numbers (packet rate, CPU usage, NIC throughput, etc.) if that would be useful.
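For readers unfamiliar with the difference, here is a minimal sketch of the two ways to do a positioned read. It is not taken from the patch: the helper names `read_at_seek` and `read_at_pread` are hypothetical, and only the standard POSIX calls `lseek`, `read` and `pread` are assumed. The point is that `pread` needs one system call instead of two and leaves the file offset untouched, which is presumably where the gain on small-block workloads comes from.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: positioned read done the old way.
 * Costs two system calls and moves the descriptor's file offset. */
ssize_t read_at_seek(int fd, void *buf, size_t len, off_t off)
{
    if (lseek(fd, off, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, len);
}

/* Hypothetical helper: the same read with pread().
 * One system call, and the file offset is left unchanged. */
ssize_t read_at_pread(int fd, void *buf, size_t len, off_t off)
{
    return pread(fd, buf, len, off);
}
```

Both helpers return the number of bytes read, or -1 on error. The write side would follow the same pattern with `pwrite` in place of `lseek` plus `write`.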
Created attachment 325: add pread/pwrite vfs ops
Slightly modified version of patch applied for 3.0.2. Jeremy.
Sorry for the spam, cleaning up the database to prevent unnecessary reopens of bugs.