Created attachment 13472
add '--bind-cpu' option to rsync

We use rsync for daily backups and log synchronization, but rsync often triggers high CPU load. I searched for a solution through Google but didn't find a satisfactory answer. Many people suggest using the '--whole-file' option to reduce the CPU load, or using 'nice' to lower rsync's execution priority.

So I have another idea: maybe we could add a '--bind-cpu' option that tells rsync to run on specified processors, like 'worker_cpu_affinity' in nginx. I'm not sure this is a good idea, because the real fix would be to improve the rsync protocol itself, but I have made a few attempts anyway.

I made some changes to rsync 3.1.2 so that it supports binding CPUs on GNU/Linux, AIX, FreeBSD and Solaris; please see the attachment. For example, the following option tells rsync to run on CPUs 0, 2-5 and 7:

--bind-cpu=0,2-5,7

I also used the maketree.py script to test synchronizing a large number of files (10000 files) and plotted some statistics (see cpu_time.pdf in the attachment). I only tested on one machine (AIX 7.2 with 8 processors) and did not test remote synchronization, but I think the results are still a useful reference. By default, the CPU load keeps increasing as the file size increases; when rsync is bound to a single processor, the CPU load rises slowly until it reaches around 55%.

The disadvantage of binding CPUs is that as file sizes grow, processing is slower than without binding (about two times slower in my test results). Of course, if I/O or network bandwidth is the bottleneck, binding CPUs makes little difference, because most of the time is spent waiting.

Is it necessary to add the '--bind-cpu' option? Comments are welcome, and I'm happy to answer any questions I can!
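To illustrate the CPU list syntax used by '--bind-cpu' above, here is a minimal Python sketch of how a value such as 0,2-5,7 could be expanded and applied. It is not the code from the attached patch (which is a change to rsync's C source); the names parse_cpu_list and bind_current_process are hypothetical, and the binding step assumes a platform where Python exposes os.sched_setaffinity (Linux).

```python
import os


def parse_cpu_list(spec):
    """Expand a CPU list such as "0,2-5,7" into a sorted list of CPU indices."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in CPU list: {spec!r}")
        if "-" in part:
            # A range "a-b" covers both endpoints.
            lo_text, hi_text = part.split("-", 1)
            lo, hi = int(lo_text), int(hi_text)
            if lo > hi:
                raise ValueError(f"descending range in CPU list: {part!r}")
            cpus.update(range(lo, hi + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def bind_current_process(spec):
    """Bind the calling process to the CPUs in spec.

    Returns True if the affinity was set, False if this platform does not
    provide os.sched_setaffinity (an assumption of this sketch: Linux only).
    """
    cpus = parse_cpu_list(spec)
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
        return True
    return False


if __name__ == "__main__":
    print(parse_cpu_list("0,2-5,7"))
```

Calling bind_current_process raises OSError if the list names CPUs that are not available to the process, so a real implementation would probably want to validate the list against the machine's processor count first.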
On Linux you can use taskset (in combination with nice and ionice)...
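As a rough sketch of that combination, the Python snippet below builds a command line that runs an existing rsync invocation under taskset, nice and ionice. The helper name wrap_with_limits, the default values and the example paths are assumptions made for illustration; the flags used (taskset -c, nice -n, ionice -c) are the standard ones from util-linux and coreutils.

```python
import shlex


def wrap_with_limits(command, cpus="0", niceness=19, io_class=3):
    """Prefix command (a list of arguments) with taskset, nice and ionice.

    cpus     -- CPU list in taskset syntax, e.g. "0,2-5,7"
    niceness -- value passed to nice -n (19 is the lowest priority)
    io_class -- value passed to ionice -c (3 is the idle class)
    """
    prefix = [
        "taskset", "-c", cpus,
        "nice", "-n", str(niceness),
        "ionice", "-c", str(io_class),
    ]
    return prefix + list(command)


if __name__ == "__main__":
    # Hypothetical rsync invocation; the paths are placeholders.
    argv = wrap_with_limits(["rsync", "-a", "src/", "dest/"], cpus="0,2-5,7")
    print(shlex.join(argv))
    # To actually run it: subprocess.run(argv, check=True)
```

This needs no change to rsync itself, but it only works on Linux, whereas the patch above also targets AIX, FreeBSD and Solaris.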