Bug 12964 - Maybe we can add the '--bind-cpu' option
Status: NEW
Product: rsync
Classification: Unclassified
Component: core
Version: 3.1.2
Hardware: All All
Importance: P5 trivial
Target Milestone: ---
Assigned To: Wayne Davison
QA Contact: Rsync QA Contact
Reported: 2017-08-14 09:12 UTC by Cun Gong
Modified: 2017-08-14 09:12 UTC

Attachments
add '--bind-cpu' option to rsync (53.08 KB, application/x-zip-compressed)
2017-08-14 09:12 UTC, Cun Gong

Description Cun Gong 2017-08-14 09:12:01 UTC
Created attachment 13472
add '--bind-cpu' option to rsync

We use rsync for daily backups and log synchronization, but rsync
often causes high CPU load. I tried to find a solution through
Google, but didn't find a satisfactory answer. Many people suggest
using the '--whole-file' option to reduce the CPU load, or using 'nice'
to lower rsync's scheduling priority. So I have another idea:

  Maybe we can add a '--bind-cpu' option to tell rsync to run
  on specified processors, like 'worker_cpu_affinity' in nginx.

I'm not sure this is a good idea, because the core issue is really
the improvement of the rsync protocol, but I have made an attempt
anyway. I changed rsync 3.1.2 so that it supports binding CPUs on
GNU/Linux, AIX, FreeBSD and Solaris; please see the attachment. For
example, the following option tells rsync to run on CPUs 0, 2-5
and 7:

  --bind-cpu=0,2-5,7
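
To make the list syntax concrete, here is a minimal Python sketch of how
a value such as "0,2-5,7" could be expanded and applied. It is not the
code from the attached patch, which changes rsync's C source. The names
parse_cpu_list and bind_current_process are hypothetical, and
os.sched_setaffinity is available on Linux only, so the binding step is
an assumption that does not cover AIX, FreeBSD or Solaris.

```python
import os


def parse_cpu_list(spec):
    """Expand a list such as "0,2-5,7" into a sorted list of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in CPU list: {spec!r}")
        if "-" in part:
            # A range "a-b" covers both end points.
            first, last = part.split("-", 1)
            low, high = int(first), int(last)
            if low > high:
                raise ValueError(f"descending range: {part!r}")
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def bind_current_process(spec):
    """Restrict the calling process to the CPUs named in spec (Linux only)."""
    wanted = set(parse_cpu_list(spec))
    # Keep only the CPUs this process is currently allowed to run on.
    usable = wanted & os.sched_getaffinity(0)
    if not usable:
        raise ValueError(f"none of the CPUs in {spec!r} are available")
    os.sched_setaffinity(0, usable)
    return sorted(usable)


print(parse_cpu_list("0,2-5,7"))  # prints [0, 2, 3, 4, 5, 7]
```

A child process inherits the affinity mask of its parent on Linux, so
binding once at startup would also cover any process that is forked
afterwards.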

I also used the maketree.py script to test synchronizing a large
number of files (10000 files), and plotted some statistics (see
cpu_time.pdf in the attachment). I tested on only one machine (AIX
7.2 with 8 processors) and did not test remote synchronization, but
I think the results are still a useful reference. By default, the CPU
load keeps rising as the file size increases; when rsync is bound to a
single processor, the CPU load rises slowly until it reaches around
55%. The disadvantage of binding CPUs is that as file sizes grow, the
files are processed more slowly than without binding (about two times
slower in my test). Of course, if I/O or network bandwidth is the
bottleneck, binding CPUs makes little difference because most of the
time is spent waiting.

Is it worth adding the '--bind-cpu' option? Your comments are
welcome, and I am happy to answer any questions I can!