Bug 9783 - please don't use client-server model for local copies
Status: NEW
Product: rsync
Classification: Unclassified
Component: core
Version: 3.0.9
Hardware: All Linux
Importance: P5 enhancement
Target Milestone: ---
Assigned To: Wayne Davison
QA Contact: Rsync QA Contact
URL: http://lwn.net/Articles/400489/
Depends on:
Blocks:
Reported: 2013-04-10 19:22 UTC by Marc Haber
Modified: 2013-12-02 23:01 UTC (History)
Description Marc Haber 2013-04-10 19:22:54 UTC
Hi,

rsync is quite slow compared to cp when copying from one local disk to another. rsync was not designed for that situation, but it is still a common use case. On Linux, the client-server setup misleads CPU frequency governors, which causes the CPUs to reduce their operating frequency, resulting in bad performance. On my test system, rsync copies from disk to disk at about 30 MB/s, while a simple cp delivers 130 MB/s.
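One way to check the governor claim on a given machine is to sample the cpufreq values while a copy is running. The following is only a minimal sketch: it assumes the kernel exposes the usual cpufreq files under /sys/devices/system/cpu, and the helper name read_cpufreq is made up for this illustration.

```python
from pathlib import Path


def read_cpufreq(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return (governor, current_khz) for one CPU, or None if cpufreq is not exposed.

    sysfs_root is a parameter only so the function can be pointed at a
    different directory; the default is the standard Linux location.
    """
    base = Path(sysfs_root) / f"cpu{cpu}" / "cpufreq"
    try:
        governor = (base / "scaling_governor").read_text().strip()
        current_khz = int((base / "scaling_cur_freq").read_text().strip())
    except (OSError, ValueError):
        # No cpufreq support, unreadable files, or unexpected content.
        return None
    return governor, current_khz


if __name__ == "__main__":
    # Run this repeatedly during an rsync copy and during a cp copy to compare.
    print(read_cpufreq())
```

If the reported frequency stays low during the rsync run but rises during the cp run, that would be consistent with the behaviour described above.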

There are shell scripts around that use rsync --dry-run to find out the list of files that need to be copied and then feed that list to cp to get better performance, which is a really, really ugly hack.
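For illustration, the workaround described above could look roughly like the sketch below. It is not taken from any of those scripts; the function names are invented, it assumes an rsync binary that supports --out-format=%n, and it ignores symlinks, deletions, special files and file names containing newlines.

```python
import shutil
import subprocess
from pathlib import Path


def parse_file_list(output):
    """Keep file entries from rsync's output; directory names end in '/' and are skipped."""
    return [line for line in output.splitlines() if line and not line.endswith("/")]


def list_changed(src, dst):
    """Ask rsync which paths it would transfer, without copying anything."""
    cmd = [
        "rsync", "-a", "--dry-run", "--out-format=%n",
        str(src).rstrip("/") + "/", str(dst).rstrip("/") + "/",
    ]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_file_list(result.stdout)


def copy_listed(src, dst, names):
    """Copy each listed file with a plain copy, creating parent directories as needed."""
    for name in names:
        target = Path(dst) / name
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(Path(src) / name, target)


if __name__ == "__main__":
    import sys
    source, destination = sys.argv[1], sys.argv[2]
    copy_listed(source, destination, list_changed(source, destination))
```

The sketch also shows why this is a hack: the file list is computed by one tool and acted on by another, so anything rsync would normally handle beyond plain file contents is lost.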

Please consider detecting a local operation and not using the client-server model there, but instead using a simpler algorithm like the one cp uses. That would make rsync far more useful in the quite common case of local operation.
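To make the request concrete, here is a rough sketch of the idea, written in Python rather than as a patch against rsync. It is not how rsync is implemented: the remote-path check is a simplified approximation of rsync's host:path, host::module and rsync:// forms, and all function names are hypothetical.

```python
import shutil


def is_remote_spec(spec):
    """Simplified check: a colon before the first slash, or an rsync:// URL, means remote."""
    if spec.startswith("rsync://"):
        return True
    colon = spec.find(":")
    slash = spec.find("/")
    return colon != -1 and (slash == -1 or colon < slash)


def plain_copy(src, dst, chunk_size=1024 * 1024):
    """cp-style copy: one process reads a chunk and writes it, with no checksumming."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)


def copy_if_local(src, dst):
    """Use the plain copy when both ends are local; return False to signal a fallback."""
    if is_remote_spec(src) or is_remote_spec(dst):
        return False
    plain_copy(src, dst)
    return True
```

In this sketch a False return stands for "use the normal client-server path", so remote transfers would be unaffected.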

Greetings
Marc
Comment 1 Kilian CAVALOTTI 2013-05-02 14:48:00 UTC
One way to easily reproduce this problem is to try to transfer a single large file using rsync:

a. over ssh: 
  rsync -av host0:/path/to/file /tmp/file

b. locally, by mounting the remote path with sshfs: 
  sshfs host0:/path/to/ /path/to/
  rsync -av /path/to/file /tmp/file

Given any set of SSH options, b. gives about 60% of a.'s performance.
Comment 2 roland 2013-05-03 18:31:50 UTC
What result does remote scp vs. local copy on sshfs give (i.e. what is the impact of sshfs)?
If that gives similar performance, your comparison is probably somewhat valid; otherwise you are comparing apples to pears.
Comment 3 Kilian CAVALOTTI 2013-05-03 20:12:17 UTC
(In reply to comment #2)
> what result does remote scp vs local-copy-on-sshfs give (i.e what is the impact
> of sshfs) ?

They are comparable; the impact of sshfs is minimal. I used the same SSH options to conduct the test over a 10GbE link, and checked the speed of the transfer using the NIC counters. I can't reproduce the tests right now, but off the top of my head, the numbers were:
- pure scp: about 130MB/s
- cp /sshfsmount/path/to/file /tmp/file: about 130MB/s
- rsync host0:/path/to/file /tmp/file: about 130MB/s
- rsync /sshfsmount/path/to/file /tmp/file: about 80MB/s
Comment 4 Kilian CAVALOTTI 2013-05-13 12:46:47 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > what result does remote scp vs local-copy-on-sshfs give (i.e what is the impact
> > of sshfs) ?
> 
> They are the comparable, the impact of sshfs is minimal. I used the same SSH
> options to conduct the test over a 10GbE link, and checked the speed of the
> transfer using the NIC counters. I can't reproduce the tests right now, but on
> top of my head, the numbers were:

Precisely:
1. [host1 ~]$ scp host0:/path/to/file /path/to/file        : 175 MB/s
2. [host1 ~]$ cp /sshfs/path/to/file /path/to/file         : 120 MB/s
3. [host1 ~]$ rsync -av host0:/path/to/file /path/to/file  : 174 MB/s
4. [host1 ~]$ rsync -av /sshfs/path/to/file /path/to/file  :  90 MB/s

The test file is a 36GB file, generated from user data.
SSH options are the same in all cases, and use the arcfour cipher.
All rate values were measured with bwm-ng (http://www.gropp.org/?id=projects&sub=bwm-ng) during the steady phase of the transfer. The destination file was removed between tests, and the buffer cache was cleaned with "echo 3 > /proc/sys/vm/drop_caches".

rsync's behavior is definitely different when using a local sshfs mountpoint as a source rather than copying from a remote server.
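For reference, the relative figures implied by the four measurements in this comment can be worked out as follows. This is plain arithmetic on the rates quoted above; the dictionary keys are just labels chosen for this sketch.

```python
# Rates in MB/s, copied from the measurements listed in this comment.
RATES_MB_S = {
    "scp_remote": 175,
    "cp_sshfs": 120,
    "rsync_remote": 174,
    "rsync_sshfs": 90,
}


def relative_percent(rate, baseline):
    """Express one rate as a rounded percentage of another."""
    return round(100 * rate / baseline)


if __name__ == "__main__":
    r = RATES_MB_S
    # rsync reading from the sshfs mount vs. rsync talking to the remote host
    print("rsync sshfs / rsync remote:", relative_percent(r["rsync_sshfs"], r["rsync_remote"]), "%")
    # rsync vs. cp, both reading from the same sshfs mount
    print("rsync sshfs / cp sshfs:", relative_percent(r["rsync_sshfs"], r["cp_sshfs"]), "%")
    # cp on the sshfs mount vs. plain scp
    print("cp sshfs / scp remote:", relative_percent(r["cp_sshfs"], r["scp_remote"]), "%")
```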
Comment 5 m8r-28l5q21 2013-12-02 23:01:59 UTC
I can only chime in on this bug.
I'm using rsync on my ReadyNAS 102, which has an ARM core (Marvell Armada 370). Unfortunately, rsync is _very_ slow locally. The constant checksumming is responsible.

For example:
1) rsync -Pi --protocol=29 speed is around 21MB/s
2) rsync -Pi --protocol=30 speed is around 18MB/s
3) cp speed is around 70MB/s


The speed difference between the two rsync runs can be explained by MD4 being faster than MD5.

So on slow CPUs the checksumming is a clear bottleneck, and it would be nice to have a --no-checksum or --checksum=crc32 option to avoid this problem.
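To get a rough feel for how much a checksum costs on a particular CPU, one could time MD5 against CRC32 over the same buffer. The sketch below uses Python's hashlib and zlib, so it measures those library implementations and not rsync's own code. MD4 is left out because hashlib does not provide it on every system. The function names and buffer sizes are arbitrary choices for this illustration.

```python
import hashlib
import time
import zlib


def throughput_mb_s(consume, data, rounds=20):
    """Feed the same buffer to `consume` several times and return MB/s."""
    start = time.perf_counter()
    for _ in range(rounds):
        consume(data)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return len(data) * rounds / elapsed / 1e6


def compare_checksums(size=4 * 1024 * 1024, rounds=20):
    """Return the measured throughput of MD5 and CRC32 over a zero-filled buffer."""
    data = bytes(size)
    md5 = hashlib.md5()
    return {
        "md5": throughput_mb_s(md5.update, data, rounds),
        "crc32": throughput_mb_s(zlib.crc32, data, rounds),
    }


if __name__ == "__main__":
    for name, rate in compare_checksums().items():
        print(f"{name}: {rate:.0f} MB/s")
```

Comparing these figures with the raw cp rate on the same machine gives an indication of whether checksumming alone could account for the gap.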