Bug 5109 - poor performance on large drives with big bandwidth
Status: NEW
Product: rsync
Classification: Unclassified
Component: core
Hardware: x86 Windows XP
Importance: P3 normal
Assigned To: Wayne Davison
Rsync QA Contact
Reported: 2007-11-26 18:04 UTC by James Richardson
Modified: 2013-12-25 23:45 UTC (History)
Description James Richardson 2007-11-26 18:04:00 UTC
I am attempting to sync 2TB systems with about 400GB of differences across a 1Gb link. However, I am finding that rsync has very low CPU and network utilisation in this scenario.

Since there is so much data to be synced, it would be great if rsync could use a little more than the 4% of the available bandwidth it is managing!

This is using rsync -a -v --progress rsync://a.b.c.d/xx on cygwin.

This is running on 4-CPU boxes, so we would be willing to sacrifice performance elsewhere (for example, higher CPU usage) for a faster sync speed, if that tradeoff is required.
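For scale, here is a rough estimate of what 4% utilisation means for this transfer. It is only a sketch: it assumes decimal units (1 GB = 10^9 bytes, 1 Gb/s = 10^9 bits/s) and that all of the roughly 400GB of differences actually crosses the wire, which may overstate the volume when the delta-transfer algorithm is in use. The function name is illustrative.

```python
# Back-of-the-envelope estimate using the figures from the report.
# Assumptions: decimal units, and the full 400 GB is sent over the link.

def transfer_hours(data_bytes: float, link_bits_per_s: float, utilisation: float) -> float:
    """Hours needed to move data_bytes at the given fraction of link capacity."""
    throughput_bytes_per_s = link_bits_per_s * utilisation / 8
    return data_bytes / throughput_bytes_per_s / 3600

DATA = 400e9  # about 400 GB of differences
LINK = 1e9    # 1 Gb/s link

print(f"at 4% utilisation:   {transfer_hours(DATA, LINK, 0.04):.1f} h")
print(f"at 100% utilisation: {transfer_hours(DATA, LINK, 1.00):.1f} h")
```

Under these assumptions the sync takes roughly 22 hours at 4% utilisation, versus under an hour at full line rate.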
Comment 1 Matt McCutchen 2007-11-26 21:38:45 UTC
Figure out what the bottleneck is.  Since your dataset is enormous, I'm guessing the issue is that rsync is using too much memory and the system is swapping.  If this is the case, you should try the current development rsync, which has an incremental recursion mode that dramatically reduces memory usage.  If you then find that CPU or disk is the bottleneck and the network is underutilized, pass --whole-file to disable the delta-transfer algorithm.
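As a sketch of the second suggestion, the snippet below builds the reporter's command line with --whole-file added. The helper name and the destination path are hypothetical (the report does not give a destination); the flags themselves are the ones named in this thread.

```python
import shlex
from typing import List

def build_rsync_command(source: str, dest: str, whole_file: bool = False) -> List[str]:
    """Return the argv for the reporter's rsync call, optionally with --whole-file."""
    argv = ["rsync", "-a", "-v", "--progress"]
    if whole_file:
        # Same as -W: send whole files instead of using the delta-transfer algorithm.
        argv.append("--whole-file")
    return argv + [source, dest]

# "/cygdrive/d/backup/" is a hypothetical destination, not taken from the report.
cmd = build_rsync_command("rsync://a.b.c.d/xx", "/cygdrive/d/backup/", whole_file=True)
print(shlex.join(cmd))
```

Comparing throughput with and without the flag on the same dataset is one way to tell whether checksum computation (CPU or disk reads) or the network is the limiting factor.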
Comment 2 Wayne Davison 2007-12-15 10:39:27 UTC
The latest 3.0.0 pre-release also has improved server-side hashing for really large files.

Have you had a chance to try using -W (--whole-file)?
Comment 3 James Richardson 2007-12-17 06:04:09 UTC
I did try it. The files started being transferred more quickly; however, the network utilization was still poor (3-4%).

I don't think that disk write speed was an issue, as I'm writing to a 4-drive hardware RAID 5 volume. Neither box was doing anything else at the time, nor had any memory issues. In fact, "Task Manager" said memory usage for rsync was quite low (but who knows if it lies).

Comment 4 James Richardson 2007-12-17 06:06:00 UTC
I should also say this is using the Cygwin version of rsync, but over the rsync protocol, not ssh.
Comment 5 roland 2010-08-22 05:22:22 UTC
You should not blame rsync for that; it's because of the overhead of Cygwin.

Cygwin is a POSIX emulation layer on Windows, and this introduces a lot of overhead.

Maybe this can be tuned performance-wise by profiling, optimizing, and working together with the Cygwin folks, but I recommend trying the same thing from within Linux so that you can see the difference.
Comment 6 roland 2013-12-25 23:45:04 UTC
Besides the fact that this bug is quite old and could perhaps be closed, I recommend retesting with the latest rsync on the latest Cygwin. Performance is much better with that.