Bug 10322 - Slow Performance over Network rsync
Status: ASSIGNED
Product: rsync
Classification: Unclassified
Component: core
Version: 3.1.0
Hardware: All All
Importance: P5 normal
Target Milestone: ---
Assigned To: Wayne Davison
QA Contact: Rsync QA Contact
Reported: 2013-12-13 12:25 UTC by Jörg Grube
Modified: 2013-12-26 10:41 UTC

Description Jörg Grube 2013-12-13 12:25:43 UTC
I use rsync under Cygwin.
The command is:
rsync --recursive --times --stats \
 --link-dest="/cygdrive/d/backup/20130101" \
 "//Server/data/" \
 "/cygwin/d/20130102"

rsync 3.1.0 runs at 2-3 MB/s
rsync 3.0.9 runs at 2-3 MB/s
but
rsync 3.0.8 runs at 45 MB/s

Both computers are connected over Gigabit Ethernet, and a normal copy session runs at 50 MB/s.

With rsync 3.1 and 3.0.9, the CPU and the network are nearly idle.

Thanks for the help.
Comment 1 roland 2013-12-14 13:24:46 UTC
>rsync 3.0.8 runs with 45 mb/sec

45 MB/s with Cygwin?
Are you really sure?

AFAIK, rsync under Cygwin is always dead slow, especially in "local mode", which you are using (since you read from a network share via a UNC path).
Comment 2 Jörg Grube 2013-12-17 19:13:04 UTC
It is true. I get a speed of 20-25 MB/s with rsync version 3.0.8.
Any newer version transfers only 2-5 MB/s. I ran my tests on several computers, and as long as I stay with version 3.0.8 everything is OK. But I want a future-proof solution for my backup script: I want 64-bit Cygwin and rsync version 3.1 or later.
Comment 3 roland 2013-12-19 20:33:05 UTC
Does Cygwin have strace? I would do two identical transfers, one with rsync 3.0.8 and one with 3.0.9, each started under strace.

You may get a clue if there is a difference.

strace -f -F -tt -T rsync ......
Comment 4 Wayne Davison 2013-12-23 18:46:31 UTC
If 3.0.8 is that much faster, I'd imagine that it is due to how long ago it was compiled, because there isn't that much difference between 3.0.8 and 3.0.9 (just a few bug fixes).

I was doing some copying under windows recently, and I saw some weird things: I ran one local-disk to 2nd-local-disk copy and was only getting about 2MB/sec.  I started another, and it too was getting the exact same speed *without* affecting the first copy.  I added 2 more with exactly the same results.  Then, something really interesting happened:  2 of the copies were still going when I started up a disk utility tool to look at the partitioning, and both of the already-running rsyncs started pounding the disks, with their throughput going up by an order of magnitude or so (each).  This makes me think that there is some kind of a per-process limit that is being imposed, and the disk utility somehow turned it off.

Perhaps 3.0.8 was compiled with an older cygwin that does not obey this per-process limit?
Comment 5 roland 2013-12-24 01:28:45 UTC
I installed 64-bit Cygwin on my Windows 8 system and compared the Cygwin built-in rsync 3.0.9 to self-compiled rsync 3.1.0 and 3.0.8. Surprisingly, both 3.0.9 and 3.1.0 are fast (~30 MB/s) BUT(!) 3.0.8 is about half as fast (~15 MB/s). I'm using a fast SSD, though, so I'm not sure how well it performs with an ordinary HDD. I have never heard of any throttling mechanism in Windows or Cygwin, but all our observations are weird indeed and need explanation. I also have no clue what's going on here.
Comment 6 roland 2013-12-24 11:20:56 UTC
The latest rsync git is also fast; rsync 3.0.7 is also slow. So there must be some critical change between 3.0.8 and 3.0.9 which leads to the difference in speed, but apparently it "depends" how this change manifests itself.
Comment 7 roland 2013-12-24 12:50:01 UTC
I found that _my_ performance differences between 3.0.8 and 3.0.9 are caused by this commit:

https://git.samba.org/?p=rsync.git;a=commitdiff;h=4c0055ecbb51f7fbc878229ec615f93e1adc506f

When I add those ifdefs to 3.0.8, it is as fast as 3.0.9.
Comment 8 roland 2013-12-24 13:26:50 UTC
Now things are getting really crazy.

If I _remove_ the mentioned commit from rsync 3.1.0, it gives a speedup from ~30 MB/s to about 60 MB/s!

So in the transition from 3.0.8 to 3.0.9, avoiding socketpair() on Cygwin gives a speed benefit, but forcing the use of socketpair() in 3.1.0 by removing the Cygwin ifdefs nearly doubles the speed.

I double-checked. Scratching my head now...



w/o socketpair():
test@roterpc /cygdrive/c/Users/test/Downloads
$ rsync -av --progress android-x86-4.3-20130725.iso tmp/
sending incremental file list
android-x86-4.3-20130725.iso
    208,666,624 100%   32.29MB/s    0:00:06 (xfr#1, to-chk=0/1)

sent 208,717,687 bytes  received 35 bytes  27,829,029.60 bytes/sec
total size is 208,666,624  speedup is 1.00


with socketpair():

test@roterpc /cygdrive/c/Users/test/Downloads
$ rsync -av --progress android-x86-4.3-20130725.iso tmp/
sending incremental file list
android-x86-4.3-20130725.iso
    208,666,624 100%   59.70MB/s    0:00:03 (xfr#1, to-chk=0/1)

sent 208,717,687 bytes  received 35 bytes  37,948,676.73 bytes/sec
total size is 208,666,624  speedup is 1.00
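
To put the two runs above side by side, one rate can simply be divided by the other. A minimal sketch (the helper name `rate_ratio` is made up for illustration; the numbers are the ones printed above):

```shell
# rate_ratio FAST SLOW: print FAST/SLOW rounded to two decimals.
# Hypothetical helper; both arguments must use the same unit.
rate_ratio() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.2f\n", fast / slow }'
}

# Progress-line rates of the two runs (MB/s as printed by rsync).
rate_ratio 59.70 32.29
# Overall rates from the summary lines (bytes/sec, thousands separators removed).
rate_ratio 37948676.73 27829029.60
```

This prints 1.85 for the per-file progress rate and 1.36 for the overall rate. The overall figure is presumably lower because the summary line is computed over the whole run, including startup and shutdown, rather than only the time spent moving file data.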
Comment 9 Linda Walsh 2013-12-24 21:42:17 UTC
(I hate 1-way mail gateways)
(In reply to comment #2)
> 45mb/s with cygwin?
> are you really sure?
>
> afaik, rsync via cygwin always is dead slow, especially in "local mode", which
> you are using. (as you read from a network share via unc path)
>   
----
I get 400+ MB/s reads and writes under Cygwin to/from remote sockets:

/h> iotest
R:512+0 records in
512+0 records out
4294967296 bytes (4.3 GB) copied, 10.5159 s, 408 MB/s
W:512+0 records in
512+0 records out
4294967296 bytes (4.3 GB) copied, 8.29549 s, 518 MB/s
/h> iotest
R:512+0 records in
512+0 records out
4294967296 bytes (4.3 GB) copied, 10.6297 s, 404 MB/s
W:512+0 records in
512+0 records out
4294967296 bytes (4.3 GB) copied, 9.42769 s, 456 MB/s

----
Any slowdowns are coming from client apps (likely doing read/write operations
that are too small).  If you use small I/O buffers (4K), you can slow that
transfer speed down by a factor of 100.  (Mozilla has been great at this, using
4K I/O blocks for local file R/W and network operations.)

The above READ is reading from /dev/zero on the Linux machine and writing
to /dev/null on the local Windows machine.

The above WRITE is writing to /dev/null on the remote machine and reading
from /dev/zero on the Windows machine.  In my home directory on the remote
machine I created files 'null' and 'zero', which are device files for /dev/null
and /dev/zero, so the devices can be reached over the share.  On the Windows
side I use Cygwin's /dev/null and /dev/zero, since I am using dd.

So if someone wants to talk about "slow" windows-shares -- it's likely the
application being used.

Those messages come from 'dd'.

My test script I'll attach below.
------------------------------------------------------
#!/bin/bash -u
_prgpth="${0:?}"; _prg="${_prgpth##*/}"; _prgdr="${_prgpth%/$_prg}"
[[ -z $_prgdr || $_prg == $_prgdr ]] && _prgdr="$PWD"
export PATH="$_prgdr:$_prgdr/lib:$PATH"
shopt -s expand_aliases extglob sourcepath ; set -o pipefail

#include stdalias

Dd=$(type -P dd)

[[ $Dd ]] || { echo "Cannot find dd.  Cannot proceed.";  exit 1; }

alias intConst=declare\ -ix int=declare\ -i my=declare
alias string=declare sub=function array=declare\ -a
# 1 num = block size
# num-num = range of block sizes to test; w/increment = "2x", so
# 4M-16M tests 4M, 8M, 16M
# 4M-12M test 4M, 8M, 12M
# count adjusted to xfer 4G, rounding up
#----

#all xfers are using 'devices' (/dev/zero for source, /dev/null for target)
# remote filenames "zero" and "null" should be setup to be remote devices

intConst K=1024
intConst M=$[K*K]
intConst G=$[M*K]
intConst T=$[G*K]

int BS=$[8*M]
int count=512
int IOSIZE=${IOSIZE:-4*G}

#        desuffix     1st arg = num+suffix -> convert to int
#                            2nd arg = optional buff name (else print to stdout)
#                            return 0 if no error

sub desuffix {                        #convert num+Suff => int store in optional Buff
   string str="${1:?}" ; shift
   string bufnam=""; (($#)) && bufnam=$1
   if [[ $str =~ ^([0-9]+)([KMGT])$ ]]; then
       int num=${BASH_REMATCH[1]}*${BASH_REMATCH[2]}
       ((num)) || return 1
       if [[ $bufnam ]] ; then printf -v $bufnam "%d" "$num"
       else printf "%d" "$num" ; fi
   else
       return 1
   fi
}




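sub hdisp {                           #convert int => num+Suff, store in optional Buff
   int num=${1:?}; shift
   string bufnam=""; (($#)) && bufnam=$1
   string suf="" s
   int ans=num
   array pows=('K' 'M' 'G' 'T')
   for s in "${pows[@]}";do
       int si=$s                      #arithmetic eval: 'K' -> value of $K, etc.
       if (((num/si)*si==num)); then
           ans=num/si;suf="$s"
       fi
   done
   #emit the result the same way desuffix does
   if [[ $bufnam ]] ; then printf -v "$bufnam" "%s" "$ans$suf"
   else printf "%s" "$ans$suf" ; fi
}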

sub check_params {
   int num=0
   if (($#)) ; then
       string p="$1"; shift;
       if [[ $p =~ ([0-9]+)([KMGT]) ]]; then
           num=${BASH_REMATCH[1]}*${BASH_REMATCH[2]}
       fi
   fi
   ((num)) && {
       BS=num
       count=IOSIZE/BS
   }
}

(($#)) && check_params "$@"

array reada=(/h/zero /dev/null)
array writea=(/dev/zero /h/null conv=nocreat,notrunc)

sub dd_need_io  {
   local if="$1" of="$2"; shift 2
   nice --19 $Dd if="$if" of="$of" bs="$BS" count="$count"\
            oflag=direct iflag=direct conv=nocreat "$@"
}

sub dd {
   local if="$1" of="$2" ;shift 2
   #echo $dd if="$if" of="$of" bs="$BS" count="$count" "$@" >&2
   array out err
   readarray err < <( \
       readarray out < <(dd_need_io "$if" "$of" "$@";int s=$?;
                                           ((s)) && echo "stat:$s">&2  ) 2>&1 )
   #
   if ((${#err[@]})) ;then echo "${err[@]}"; exit 1; fi
   return 0
}
sub dd_format {
   my ln
   while read ln;do
       echo $ln | while read bytes btxt pnum suffp \
                                           copt time unitc rate ra_unit; do
           [[ $bytes == records ]] && continue
           [[ ${pnum:0:1} != \( || ${suffp:0-1:1} != \) ]] && continue
           num="${pnum:1}"
           suff="${suffp%?}"
           unit="${unitc%?}"
           printf "%s:%s:%s:%s:%s:%s:\n"    "$num" "$suff" "$time" "$unit" "$rate" "$ra_unit"
       done
   done
}

sub onecycle {
   echo -n "R:"; { dd "${reada[@]}"|| exit $?; } | dd_format
   echo -n "W:";    { dd "${writea[@]}" || exit $?; } | dd_format
}

onecycle
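
The size handling in the script (a number plus a K/M/G/T suffix, and a block count derived from the total transfer size) can be sketched in isolation in plain POSIX shell. The names `to_bytes` and `xfer_count` are made up for this illustration and are not part of the script above. Unlike `check_params`, which uses truncating integer division, this sketch rounds the count up, as the script's header comment describes.

```shell
# to_bytes SIZE: convert "8M", "4G", "4096", ... to a byte count
# (binary multiples). Hypothetical helper for illustration only.
to_bytes() {
    num=${1%[KMGT]}          # digits with one trailing suffix removed
    case $num in
        ''|*[!0-9]*) echo "bad size: $1" >&2; return 1 ;;
    esac
    case $1 in
        *K) echo $((num * 1024)) ;;
        *M) echo $((num * 1024 * 1024)) ;;
        *G) echo $((num * 1024 * 1024 * 1024)) ;;
        *T) echo $((num * 1024 * 1024 * 1024 * 1024)) ;;
        *)  echo "$num" ;;
    esac
}

# xfer_count TOTAL BLOCK: blocks needed to move TOTAL bytes, rounded up.
xfer_count() {
    total=$(to_bytes "$1") || return 1
    block=$(to_bytes "$2") || return 1
    [ "$block" -gt 0 ] || { echo "block size must be non-zero" >&2; return 1; }
    echo $(( (total + block - 1) / block ))
}

xfer_count 4G 8M    # the script's defaults: IOSIZE=4G, BS=8M
```

With the defaults this prints 512, which matches the "512+0 records" and 4294967296 bytes in the dd output shown earlier. It assumes a shell with 64-bit arithmetic.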
Comment 10 Linda Walsh 2013-12-24 21:43:04 UTC
(In reply to comment #4)
> Perhaps 3.0.8 was compiled with an older cygwin that does not obey this
> per-process limit?

---
If it is per-process, it isn't in Cygwin, since the 400 MB/s rates are
from "dd" under Cygwin (talking to Samba on Linux).
Comment 11 Linda Walsh 2013-12-24 21:47:56 UTC
samba-bugs@samba.org wrote:
> https://bugzilla.samba.org/show_bug.cgi?id=10322
>
> --- Comment #5 from roland <devzero@web.de> 2013-12-24 01:28:45 UTC ---
> i installed 64 bit cygwin on my windows8 system and compared the cygwin builtin
> rsync 3.0.9 to self compiled rsync 3.1.0 and 3.0.8. surprisingly, both 3.0.9
> and 3.1.0 are fast (~30mb/s) BUT(!) 3.0.8 is about half as fast (~15mb/s). i`m
> using a fast ssd, though so i`m not sure how well it performs with ordinary
> hdd. i never heard of any throttling mechanism in windows or cygwin, but all
> our observations are weird indeed and need explanation. i also have no clue
> what`s going on here.
>   
===============
What's going on is you are using something designed for remote comparisons over
a slow link:

/h> time rsync --progress --stats 128M 128Ma
128M
  134217728 100%    7.15MB/s    0:00:17 (xfer#1, to-check=0/1)

Number of files: 1
Number of files transferred: 1
Total file size: 134217728 bytes
Total transferred file size: 134217728 bytes
Literal data: 134217728 bytes
Matched data: 0 bytes
File list size: 19
File list generation time: 0.005 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 134234177
Total bytes received: 31

sent 134234177 bytes  received 31 bytes  7255903.14 bytes/sec
total size is 134217728  speedup is 1.00
18.31sec 0.17usr 0.14sys (1.69% cpu)

time bash -c "cp 128M 128Ma; sync"
1.98sec 0.21usr 0.66sys (44.58% cpu)
time dd if=128M of=128Ma bs=128M oflag=direct
1+0 records in
1+0 records out
134217728 bytes (134 MB) copied, 0.801389 s, 167 MB/s
1.00sec 0.03usr 0.20sys (23.30% cpu)

----


Notice rsync takes 9 times as long as cp and 18 times as long as dd.

If I specify a 4k blocksize with "dd":
time dd if=128M of=128Ma bs=4k oflag=direct
32768+0 records in
32768+0 records out
134217728 bytes (134 MB) copied, 10.5946 s, 12.7 MB/s
10.81sec 0.17usr 1.91sys (19.31% cpu)
-----------
A 2k blocksize:
time dd if=128M of=128Ma bs=2k oflag=direct
65536+0 records in
65536+0 records out
134217728 bytes (134 MB) copied, 19.1194 s, 7.0 MB/s
19.31sec 0.28usr 3.78sys (21.06% cpu)

==============
All of the above were conducted on Win7x64 using cygwin32.
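
The block-size comparison above can be repeated for several sizes in one go. A minimal sketch, assuming a dd that prints its statistics on stderr (the helper name `bs_sweep` is made up; `oflag=direct` is deliberately left out so the sketch also runs on filesystems that reject direct I/O, which means caching may hide part of the small-block penalty seen above):

```shell
# bs_sweep SRC BS...: copy SRC to a scratch file once per block size and
# print dd's last status line (the summary) for each run.
# Hypothetical helper for illustration only.
bs_sweep() {
    src=$1; shift
    dst=$(mktemp) || return 1
    for bs in "$@"; do
        printf '%s: ' "$bs"
        # dd writes its statistics to stderr; keep only the final line.
        dd if="$src" of="$dst" bs="$bs" 2>&1 | tail -n 1
    done
    rm -f "$dst"
}

# Example, using the 128M test file from the runs above:
#   bs_sweep 128M 2k 4k 64k 1M 128M
```

Each output line pairs a block size with the bytes, elapsed time, and rate that dd reports, so the point where throughput stops improving is easy to spot.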
Comment 12 Linda Walsh 2013-12-25 14:18:46 UTC
(In reply to comment #7)
> i found that _my_ performance differences between 3.0.8 and 3.0.9 are being
> caused by this commit:
> 
> https://git.samba.org/?p=rsync.git;a=commitdiff;h=4c0055ecbb51f7fbc878229ec615f93e1adc506f
> 
> when adding that ifdefs to 3.0.8 it is as fast as 3.0.9
------------

That commit is from 2 years ago. Maybe something has changed in cygwin in that time?  This is for local->local copy?

What speed does 'cp' get?

For a local copy, might it not be faster to use a memory-mapped file or
shared memory instead of a socket?
Comment 13 roland 2013-12-25 20:58:09 UTC
Since this bug report is about a local->local rsync (from rsync's point of view, copying to a //path is a local transfer), I assume this is a Cygwin-related issue regarding how rsync handles local transfers on Cygwin. There must have been some change in Cygwin, IMHO.
Comment 14 Wayne Davison 2013-12-25 22:24:02 UTC
I have verified that restoring the use of socketpair() (instead of pipe()) is indeed faster under cygwin.  I have also seen reports of fewer hangs in the most modern cygwin libs (and also that a past version seemed to make socketpair less buggy).

I've checked in a change to configure.ac to stop the ignoring of socketpair() on cygwin.  Let me know if that helps your speed tests.
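
To tell which primitive a particular build was configured to use, one option is to inspect the config.h that ./configure generates. A minimal sketch, assuming the configure result ends up as a `HAVE_SOCKETPAIR` define in that file (the macro name and the helper name `uses_socketpair` are assumptions for illustration, not taken from this report):

```shell
# uses_socketpair CONFIG_H: print "socketpair" if the configured tree
# defines HAVE_SOCKETPAIR, otherwise "pipe" (the fallback named above).
# Hypothetical helper; the macro name is an assumption.
uses_socketpair() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    if grep -Eq '^[[:space:]]*#[[:space:]]*define[[:space:]]+HAVE_SOCKETPAIR([[:space:]]|$)' "$1"; then
        echo "socketpair"
    else
        echo "pipe"
    fi
}

# Example, run from the top of a configured source tree:
#   uses_socketpair config.h
```

Checking this before each speed test makes it clear whether a given binary is exercising the socketpair() path or the pipe() path.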
Comment 15 roland 2013-12-25 22:53:30 UTC
I'm not sure it's good to revert a "bugfix". Wouldn't it be better to add a Cygwin-specific command-line option that skips socketpair()? That would make it easier to work around problems if socketpair() still causes issues.
Comment 16 Wayne Davison 2013-12-26 00:39:28 UTC
This was never really a bugfix, but instead a kluge to work around defects in Cygwin, one that does not appear to be needed anymore.
Comment 17 roland 2013-12-26 10:41:49 UTC
seems you are right:
http://victorwyee.com/cmd/solving-rsync-hangs-in-cygwin-via-script-or-recompiling/
..."I still encounter the occasional hang with this rsync 3.1.0 with socketpairs – but not as many as before"....