The Samba-Bugzilla – Bug 6378
Erroneous progress output due to ssh send buffer
Last modified: 2009-05-28 09:30:35 UTC
rsync -v --inplace --progress --rsh=ssh -a jing-20081028-1mdv2010.0.src.rpm firstname.lastname@example.org:
The transferred size almost immediately jumps to 2129920 bytes, which is completely unrealistic given that my ADSL connection has an upstream of at most 25 KBytes/second. Afterwards it hangs after transferring 262,144 bytes, and I constantly need to restart it. Please report the actual bytes transferred over the connection rather than some unrealistic value.
I'm using --inplace so I can resume interrupted connections. (My ISP connection has become flaky lately.)
I'm using rsync-3.0.6-1mdv2010.0 on Mandriva Cooker.
As stated in the man page description of --progress, the size shown is the amount of the file reconstructed by the receiver so far. I'm guessing the receiver already had the correct first 2129920 bytes of the file from a previous run, so the delta-transfer algorithm verifies quickly that those 2129920 bytes match the beginning of the source file and the size jumps to 2129920. From that point on, literal data has to be sent, so progress is slower.
(In reply to comment #1)
> As stated in the man page description of --progress, the size shown is the
> amount of the file reconstructed by the receiver so far. I'm guessing the
> receiver already had the correct first 2129920 bytes of the file from a
> previous run,
That's not true - it's a brand-new transfer. And ls -l on the file at the remote end shows nowhere near that many bytes (as expected given my connection).
> so the delta-transfer algorithm verifies quickly that those
> 2129920 bytes match the beginning of the source file and the size jumps to
> 2129920.
Well, it's not what happens in this case.
> From that point on, literal data has to be sent, so progress is
> slower.
I'm re-opening this bug because your explanation does not apply here. Sorry for not stating earlier that this was a brand-new transfer.
I see what is going on here. In the current rsync implementation, progress is always computed by the client, which in this case is the sender. The local ssh process has a 2 MB buffer for outgoing data (see CHAN_SES_WINDOW_DEFAULT in http://www.openbsd.org/cgi-bin/cvsweb/src/usr.bin/ssh/channels.h?rev=1.98;content-type=text%2Fx-cvsweb-markup ). The rsync sender fills up this buffer with data and counts that as 2 MB of progress.
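To illustrate the effect described above, here is a minimal simulation sketch. It is not rsync or ssh code; the function name `simulate`, the 10 MiB file size, and the one-second tick model are all assumptions made for illustration. Only the 2 MB buffer and the 25 KBytes/second upstream come from this report.

```python
# Toy model (hypothetical, not rsync code): a sender writes into a local
# buffer as fast as it can, while the link drains that buffer slowly.

def simulate(file_size, buffer_size, link_rate, ticks):
    """Return one (sender_count, receiver_count) pair per one-second tick."""
    sent = 0        # bytes the sender has handed to the local buffer
    received = 0    # bytes that have actually crossed the link
    timeline = []
    for _ in range(ticks):
        # The sender refills the buffer instantly, up to its capacity.
        sent = min(file_size, received + buffer_size)
        timeline.append((sent, received))
        # The link then moves at most link_rate bytes during this tick.
        received = min(sent, received + link_rate)
    return timeline


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Assumed values: 10 MiB file (made up), 2 MiB buffer, 25 KiB/s upstream.
    for second, (sent, received) in enumerate(
        simulate(10 * MIB, 2 * MIB, 25 * 1024, 5)
    ):
        print(f"t={second}s sender-side count={sent} on the wire={received}")
```

In this model the sender-side count reaches 2 MiB on the very first tick while nothing has yet crossed the link; at 25 KiB/s, draining that much data would take roughly 82 seconds.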
The obvious approach to fixing this would be to always have the receiver compute progress, which would involve a protocol change. The problem is that, since rsync is pipelined, while the receiver is reporting progress for one file, the sender may have moved on to another file. Thus, to get consistent output, the sender would have to wait to output a filename until it gets acknowledgment that the receiver has started to receive the file. And that has a minor security ramification: a user who realizes that the source directory contains a file the remote server shouldn't see and interrupts rsync can no longer assume that the information has not been compromised if the name of the secret file was not in the output. Thoughts?
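The ordering constraint in this proposal can be sketched roughly as follows. This is purely a hypothetical illustration, not part of the rsync protocol: the class `DeferredNamePrinter` and its method names are invented here, and it assumes the receiver acknowledges files in the same order the sender started them.

```python
from collections import deque


class DeferredNamePrinter:
    """Hypothetical sketch: hold back each filename until the receiver
    acknowledges that it has started receiving that file."""

    def __init__(self):
        self.pending = deque()  # names sent but not yet acknowledged
        self.output = []        # lines that would be shown to the user

    def file_sent(self, name):
        # The pipelined sender may be several files ahead of the receiver,
        # so the name is queued rather than printed right away.
        self.pending.append(name)

    def receiver_started(self, name):
        # Assumption: acknowledgments arrive in the order files were sent.
        expected = self.pending.popleft()
        if expected != name:
            raise ValueError(f"expected ack for {expected!r}, got {name!r}")
        self.output.append(name)

    def receiver_progress(self, name, nbytes):
        # Progress is reported by the receiver, so it reflects data that
        # actually arrived rather than data sitting in a send buffer.
        self.output.append(f"{name}: {nbytes} bytes")


if __name__ == "__main__":
    printer = DeferredNamePrinter()
    printer.file_sent("a.rpm")
    printer.file_sent("secret.txt")   # sent, but not yet shown to the user
    printer.receiver_started("a.rpm")
    printer.receiver_progress("a.rpm", 262144)
    print("\n".join(printer.output))
    print("still pending:", list(printer.pending))
```

The usage at the bottom shows the security concern: `secret.txt` has already been handed to the transport, yet it does not appear in the output until the receiver acknowledges it.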