I would imagine that it is common to use rsync to transfer a file on top of a
partial download of itself (or on top of a prefix of the file, as arises when
data is being appended to the source file). However, for large files this is
extremely slow, because a great many small, constant-size blocks have to be
compared.
While the --block-size option can help with this, a suitable block size has to
be calculated for each rsync invocation to avoid retransmitting an average of
half a block of data unnecessarily.
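As a rough illustration of that per-invocation calculation, the sketch below picks a block size near the square root of the file length, which is similar in spirit to rsync's own default heuristic; the exact floor and cap constants here are assumptions, not rsync's actual values:

```python
import math

def suggested_block_size(file_len: int) -> int:
    """Pick a block size near sqrt(file length), clamped to an
    assumed floor (700 bytes) and cap (128 KiB)."""
    if file_len <= 0:
        return 700  # assumed minimum for tiny or empty files
    size = int(math.sqrt(file_len))
    return max(700, min(size, 1 << 17))

# A 4 GiB file gets a 64 KiB block rather than rsync's small default:
print(suggested_block_size(4 * 1024**3))  # → 65536
```

The point is only that the "right" block size depends on the file length, so a caller would have to recompute it (and pass it via --block-size) every time.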
I would suggest that modifying the rsync algorithm to initially compare chunks
of exponentially increasing size until a mismatch is found would probably pay
for itself in total bandwidth saved. Even if you disagree with that,
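The exponential probing proposed above can be sketched as follows; this is a hypothetical illustration of the idea in local byte-level terms, not rsync's actual checksum-based algorithm:

```python
def mismatch_offset(src: bytes, dst: bytes) -> int:
    """Find the first offset where dst diverges from src by comparing
    chunks of exponentially growing size (1, 2, 4, ... bytes), then
    scanning byte-by-byte inside the first chunk that differs."""
    off, chunk = 0, 1
    limit = min(len(src), len(dst))
    while off < limit:
        n = min(chunk, limit - off)
        if src[off:off + n] != dst[off:off + n]:
            # Mismatch is somewhere in this chunk; narrow it down.
            for i in range(n):
                if src[off + i] != dst[off + i]:
                    return off + i
        off += n
        chunk *= 2  # double the probe size after each match
    return off  # one input is a prefix of the other

print(mismatch_offset(b"hello world", b"hello_world"))  # → 5
```

For a file that is an exact prefix of the source, this confirms the shared prefix in O(log n) comparisons instead of one comparison per fixed-size block.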
a quick-and-dirty fix would be an option that makes rsync check, before running
the full algorithm, whether the larger file is simply the smaller file with
data appended. I believe this wouldn't take more than a couple of minutes for
someone familiar with rsync internals.
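The quick-and-dirty append check amounts to: hash the first `partial_len` bytes of the source file, compare against the receiver's digest of its partial copy, and if they match, send only the tail. The function below is a hypothetical sketch of that idea (names and the SHA-256 choice are assumptions, and this is not rsync's wire protocol):

```python
import hashlib

def append_tail(source_path: str, partial_len: int, partial_digest: bytes):
    """If the first partial_len bytes of the source file hash to
    partial_digest, return the remaining bytes (the tail to append);
    otherwise return None so the caller falls back to full rsync."""
    h = hashlib.sha256()
    with open(source_path, "rb") as f:
        remaining = partial_len
        while remaining:
            buf = f.read(min(1 << 16, remaining))
            if not buf:
                return None  # source is shorter than the partial copy
            h.update(buf)
            remaining -= len(buf)
        if h.digest() != partial_digest:
            return None  # not a simple append; blocks differ somewhere
        return f.read()  # everything after the verified prefix
```

One full read of the prefix on each side replaces the per-block checksum exchange for the common appended-data case.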
It certainly seems odd that rsync is essentially unusable for something that
wget --continue deals with.
To help searching for this bug: log files, append, appending, live streams,
partial download, aborted download, interrupted download, restarting rsync,
restart
The --append option was added (it's in CVS now) to make it easy to continue
sending large files.
Because of the pipelined nature of the current rsync algorithm, it is not
possible for the two sides to interact in a block-probing exchange (which
would also slow the algorithm down due to round-trip delays).