I am not sure if I have found a bug or not, but rsync is certainly not doing what I would expect.
We do nightly database dumps, then use rsync to delta them over to the backup server. The dump is approximately 100GB, but little changes between runs, so rsync completes over a 2Mbit connection in about 3 hours.
We use the --partial-dir switch.
The problem happens if the rsync fails due to a connection error, leaving a partial file behind in the partial directory.
After this the source file changes again (the database is dumped again). Typically the source file has changed by only a few bytes.
Here is the bug - when rsync is run again, it uploads the ENTIRE file, which takes weeks.
It should EITHER:
Best case - delta with the partial file to determine what portion of it is still useful and valid, then continue with the delta blocking between the source and the destination.
Simpler case - delete the partial file and start again.
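Until rsync handles this itself, the simpler case can be approximated from the script that launches rsync. The following is only a sketch under stated assumptions: discard_stale_partial is a hypothetical helper (not part of rsync), it assumes the partial file has the same basename as the source, and it uses modification times as a heuristic for "the dump was regenerated after the interrupted transfer".

```shell
#!/bin/bash
# Hypothetical helper, not an rsync feature: delete a leftover partial file
# when the source file is newer than it.
#   $1 = source file, $2 = partial directory (illustrative arguments)
discard_stale_partial() {
    local src=$1
    local partial_dir=$2
    local partial
    partial="$partial_dir/$(basename "$src")"

    # -nt is true when $src has a newer modification time than $partial.
    if [ -f "$partial" ] && [ "$src" -nt "$partial" ]; then
        rm -f -- "$partial"
        echo "removed stale partial: $partial"
    fi
}
```

It would be called just before rsync, for example: discard_stale_partial /root/dir1/myfile.out /root/partial. This only fits a setup where source and partial file are visible from one machine, as in the local reproduction; for the real backup the partial file lives on the backup server, so the comparison would need adapting.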
The problem can be recreated by following these steps, using --no-whole-file to force rsync to use the delta algorithm.
mkdir /root/dir1 /root/dir2 /root/partial
dd if=/dev/zero of=/root/dir2/myfile.out bs=1M count=10
cp /root/dir2/myfile.out /root/dir1/myfile.out
Put a tiny file in the partial directory to simulate the partial file that rsync created the last time it ran, before it was disconnected.
echo hello > /root/partial/myfile.out
[root@]# rsync -a --verbose --progress --partial-dir=/root/partial /root/dir1/myfile.out /root/dir2/myfile.out
sending incremental file list
myfile.out
10485760 100% 33.68MB/s 0:00:00 (xfer#1, to-check=0/1)
sent 10487116 bytes received 32 bytes 20974296.00 bytes/sec
total size is 10485760 speedup is 1.00
Even before the rsync run, dir1/myfile.out and dir2/myfile.out were identical, so nothing needed to be sent (apart from checksums).
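The speedup figure in the output makes the same point. As far as I understand it, rsync reports speedup as the total file size divided by the bytes sent plus the bytes received, so a value of 1.00 means the delta algorithm saved essentially nothing. A small sketch of that arithmetic, using the numbers from the run above (the speedup function name is only illustrative):

```shell
#!/bin/sh
# Ratio of the total file size to the bytes that actually crossed the wire.
#   $1 = total size, $2 = bytes sent, $3 = bytes received
speedup() {
    awk -v total="$1" -v sent="$2" -v recv="$3" \
        'BEGIN { printf "%.2f\n", total / (sent + recv) }'
}

# Figures taken from the rsync output above.
speedup 10485760 10487116 32
```

Had only checksums and a few changed blocks been transferred, the divisor would be a small fraction of the file size and the speedup would be far above 1.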
Why oh why does it send the whole file? When surely it should say, ok, the partial is different, I'll move on and delta between dir1 and dir2?
Using a local transfer is different than a network transfer. From the manpage:
-W, --whole-file
This is the default when both the source and destination are specified as local paths, ...
(In reply to comment #1)
> Using a local transfer is different than a network transfer. From the manpage:
> -W, --whole-file
> This is the default when both the source and destination are specified as
> local paths, ...
As part of the example, I forgot to include the --no-whole-file parameter as I was trying to demo it on the same system.
But the same happens: if the new source file has changed slightly from the partial upload file, rsync uploads everything and ignores the delta algorithm.
[root@]# rsync -a --verbose --progress --partial-dir=/root/partial --no-whole-file dir1/myfile.out dir2/myfile.out
sending incremental file list
myfile.out
10485766 100% 10.21MB/s 0:00:00 (xfer#1, to-check=0/1)
sent 10487126 bytes received 38 bytes 6991442.67 bytes/sec
total size is 10485766 speedup is 1.00
What about using a temp dir (--temp-dir=DIR) that would be empty before each run of rsync?
*** This bug has been marked as a duplicate of bug 7123 ***