Bug 7922 - rsync not using delta blocking when the source file different to partial file.
Status: RESOLVED DUPLICATE of bug 7123
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.0.7
Hardware: x86 Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
Reported: 2011-01-17 05:23 UTC by smiff
Modified: 2011-01-17 16:03 UTC (History)
Description smiff 2011-01-17 05:23:38 UTC
I am not sure if I have found a bug or not, but rsync is certainly not doing what I would expect.

We do nightly database dumps, then use rsync to delta them over to the backup server. The file size is approx 100GB, but not a lot changes and rsync completes over a 2Mbit connection in about 3 hours.

We use the --partial-dir switch. 

The problem happens if the rsync fails due to connection error.

After this the source file changes again (the database is dumped again). Typically the source file has changed by only a few bytes.

Here is the bug - when rsync is run again, it uploads the ENTIRE file, which takes weeks.

It should EITHER:

Best case - delta with the partial file to determine what portion of it is still useful and valid, then continue with the delta blocking between the source and the destination.

OR

Simpler case - delete the partial file and start again. 
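
Until rsync does this itself, the simpler case can be approximated with a wrapper that runs before each transfer. This is a minimal sketch, not rsync behaviour: the function name clear_stale_partial is hypothetical, and treating "source modified after the partial file was written" as "partial is stale" is an assumption that may not suit every setup.

```shell
#!/bin/bash
# Hypothetical helper (not part of rsync): remove a leftover partial file
# when the source has been modified since the partial file was written.
clear_stale_partial() {
    src=$1
    partial=$2
    # -nt is true when $src has a newer modification time than $partial.
    if [ -f "$partial" ] && [ "$src" -nt "$partial" ]; then
        rm -f -- "$partial"
        echo "removed stale partial: $partial"
    fi
}
```

With the paths used in this report, the call before each rsync run would be: clear_stale_partial /root/dir1/myfile.out /root/partial/myfile.out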

----------

REPRODUCE:

This can be recreated by following these steps, using --no-whole-file to force rsync to use the delta algorithm.


mkdir /root/dir1 /root/dir2 /root/partial
dd if=/dev/zero of=/root/dir2/myfile.out bs=1M count=10
cp /root/dir2/myfile.out /root/dir1/myfile.out

Put a small file (a few bytes) in the partial directory to simulate the partial file that rsync left behind the last time it ran and was disconnected.

echo hello > /root/partial/myfile.out

[root@]# rsync -a --verbose --progress --partial-dir=/root/partial /root/dir1/myfile.out /root/dir2/myfile.out
sending incremental file list
myfile.out
    10485760 100%   33.68MB/s    0:00:00 (xfer#1, to-check=0/1)

sent 10487116 bytes  received 32 bytes  20974296.00 bytes/sec
total size is 10485760  speedup is 1.00


Even before the rsync run, dir1/myfile.out and dir2/myfile.out were identical, so nothing needed to be sent (apart from checksums).

Why oh why does it send the whole file, when surely it should say: OK, the partial is different, I'll move on and delta between dir1 and dir2?
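
The "speedup is 1.00" line in the summary above is what shows that no delta savings occurred. As a rough check, assuming rsync reports speedup as the total file size divided by the bytes sent plus the bytes received, the figures from that run can be recomputed like this (the variable names are only for illustration):

```shell
#!/bin/sh
# Figures copied from the rsync summary above.
sent=10487116
received=32
total_size=10485760

# Assumed formula: speedup = total file size / bytes that crossed the wire.
speedup=$(awk -v t="$total_size" -v s="$sent" -v r="$received" \
    'BEGIN { printf "%.2f", t / (s + r) }')
echo "speedup: $speedup"
```

A delta transfer of two identical 10 MB files would be expected to send only a small amount of data, giving a speedup far above 1.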
Comment 1 Paul Slootman 2011-01-17 05:35:13 UTC
Using a local transfer is different than a network transfer. From the manpage:

-W, --whole-file
    ....
    This is the default when both the source and destination are specified as local paths, ...
Comment 2 smiff 2011-01-17 05:40:10 UTC
(In reply to comment #1)
> Using a local transfer is different than a network transfer. From the manpage:
> 
> -W, --whole-file
>     ....
>     This is the default when both the source and destination are specified as
> local paths, ...
> 

As part of the example, I forgot to include the --no-whole-file parameter as I was trying to demo it on the same system.

But the same happens: if the new source file has changed slightly from the partial upload file, rsync uploads everything and ignores the delta algorithm.

[root@]# rsync -a --verbose --progress  --partial-dir=/root/partial --no-whole-file dir1/myfile.out dir2/myfile.out
sending incremental file list
myfile.out
    10485766 100%   10.21MB/s    0:00:00 (xfer#1, to-check=0/1)

sent 10487126 bytes  received 38 bytes  6991442.67 bytes/sec
total size is 10485766  speedup is 1.00


Comment 3 Benjamin ANDRE 2011-01-17 06:14:46 UTC
What about using a temp dir (--temp-dir=DIR) that would be empty before each run of rsync?

Benjamin ANDRE
Comment 4 Matt McCutchen 2011-01-17 16:03:52 UTC

*** This bug has been marked as a duplicate of bug 7123 ***