Bug 4079 - rsync fails with --inplace, --link-dest and --no-whole-file
Summary: rsync fails with --inplace, --link-dest and --no-whole-file
Status: CLOSED FIXED
Alias: None
Product: rsync
Classification: Unclassified
Component: core (show other bugs)
Version: 2.6.8
Hardware: x86 Linux
: P3 major (vote)
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-09-04 10:22 UTC by Roger Lucas
Modified: 2006-10-15 13:26 UTC (History)
1 user (show)

See Also:


Attachments
Fix --inplace when basis file doesn't match dest file (1.43 KB, patch)
2006-09-17 18:18 UTC, Wayne Davison
no flags Details
Fix --inplace when basis file doesn't match dest file (1.22 KB, patch)
2006-09-17 18:25 UTC, Wayne Davison
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Roger Lucas 2006-09-04 10:22:17 UTC
System: Debian sarge (x86) with a locally compiled 2.6.8 version of rsync with the ACL patches applied.  Kernel updated to 2.6.16.20.

The command below fails if there is a reference file in "/tmp/test/b" that is slightly different to the file in "/tmp/test/a" and the destination file does not exist in "/tmp/test/c".

rsync -a --no-whole-file --inplace --link-dest=/tmp/test/b tmp/test/a/ tmp/test/c/

If rsync is run with -vvvv, the RSYNC algorithm appears to run correctly and determine the blocks in the file that are missing.  Unfortunately, only the changed data is written to the file in "/tmp/test/c/" and the rest of the file is zeros. 

I suspect this is because an empty destination file is created when the file doesn't exist rather than hard-linking the destination file to the link-dest file before performing the in-place copy.
Comment 1 Paul Slootman 2006-09-04 10:44:12 UTC
I can confirm this with the standard Debian version 2.6.7-1 and 2.6.8-2 (no ACL patches).
Comment 2 Roger Lucas 2006-09-04 10:57:44 UTC
The patch below will detect the condition where we are "inplace", have a destination file that does not exist and an existing reference file.  When this happens, the destination file will be hardlinked to the reference file before being opened.  This resolves the bug.  There is probably a cleaner way of detecting the missing destination file than trying to open it read-only and handling the fail state.

There is also the question of whether this is the correct behaviour since the modifications also get applied to the reference file because of the hard-link.  For me, the hard-link is what I would want to happen, but other people may prefer to have rsync copy the matching data from the reference file to the destination file so that the reference file is left unchanged.

diff -aur rsync-2.6.8/receiver.c wip/receiver.c
--- rsync-2.6.8/receiver.c      2006-09-04 16:48:46.000000000 +0100
+++ wip/receiver.c      2006-09-04 16:46:43.000000000 +0100
@@ -560,6 +560,21 @@

                /* We now check to see if we are writing file "inplace" */
                if (inplace)  {
+
+                       if ((fd1 != -1) && (fname != fnamecmp))
+                       {
+                               fd2 = do_open(fname, O_RDONLY, 0);
+                               if (fd2 < 0)
+                               {
+                                       rprintf(FINFO,"Adding hard-link %s -> %s\n", fname, fnamecmp);
+                                       do_link(fnamecmp, fname);
+                               }
+                   else
+                               {
+                                       close(fd2);
+                               }
+                       }
+
                        fd2 = do_open(fname, O_WRONLY|O_CREAT, 0600);
                        if (fd2 == -1) {
                                rsyserr(FERROR, errno, "open %s failed",
Comment 3 Paul Slootman 2006-09-04 11:21:15 UTC
Hardlinking and then modifying the existing file in the --link-dest directory
is absolutely the wrong thing to do IMNSHO, this will break many "incremental"
backup strategies such as dirvish. (Dirvish at this time does not use --inplace,
although I was just testing that out over the last couple of days... I guess I
can throw those backups away now :-( )

The manpage says:

     --link-dest=DIR         hardlink to files in DIR when unchanged

However, in these cases, the file _is_ changed, so hardlinking is the wrong
thing to do.
Comment 4 Roger Lucas 2006-09-04 11:59:32 UTC
Fair comment.  I don't feel comfortable digging into rsync to try and optimally copy over the reference file to the destination, however, so I am afraid that at this point I will have to request that someone more familiar with the rsync source fix this one.
Comment 5 Wayne Davison 2006-09-17 18:18:20 UTC
Created attachment 2142 [details]
Fix --inplace when basis file doesn't match dest file

Here's a patch that should fix the bug.
Comment 6 Wayne Davison 2006-09-17 18:25:08 UTC
Created attachment 2143 [details]
Fix --inplace when basis file doesn't match dest file

Actually, that one change prior to the ftruncate() call isn't right.  This patch leaves that out.
Comment 7 Wayne Davison 2006-09-17 18:28:24 UTC
Fix is now in CVS.