Bug 12036 - Multiple --link-dest, --copy-dest, or --compare-dest flags produce incorrect behavior
Status: NEW
Product: rsync
Classification: Unclassified
Component: core
Version: 3.1.2
Hardware: All Linux
Importance: P5 normal
Target Milestone: ---
Assigned To: Wayne Davison
QA Contact: Rsync QA Contact
Reported: 2016-07-25 22:52 UTC by Chris Kuehl
Modified: 2016-07-25 22:54 UTC (History)

Attachments
reproduction (349 bytes, application/x-shellscript)
2016-07-25 22:52 UTC, Chris Kuehl
proposed patch (1.02 KB, patch)
2016-07-25 22:54 UTC, Chris Kuehl

Description Chris Kuehl 2016-07-25 22:52:42 UTC
Created attachment 12288 [details]
reproduction

We have observed what seems like incorrect behavior when using a command like this:

    rsync -avc --link-dest=../copy_dest/good --link-dest=../copy_dest/bad src/ dest

with a directory structure that looks like this:

    .
    ├── copy_dest
    │   ├── bad
    │   │   └── file       # contains different content from src, but same attributes (e.g. mtime)
    │   └── good
    │       └── file       # contains same content as src, but different attributes (e.g. mtime)
    └── src
        └── file

Using the command above, we see that "bad/file" is hard-linked into "dest", even though its content differs from "src/file".

I've attached repro.sh, which reliably reproduces this for me on the latest version of rsync (3.1.2).
Comment 1 Chris Kuehl 2016-07-25 22:53:49 UTC
Looking through the code, this sticks out to me:

    static int try_dests_reg(struct file_struct *file, char *fname, int ndx,
                 char *cmpbuf, stat_x *sxp, int find_exact_for_existing,
                 int itemizing, enum logcode code)
    {
        STRUCT_STAT real_st = sxp->st;
        int best_match = -1;
        int match_level = 0;
        int j = 0;

        do {
            pathjoin(cmpbuf, MAXPATHLEN, basis_dir[j], fname);
            if (link_stat(cmpbuf, &sxp->st, 0) < 0 || !S_ISREG(sxp->st.st_mode))
                continue;
            switch (match_level) {
            case 0:
                best_match = j;
                match_level = 1;
                /* FALL THROUGH */
            case 1:
                if (!unchanged_file(cmpbuf, file, &sxp->st))
                    continue;
                best_match = j;
                match_level = 2;
                /* FALL THROUGH */
            case 2:
                if (!unchanged_attrs(cmpbuf, file, sxp)) {
                    free_stat_x(sxp);
                    continue;
                }
                best_match = j;
                match_level = 3;
                break;
            }
            break;
        } while (basis_dir[++j] != NULL);

It looks to me like on the first iteration of the loop, we match all the way through to "match_level = 2" with the file from "copy_dest/good", which has the same content. The mtime doesn't match, though, so unchanged_attrs() fails and we "continue" to the next basis directory with "match_level" still set to 2.

On the second iteration of the loop, "match_level" is still 2, so the switch jumps straight to "case 2" and we only compare the attributes of the file from "copy_dest/bad" (never the content). Those attributes match, so we set "match_level = 3", break from the loop, and link the wrong file into dest.
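
To make the two iterations above easier to follow, here is a minimal standalone model of the quoted loop. It is only a sketch: "struct candidate" and "pick_basis_quoted" are hypothetical names that do not exist in rsync, and the three flags stand in for the results of link_stat()/S_ISREG(), unchanged_file(), and unchanged_attrs() rather than calling them.

```c
#include <stddef.h>

/* Hypothetical stand-in for one basis-dir entry (not an rsync type).
 * Each flag models the result of a check in the quoted loop. */
struct candidate {
    int exists;        /* link_stat() succeeded and S_ISREG() is true */
    int same_content;  /* what unchanged_file() would return */
    int same_attrs;    /* what unchanged_attrs() would return */
};

/* Mirrors the control flow of the quoted loop. Requires n >= 1.
 * Returns best_match and stores the final match_level in *level_out. */
static int pick_basis_quoted(const struct candidate *c, size_t n,
                             int *level_out)
{
    int best_match = -1;
    int match_level = 0;
    size_t j = 0;

    do {
        if (!c[j].exists)
            continue;
        /* match_level persists across candidates, so a later candidate
         * can enter at case 2 and skip the content check entirely. */
        switch (match_level) {
        case 0:
            best_match = (int)j;
            match_level = 1;
            /* FALL THROUGH */
        case 1:
            if (!c[j].same_content)
                continue;
            best_match = (int)j;
            match_level = 2;
            /* FALL THROUGH */
        case 2:
            if (!c[j].same_attrs)
                continue;
            best_match = (int)j;
            match_level = 3;
            break;
        }
        break;
    } while (++j < n);

    *level_out = match_level;
    return best_match;
}
```

With the layout from the description, index 0 ("good") would be {1, 1, 0} and index 1 ("bad") would be {1, 0, 1}. In this model the function returns index 1 with match_level 3, which corresponds to "bad/file" being treated as an exact match.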

I've attached my attempt at a patch to correct this.
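
The attached patch is not reproduced here. Purely as an illustration of the kind of change that would avoid the problem, and not necessarily what the patch does, the sketch below runs every candidate through the content check before its attributes are considered. "struct basis_file" and "pick_basis_checked" are hypothetical names, and the flags again model link_stat()/S_ISREG(), unchanged_file(), and unchanged_attrs().

```c
#include <stddef.h>

/* Hypothetical stand-in for one basis-dir entry (not an rsync type). */
struct basis_file {
    int exists;        /* link_stat() succeeded and S_ISREG() is true */
    int same_content;  /* what unchanged_file() would return */
    int same_attrs;    /* what unchanged_attrs() would return */
};

/* Every candidate is checked for content before attributes, no matter
 * how far an earlier candidate got. match_level only ever increases.
 * Returns best_match and stores the final match_level in *level_out. */
static int pick_basis_checked(const struct basis_file *c, size_t n,
                              int *level_out)
{
    int best_match = -1;
    int match_level = 0;
    size_t j;

    for (j = 0; j < n; j++) {
        if (!c[j].exists)
            continue;
        if (match_level == 0) {
            /* First regular file found: usable as a fallback basis. */
            best_match = (int)j;
            match_level = 1;
        }
        if (!c[j].same_content)
            continue;
        if (match_level == 1) {
            /* First candidate whose content matches. */
            best_match = (int)j;
            match_level = 2;
        }
        if (c[j].same_attrs) {
            /* Content and attributes both match: exact match, stop. */
            best_match = (int)j;
            match_level = 3;
            break;
        }
    }

    *level_out = match_level;
    return best_match;
}
```

For the layout from the description ({1, 1, 0} for "good", {1, 0, 1} for "bad"), this model returns index 0 with match_level 2, so "bad/file" is never selected on attributes alone.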
Comment 2 Chris Kuehl 2016-07-25 22:54:40 UTC
Created attachment 12289 [details]
proposed patch