Bug 8712 - --link-dest doesn't work if target file exists (but needs updating)
Status: RESOLVED DUPLICATE of bug 5644
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.0.7
Hardware: All Linux
Importance: P5 normal
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
Reported: 2012-01-20 23:19 UTC by Brian J. Murrell
Modified: 2013-01-18 20:01 UTC

Description Brian J. Murrell 2012-01-20 23:19:37 UTC
Given two files:

# /bin/ls -li {source,dest}/var/lib/mysql/mythconverg/jobqueue.MYD
44695517 -rw-rw---- 2 sshd messagebus 5492 2012-01-19 00:31 source/var/lib/mysql/mythconverg/jobqueue.MYD
49676928 -rw-rw---- 1 sshd messagebus 5492 2012-01-18 12:00 dest/var/lib/mysql/mythconverg/jobqueue.MYD

Their md5sums:

# md5sum {source,dest}/var/lib/mysql/mythconverg/jobqueue.MYD
b87dafdcf59ab3e7e9907c15385175b3  source/var/lib/mysql/mythconverg/jobqueue.MYD
c03b1ee584ef1405958b695f4e55b51d  dest/var/lib/mysql/mythconverg/jobqueue.MYD

Let's try to create a link from
source/var/lib/mysql/mythconverg/jobqueue.MYD to
dest/var/lib/mysql/mythconverg/jobqueue.MYD

Dry-run first to see what rsync tells us it's going to do:

# rsync -naiiAXH --link-dest=/source/var/lib/mysql/mythconverg/ source/var/lib/mysql/mythconverg/jobqueue.MYD dest/var/lib/mysql/mythconverg/
> f..t...... jobqueue.MYD
Looks like it's not going to create a link.  Let's be sure:

# rsync -aiiAXH --link-dest=/source/var/lib/mysql/mythconverg/ source/var/lib/mysql/mythconverg/jobqueue.MYD dest/var/lib/mysql/mythconverg/
> f..t...... jobqueue.MYD
Yup, looks like it didn't create a link.  The proof:

# /bin/ls -li {source,dest}/var/lib/mysql/mythconverg/jobqueue.MYD
44695517 -rw-rw---- 2 sshd messagebus 5492 2012-01-19 00:31 source/var/lib/mysql/mythconverg/jobqueue.MYD
49676897 -rw-rw---- 1 sshd messagebus 5492 2012-01-19 00:31 dest/var/lib/mysql/mythconverg/jobqueue.MYD
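
Instead of comparing the inode numbers in the `ls -li` output by eye, the shell's `-ef` test can confirm whether two paths are hard links to the same file. A minimal sketch, assuming a shell whose `test` supports `-ef` (bash and dash do); the function name `is_hardlinked` and the throwaway files are hypothetical and not part of the original report:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when both paths refer to the same
# inode on the same device, i.e. they are hard links to one file.
is_hardlinked() {
    [ "$1" -ef "$2" ]
}

# Demonstration on throwaway files (removed when the shell exits).
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
printf 'data\n' > "$demo/a"
ln "$demo/a" "$demo/linked"    # hard link: same inode as a
cp "$demo/a" "$demo/copied"    # copy: same content, new inode

is_hardlinked "$demo/a" "$demo/linked" && echo "a and linked share an inode"
is_hardlinked "$demo/a" "$demo/copied" || echo "a and copied are separate files"
```

In the listing above, the two jobqueue.MYD paths would fail this check: the inode numbers differ and dest still has a link count of 1.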

Yet it did copy the file:

# md5sum {source,dest}/var/lib/mysql/mythconverg/jobqueue.MYD
b87dafdcf59ab3e7e9907c15385175b3  source/var/lib/mysql/mythconverg/jobqueue.MYD
b87dafdcf59ab3e7e9907c15385175b3  dest/var/lib/mysql/mythconverg/jobqueue.MYD

Why didn't it remove the existing file and create the link like we
wanted it to with --link-dest?  Let's try another file just to prove
that it will create a link if the destination file doesn't exist:

Two different files, same use case though:

# /bin/ls -li {source,dest}/var/lib/mysql/mythconverg/jobqueue.MYI
44695518 -rw-rw---- 2 sshd messagebus 5120 2012-01-19 00:31 source/var/lib/mysql/mythconverg/jobqueue.MYI
49676956 -rw-rw---- 1 sshd messagebus 5120 2012-01-18 12:00 dest/var/lib/mysql/mythconverg/jobqueue.MYI

Different content again:

# md5sum {source,dest}/var/lib/mysql/mythconverg/jobqueue.MYI
6a0b5bdedfe738fbca17630e6496f85c  dest/var/lib/mysql/mythconverg/jobqueue.MYI
9052f8e598f8ee3bdcb2cbc510f0564c  source/var/lib/mysql/mythconverg/jobqueue.MYI

See what rsync says it's going to do (before we remove the target file):

# rsync -naiiAXH --link-dest=/source/var/lib/mysql/mythconverg/ source/var/lib/mysql/mythconverg/jobqueue.MYI dest/var/lib/mysql/mythconverg/
> f..t...... jobqueue.MYI
Again, it's going to copy.

But if we remove the target file and see what rsync says it's going to do:

# rm dest/var/lib/mysql/mythconverg/jobqueue.MYI
# rsync -naiiAXH --link-dest=/source/var/lib/mysql/mythconverg/ source/var/lib/mysql/mythconverg/jobqueue.MYI dest/var/lib/mysql/mythconverg/
hf          jobqueue.MYI

Ah ha!  Says it's going to link.  Let's actually do it and see:

# rsync -aiiAXH --link-dest=/source/var/lib/mysql/mythconverg/ source/var/lib/mysql/mythconverg/jobqueue.MYI dest/var/lib/mysql/mythconverg/   
hf          jobqueue.MYI
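
The first character of each itemized (`-i`) line is what distinguishes the two outcomes in this report: `>` marks a file whose data is transferred, while `h` marks an item created as a hard link. A minimal sketch that classifies a line by that character; the function name `classify_itemized` is hypothetical, and the labels paraphrase the rsync man page rather than quote it:

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify one line of `rsync -i` output by the
# update-type character in its first column.
classify_itemized() {
    case $1 in
        '>'*) echo "transfer (file data received)" ;;
        '<'*) echo "transfer (file data sent)" ;;
        h*)   echo "hard link" ;;
        c*)   echo "local change/creation" ;;
        .*)   echo "no data update (attributes may change)" ;;
        *)    echo "other" ;;
    esac
}

# The two lines seen in this report:
classify_itemized '> f..t...... jobqueue.MYD'
classify_itemized 'hf          jobqueue.MYI'
```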

And sure enough, we now have three links to the same file (as we would
expect):

# /bin/ls -li {source,dest}/var/lib/mysql/mythconverg/jobqueue.MYI
44695518 -rw-rw---- 3 sshd messagebus 5120 2012-01-19 00:31 source/var/lib/mysql/mythconverg/jobqueue.MYI
44695518 -rw-rw---- 3 sshd messagebus 5120 2012-01-19 00:31 dest/var/lib/mysql/mythconverg/jobqueue.MYI

And the content (although we know it should report the same):

# md5sum {source,dest}/var/lib/mysql/mythconverg/jobqueue.MYI
9052f8e598f8ee3bdcb2cbc510f0564c  dest/var/lib/mysql/mythconverg/jobqueue.MYI
9052f8e598f8ee3bdcb2cbc510f0564c  source/var/lib/mysql/mythconverg/jobqueue.MYI

So why doesn't the same thing work when the destination file exists?
The principle of least surprise would make me think it should.  That is,
even if the destination file exists, if the result of the sync is the
same as a file that a --link-dest points to, it should remove the
destination file and create the link.
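
The requested behavior can be approximated outside rsync, after the copy has happened. The following is a rough workaround sketch, not rsync's implementation: the function name `relink_if_identical` is hypothetical, and it compares file content only, whereas rsync's --link-dest check also takes the preserved attributes (permissions, ownership, times) into account. After relinking, the destination shares the reference file's inode and therefore its metadata.

```shell
#!/usr/bin/env bash
# Hypothetical workaround sketch (not part of rsync).
# relink_if_identical REF DST: if DST has the same content as REF but
# is a separate inode, replace DST with a hard link to REF.
relink_if_identical() {
    local ref=$1 dst=$2 tmp
    [ -f "$ref" ] && [ -f "$dst" ] || return 1
    [ "$ref" -ef "$dst" ] && return 0      # already linked, nothing to do
    cmp -s -- "$ref" "$dst" || return 1    # content differs, leave DST alone
    tmp="$dst.relink.$$"
    # Create the link under a temporary name, then rename it over DST,
    # so DST never goes missing part-way through.
    ln -- "$ref" "$tmp" && mv -f -- "$tmp" "$dst"
}

# Demonstration on throwaway files (removed when the shell exits).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/ref" "$work/dst"
printf 'same bytes\n' > "$work/ref/f"
cp "$work/ref/f" "$work/dst/f"             # identical content, separate inode
relink_if_identical "$work/ref/f" "$work/dst/f" && echo "dst/f relinked"
```

Both paths must be on the same filesystem for `ln` to succeed, which is the same constraint --link-dest has.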

Discussion on the rsync list:

On 01/20/12 17:56, Brian J. Murrell wrote:
> On 12-01-20 05:42 PM, Kevin Korb wrote:
>> Am I understanding right that your source and your link-dest are
>> actually the same path?
>
> Yes they are!
>
>> If so what are you trying to do that wouldn't be accomplished
>> with a simple 'cp -l'?
>
> Not all of the files in source and dest are supposed to be the
> same.  I only want to link the files that are.
>
>> Also, when using --link-dest the target is treated differently.
>> Normally the target is an empty directory but if files already
>> exist the behavior you are seeing is what is supposed to happen.
>> Instead of deleting files and replacing them with links it
>> assumes that they are already linked to whatever they are
>> supposed to be linked to and updates them accordingly.
>
> That's a pity.  Understandable perhaps for the case where there is
> more than one link to the destination file, but if there is not
> (i.e. there is only one link to the file, such as the example I
> provided) it should feel free to unlink it and relink it to the
> --link-dest specified file. Additionally, I would think an option to
> force that behavior even if there is >1 link would be useful.
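
The policy suggested in the reply above (relink freely when the destination has a single link, and require an explicit override otherwise) could be expressed as a check on the link count. A minimal sketch, assuming GNU coreutils `stat` (its `%h` format prints the number of hard links); the function name `safe_to_relink` and the `force` argument are hypothetical and do not correspond to any rsync option:

```shell
#!/usr/bin/env bash
# Hypothetical policy check: may DST be unlinked and relinked?
# Usage: safe_to_relink DST [force]
safe_to_relink() {
    local dst=$1 force=${2:-} nlink
    nlink=$(stat -c %h -- "$dst") || return 1   # GNU stat: %h = hard link count
    # Allowed when nothing else links to DST, or when explicitly forced.
    [ "$nlink" -eq 1 ] || [ "$force" = force ]
}

# Demonstration on throwaway files (removed when the shell exits).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
printf 'x\n' > "$work/single"
printf 'y\n' > "$work/multi"
ln "$work/multi" "$work/multi.alias"            # link count becomes 2

safe_to_relink "$work/single" && echo "single: ok to relink"
safe_to_relink "$work/multi" || echo "multi: left alone (has other links)"
safe_to_relink "$work/multi" force && echo "multi: relink forced"
```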
Comment 1 Tobias Dussa 2012-05-24 15:06:22 UTC
I'm also hit by this issue.  Is there any prediction as to whether and, if so, when something is going to happen? :)

THX!

Cheers,
Toby.
Comment 2 Teodor Milkov 2013-01-15 14:00:40 UTC
Isn't this a duplicate of https://bugzilla.samba.org/show_bug.cgi?id=5644

There's even a patch in 5644. Such behaviour (unlinking changed files and then hard linking to the dest dir) would be very handy, because rotating large directory trees (e.g. 10 million files, 10k files changed) is so much more efficient than deleting them and then repopulating from scratch.
Comment 3 Brian J. Murrell 2013-01-17 23:54:00 UTC
(In reply to comment #2)
> Isn't this a duplicate of https://bugzilla.samba.org/show_bug.cgi?id=5644

It does seem like it, although I'm not 100% sure they are discussing the same thing.  Might be worthwhile asking on bug 5644 for a consult.
Comment 4 Wayne Davison 2013-01-18 20:01:04 UTC
This is changing for 3.1.0.  See bug 5644 for a bit more info.

*** This bug has been marked as a duplicate of bug 5644 ***