Bug 10857 - weirdly named files fail remotely
Summary: weirdly named files fail remotely
Status: RESOLVED WORKSFORME
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.1.1
Hardware: x86 Linux
Importance: P5 normal
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
 
Reported: 2014-10-07 17:40 UTC by samba.org@tange.dk
Modified: 2014-10-20 20:12 UTC

Attachments
Example of failing filename (1.51 KB, application/x-shellscript)
2014-10-07 17:40 UTC, samba.org@tange.dk

Description samba.org@tange.dk 2014-10-07 17:40:42 UTC
Created attachment 10329: Example of failing filename

The attached example creates a file that rsyncs locally, but fails to rsync remotely.
Comment 1 Wayne Davison 2014-10-10 21:08:03 UTC
Your problem is that you specified a weird filename as part of a remote path without escaping its characters to keep the remote shell from interpreting them.  You can use --protect-args to avoid the issue (I have it set as my default because I use "./configure --with-protected-args" whenever I compile rsync).  You also would not have run into the issue if you had specified just a destination path and let rsync send the filename as part of the file list.
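
For illustration, the two alternatives might look as follows. This is a sketch only: the host "remote" and the directory "backup/" are placeholder names, and the functions are merely defined here, so nothing is transferred until one of them is called.

```shell
#!/bin/sh
# Placeholder names (assumptions): host "remote", destination directory "backup/".

# Alternative 1: keep the filename in the remote argument, but pass
# --protect-args so the remote shell does not get to interpret it.
copy_protected() {
    rsync --protect-args "$1" "remote:backup/$1"
}

# Alternative 2: name only the destination directory; rsync then sends the
# filename as part of the file list instead of on the remote command line.
copy_via_file_list() {
    rsync "$1" remote:backup/
}
```

Either function would be called with the local filename as its only argument, quoted once for the local shell as usual.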
Comment 2 samba.org@tange.dk 2014-10-12 15:07:06 UTC
I understand what is happening and I know the workarounds for the bug - that is not the issue.

Why does rsync prefer a syntax where I have to write one thing to transfer the file locally, but something completely different to transfer it remotely?

Can you elaborate on the situations in which this behaviour is beneficial, and on what would break if it were changed so that the same name could be written whether the transfer is local or remote?

I discovered the problem when writing a script, and I find it surprising that you prefer that I deal with the quoting in my script if and only if the transfer is remote, instead of having rsync Do The Right Thing(TM) by default.

There may be situations where the current behaviour is preferable that I am unaware of, and it would be great if you could describe those.
Comment 3 Kevin Korb 2014-10-12 19:20:37 UTC
This isn't an rsync problem; this is the way the shell works.  When you run rsync over ssh, as you are doing there, rsync runs 'ssh remotehost rsync [options] path'.  There is a shell between the sshd process and the remote rsync process.  Therefore you have two layers of shell involved and have to deal with both of them rather than just one.

You would run into the same problem any time you double stack a shell:

    echo "doesn't work"

vs

    sh -c 'echo "doesn't work"'
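
The same effect can be seen without ssh or rsync. In the sketch below a second "sh -c" stands in for the remote shell (an assumption made so that it runs locally), and the filename is hypothetical.

```shell
#!/bin/sh
# Hypothetical filename containing two consecutive spaces.
name='my  file.txt'

# One shell layer: the quoted variable reaches printf as a single argument.
one_layer=$(printf '[%s]' "$name")

# Two shell layers: the name is pasted into a command string that a second
# shell parses again, so it is split into two words at the whitespace.
two_layers=$(sh -c "printf '[%s]' $name")

printf 'one layer:  %s\n' "$one_layer"
printf 'two layers: %s\n' "$two_layers"
```

With one layer the name arrives intact as [my  file.txt]; with two layers it arrives as the two separate words [my][file.txt].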
Comment 4 samba.org@tange.dk 2014-10-12 22:44:07 UTC
Can we start by agreeing that rsync _could_ be aware that it is starting a remote shell and thus _could_ quote anything that needed quoting?

Currently it clearly does not quote and puts that responsibility on the user.

My question is: Why put that responsibility on the user? Why not help the user? Rsync knows in advance that files with weird characters will cause problems, so why not help the user by quoting them correctly?

In other words: What situations will break if --protect-args becomes the default?

Is rsync following the principle of least surprise by not making --protect-args the default, and thus requiring users to write the remote filename differently from the local one?

I would say rsync is breaking that principle: I as a user assume that I can express the remote file name the same way as the local. There might be good reasons for breaking the POLS, but so far I have not heard any. So let me ask again: Are there any?

I can see how the current behaviour can be abused by a malicious user if the user knows that 'root' copies files like this:

    find . -type f -print0 | parallel -0 rsync {} remote:backup/{}

Just:

    touch 'foo`perl -e '"'"'print map{chr}114,109,32,47,101,116,99,47,112,97,115,115,119,100'"'"'|bash`bar'

and next time root copies, /etc/passwd will vanish with NO WARNING AT ALL.
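
The mechanism can be reproduced locally with a harmless payload. In this sketch a second "sh -c" stands in for the remote shell, and the filename only echoes a marker word instead of deleting anything; both are assumptions made for illustration.

```shell
#!/bin/sh
# Harmless stand-in for the malicious filename: the embedded command only
# prints a marker word.
name='foo`echo INJECTED`bar'

# Local case: the quoted variable is passed through literally.
local_arg=$(printf '%s' "$name")

# Simulated remote case: a second shell parses the name again and runs the
# command between the backticks.
remote_arg=$(sh -c "printf '%s' $name")

printf 'local:  %s\n' "$local_arg"
printf 'remote: %s\n' "$remote_arg"
```

The local argument keeps its backticks, while the simulated remote argument becomes fooINJECTEDbar because the embedded command was executed.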

Before I discovered this behaviour, I always assumed that:

    find . -type f -print0 | parallel -0 rsync {} remote:dir/{}

would be just as safe as:

    find . -type f -print0 | parallel -0 rsync {} dir/{}

So if you still want to keep the current behaviour, I believe the man page should be clear that giving the remote file (without --protect-args) may result in mayhem.
Comment 5 Wayne Davison 2014-10-18 17:56:26 UTC
If rsync began quoting remote args, it would have to make assumptions about what needs to be quoted and about the rules for quoting it.  There is also a historical use of arg-splitting, primarily on the source side, e.g. 'rsync -a host:"file1 file2" /dest/', that has been replaced by "rsync -a host:{file1,file2} /dest/" (which results in 2 host:file* args).
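
As a sketch of what such quoting involves, the helper below wraps a string in single quotes for a POSIX sh. "shquote" is a hypothetical name and not part of rsync, and the sketch relies on exactly the kind of assumption described above: that the receiving shell follows POSIX quoting rules. A second "sh -c" stands in for the remote shell.

```shell
#!/bin/sh
# shquote (hypothetical helper): wrap $1 in single quotes, rewriting each
# embedded ' as '\'' so that a POSIX sh reads the string back unchanged.
# Limitation: trailing newlines in $1 are lost to command substitution.
shquote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Hypothetical filename with an apostrophe, a command substitution and
# two consecutive spaces.
name="it's \$(echo INJECTED)  here"
quoted=$(shquote "$name")

# The second shell now sees one literal word, whatever it contains.
round_trip=$(sh -c "printf '%s' $quoted")

printf '%s\n' "$round_trip"
```

A remote shell with different quoting rules would need a different helper, which is the assumption problem in question.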

If we assume that enough time has passed to discard the backward compatibility issue, the more proper solution is to avoid sending such args to the remote shell in the first place, and that is what --protect-args does.  So really we should just make --protect-args the default (forcing users interacting with older rsync versions to specify --no-s when they get an option error).  This is certainly what I have done for years now, and is probably what I should go ahead and make the default in an upcoming configure.

Folks can also affect their own rsync use by putting "export RSYNC_PROTECT_ARGS=1" in their environment, perhaps even in an /etc/profile.d/rsync.sh file.  We can at the very least make this clearer in the manpage.
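
A minimal sketch of such a file, assuming a system whose login shells source /etc/profile.d/*.sh:

```shell
# /etc/profile.d/rsync.sh
# Make --protect-args the default for every rsync started from a shell
# that has sourced this file.
RSYNC_PROTECT_ARGS=1
export RSYNC_PROTECT_ARGS
```

A single command can still opt out with --no-s, for example when talking to an older rsync.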

As for this:

    find . -type f -print0 | parallel -0 rsync {} remote:backup/{}

The better way to do that is to specify -R (and -a) and omit the second {} (especially since the original way can fail if the destination dir doesn't exist).  FYI.
Comment 6 samba.org@tange.dk 2014-10-19 09:25:31 UTC
> As for this:
>
>    find . -type f -print0 | parallel -0 rsync {} remote:backup/{}

Is there also a better way for:

    find . -type f -size +1000 -print0 | parallel -0 rsync {} remote:backup/{}
Comment 7 Wayne Davison 2014-10-20 20:12:32 UTC
Those are both the same on the rsync side of the pipe.  Use: rsync -aR {} dest/ (where dest is a directory path or a host:directory combo for the root dir of the destination hierarchy).
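
A sketch of that form which can be tried locally. It makes two substitutions that should be read as assumptions: "find -exec" replaces parallel to avoid the extra dependency, and a temporary local directory stands in for the host:directory destination. All paths are hypothetical.

```shell
#!/bin/sh
# Local stand-in for the remote case; all paths are temporary and hypothetical.
work=$(mktemp -d)
mkdir -p "$work/src/sub dir" "$work/backup"
: > "$work/src/sub dir/odd  name.txt"

if command -v rsync >/dev/null 2>&1; then
    # -R (--relative) recreates each source path below the destination root,
    # so the destination argument never contains the filename.
    (cd "$work/src" && find . -type f -exec rsync -aR {} "$work/backup/" \;)
    if [ -f "$work/backup/sub dir/odd  name.txt" ]; then
        status=copied
    else
        status=failed
    fi
else
    status=skipped   # rsync is not installed here
fi

printf '%s\n' "$status"
rm -rf "$work"
```

For a remote transfer the destination would be a host:directory combination such as remote:backup/ instead of the temporary directory.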