Bug 14081 - --copy-command option for specifying custom file copying behaviour
Summary: --copy-command option for specifying custom file copying behaviour
Status: NEW
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.1.3
Hardware: All All
Importance: P5 enhancement
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-08-09 12:51 UTC by Haravikk
Modified: 2019-08-09 15:05 UTC
0 users

See Also:


Attachments

Description Haravikk 2019-08-09 12:51:17 UTC
This proposal is for the addition of an option enabling fully custom copying behaviour in rsync, i.e. leveraging rsync primarily for detection of changes.

Custom copying behaviour is added using the --copy-command option, which takes a variable number of arguments in a similar style to the `find` command's `-exec` option, terminated with a semi-colon.

For local copying, the special arguments `{src}` and `{dest}` can be used, and will be replaced with the absolute paths of the source file and its destination respectively. For example:

    rsync --copy-command cp {src} {dest} \; "$source" "$destination"

This performs a completely pointless copy using `cp`, to absolutely no advantage.
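To illustrate the substitution, here is a minimal sketch in plain `sh` of what rsync might do for each file it decides to copy. It is not rsync code: `run_copy_command` is a hypothetical helper, and it assumes that only arguments consisting of exactly `{src}` or `{dest}` are replaced.

```shell
# Hypothetical helper: run_copy_command SRC DEST COMMAND [ARGS...]
# Replaces whole arguments equal to {src} / {dest} and runs the result.
run_copy_command() {
    src=$1 dest=$2
    shift 2
    n=$#
    # Rotate through the argument list once, substituting as we go.
    while [ "$n" -gt 0 ]; do
        arg=$1
        shift
        case $arg in
            '{src}')  arg=$src ;;
            '{dest}') arg=$dest ;;
        esac
        set -- "$@" "$arg"
        n=$((n - 1))
    done
    "$@"
}

# Demonstration with a path containing a space.
subst_dir=$(mktemp -d)
printf 'hello\n' > "$subst_dir/a file.txt"
run_copy_command "$subst_dir/a file.txt" "$subst_dir/copy.txt" cp '{src}' '{dest}'
```

Because each path is substituted as a single argument, names containing spaces need no extra quoting by the user.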

When dealing with a remote source, the copy command will receive the file's data as standard input. When dealing with a remote target, the copy command should produce file data as standard output.
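As a rough sketch of the two remote cases, the pipes below stand in for rsync's transport. The `sh -c` commands and the `copy-command` name given as `$0` are illustrative assumptions, not part of any existing rsync interface.

```shell
stream_dir=$(mktemp -d)
printf 'remote data\n' > "$stream_dir/src.txt"

# Remote source: the copy command reads the file's data from standard
# input and writes the local destination itself.
printf 'remote data\n' | sh -c 'cat > "$1"' copy-command "$stream_dir/dest.txt"

# Remote target: the copy command reads the local source and emits the
# file's data on standard output for rsync to send onwards
# (captured to a file here in place of the network).
sh -c 'cat "$1"' copy-command "$stream_dir/src.txt" > "$stream_dir/sent.txt"
```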

Custom copying behaviour can be useful in a number of situations where copying tools have space-saving features, but do not have change detection or filtering options as flexible as rsync's.

Note: the custom behaviour is *only* triggered when rsync determines that a file's data should be copied. Simple attribute, ownership and permission changes would occur as normal, without invoking the custom command. Likewise, directories etc. will be handled as normal. When a file is to be copied, rsync will skip its normal attempt to find differences with the target and instead invoke the custom copy command, but will still synchronise any basic attributes (owner, permissions etc.).


For more useful examples, macOS offers some particular use cases. One is invoking the `ditto` command to compress a file as it is transferred to an APFS or HFS+ target, which could be taken advantage of like so:

    rsync --copy-command ditto --hfsCompression {src} {dest} \; "$source" "$destination"

Since this enables a form of transparent compression, rsync should still see the copied file as identical to the original. Alternatively, the --nohfsCompression option could be used instead to ensure that compressed files are decompressed, even if the target would have been compatible.

Another useful macOS example is the use of cloning within an APFS volume, which will produce an instantaneous copy requiring no additional space, and can be used like so:

    rsync --copy-command cp -c {src} {dest} \; "$source" "$destination"

Again, since the copy is a clone it should be recognised as identical to the source by rsync on future passes.

In more complex cases, the copy command may produce a file that rsync cannot compare normally, for example if the file is passed through `gzip` or `xz`. In that case, the options controlling comparison by checksum and size will need to be set accordingly.
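A small sketch of why this matters: once the data has gone through `gzip`, the stored copy no longer matches the source in size, and rsync's default quick check compares size and modification time. The file names are made up for the demonstration.

```shell
gz_dir=$(mktemp -d)

# Create a highly compressible source file.
awk 'BEGIN { for (i = 0; i < 1000; i++) print "some repetitive line" }' \
    > "$gz_dir/source.txt"

# Stand-in for a copy command that compresses the data on the way.
gzip -c "$gz_dir/source.txt" > "$gz_dir/dest.txt"

# The arithmetic expansion strips any padding that wc may print.
gz_src_size=$(($(wc -c < "$gz_dir/source.txt")))
gz_dest_size=$(($(wc -c < "$gz_dir/dest.txt")))
echo "source: $gz_src_size bytes, destination: $gz_dest_size bytes"
```

A later run would therefore treat the file as changed unless the comparison is adjusted to match what the copy command produces.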


While it would be nice to see cloning and HFS compression features added directly to rsync, this seemed like a much more flexible alternative, as it also potentially enables the use of various compression and/or encryption tools during transfers as well.
Comment 1 Haravikk 2019-08-09 15:05:41 UTC
Sorry, it just occurred to me that rsync already has a similar style of option in the `--rsh` (`-e`) flag for setting a custom remote shell command, so rather than a `find -exec` style it might make more sense to copy `--rsh` for consistency.

This means commands would take a form like:

    rsync --copy-command 'cp -c {src} {dest}' /path/to/source /path/to/destination

i.e. the command is executed in a style similar to `bash -c`, but with special items substituted.

I wonder actually if instead of custom special {src} and {dest} it might make sense to just pass these as $1 and $2?
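For what it's worth, the positional-parameter variant can be tried today with plain `sh -c`. In this sketch the `copy-command` word is just a placeholder for `$0`, and plain `cp` is used instead of `cp -c` so that it runs outside macOS.

```shell
pos_dir=$(mktemp -d)
printf 'hello\n' > "$pos_dir/source file.txt"

# The user-supplied command string, as it might be given to the option.
copy_command='cp "$1" "$2"'

# What rsync could run per file: source and destination arrive as $1 and
# $2, so nothing is substituted into the command text itself.
sh -c "$copy_command" copy-command "$pos_dir/source file.txt" "$pos_dir/dest file.txt"
```

Since the paths never become part of the command string, spaces and shell metacharacters in file names are not reinterpreted by the shell.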

Of course, it's not the implementation details that matter most but the functionality. Any solution that can account for at least the local copy case would be very useful in certain cases, while a solution that also covers usage with a remote source/destination can always come later.