Bug 14371 - Combined Exclude & Protect Filter Type
Summary: Combined Exclude & Protect Filter Type
Status: RESOLVED WONTFIX
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.2.0
Hardware: All All
Importance: P5 enhancement
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-05-06 17:05 UTC by Haravikk
Modified: 2020-05-23 22:57 UTC
Description Haravikk 2020-05-06 17:05:13 UTC
This proposal is for a new filter rule that combines an exclude and a protect rule into a single line.


REASONING

When using rsync to perform backups, performance can be improved by identifying in advance, on the sending side, any directories that haven't changed. For example, when rsync'ing from a macOS Time Machine backup, if you can compare against a previous backup you've already rsync'd across, it's a simple case of checking directory inodes: any directory whose inode matches the same directory in the previous backup is completely unchanged, and so can be skipped entirely by rsync (no need to check it or its contents).

To do this, I generate a filter list with lines that look like the following:

- /path/to/unchanged/directory/
P /path/to/unchanged/directory/
- /path/to/another/unchanged/directory/
P /path/to/another/unchanged/directory/

This tells rsync to exclude the unchanged directories, but not to delete them on the receiving side (which it otherwise would when --delete-excluded is present, as it should be when taking backup snapshots).
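
As a rough illustration of how such a list might be generated, here is a minimal sketch. The function names gen_skip_rules and inode_of are hypothetical, not part of rsync. It assumes both backups live on the same filesystem, that paths contain no newlines, and that a matching directory inode really does mean "unchanged" (as with hard-linked directories in Time Machine backups; this is not true of ordinary copies).

```shell
#!/bin/sh
# Hypothetical helpers (not part of rsync): emit an exclude + protect pair for
# every directory in CUR whose inode matches the same relative path in PREV.

# Print the inode number of a path using POSIX "ls -di".
inode_of() {
    ls -di -- "$1" 2>/dev/null | awk '{print $1}'
}

# Usage: gen_skip_rules PREV_BACKUP CUR_BACKUP > filter.list
gen_skip_rules() {
    prev=$1
    cur=$2
    skip=
    # find lists a directory before its contents, so once a directory has
    # been emitted everything beneath it can be skipped by prefix.
    (cd "$cur" && find . -mindepth 1 -type d) |
    while IFS= read -r d; do
        rel=${d#./}
        if [ -n "$skip" ]; then
            case "$rel/" in "$skip"*) continue ;; esac
        fi
        [ -d "$prev/$rel" ] || continue
        if [ "$(inode_of "$cur/$rel")" = "$(inode_of "$prev/$rel")" ]; then
            printf '%s\n' "- /$rel/" "P /$rel/"
            skip="$rel/"
        fi
    done
}
```

The output could then be passed to rsync as a filter file, for example with --filter 'merge filter.list'; only the topmost unchanged directory of each subtree is listed, since excluding it already covers its contents.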

Now this works great; the "problem" is that it requires two lines per entry to do what is a very useful trick, so what I would like to see is the addition of a combined rule to do this. For example, S for skip, so that I can do in a single line what I'm currently doing in two (resulting in a filter list that's twice the size it needs to be).

Alternatively, this could be added as an option for either the exclude or protect rule. So the possible syntaxes would be:

S /foo/bar/baz                Skip /foo/bar/baz
-,P /foo/bar/baz              Exclude /foo/bar/baz and protect from deletion
P,S /foo/bar/baz              Protect /foo/bar/baz from deletion and skip checking it

I prefer the first as it feels enough like a distinct operation to potentially warrant its own rule, but I'd be fine with either of the others. -,P is perhaps the most logical (allows the P rule as part of an exclude rule)?
Comment 1 Wayne Davison 2020-05-17 21:51:17 UTC
Just don't use --delete-excluded. For anything that you want to exclude on the sending side without excluding it on the receiving side you should use a "hide" filter rule instead. This way you'll never have 2 rules, only either an "H" rule or a "-" rule.
Comment 3 Haravikk 2020-05-17 22:19:01 UTC
If I remove --delete-excluded then how do I ensure my backups remove items matching new exclusion rules? For example, if I identify a new cache folder or such that I don't want to copy, and add it to my exclusion rules, then surely I'd end up with it stuck on the receiving side if it's already there? I don't really want to have to rely on remembering to clear these manually every time (especially as it may be for multiple sync operations).
Comment 4 Wayne Davison 2020-05-23 06:32:23 UTC
You don't add an exclude rule, you add a hide rule. An exclude rule is a combination of a hide (server side) and a protect (client side). So you choose between the 3 idioms (hide, protect, exclude) depending on if you want the rule to affect one or both sides of the transfer.
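
The choice between the three idioms can be sketched as a small rule emitter. This is only an illustration: emit_rule and the intent names keep, purge and guard are hypothetical, and the mapping assumes the transfer runs with --delete but without --delete-excluded.

```shell
#!/bin/sh
# Hypothetical helper: map an intent to a single rsync filter rule, assuming
# the transfer uses --delete but NOT --delete-excluded.
#   keep   -> skip on the sender and leave the receiver's copy alone  ("-")
#   purge  -> skip on the sender and let --delete remove the copy     ("H")
#   guard  -> still transfer, but never delete on the receiver        ("P")
emit_rule() {
    case "$1" in
        keep)  printf '%s\n' "- $2" ;;
        purge) printf '%s\n' "H $2" ;;
        guard) printf '%s\n' "P $2" ;;
        *)     echo "emit_rule: unknown intent '$1'" >&2; return 1 ;;
    esac
}
```

For example, emit_rule keep /path/to/unchanged/directory/ prints a single "-" line, which under these assumptions replaces the two-line exclude and protect pair from the description.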
Comment 5 Haravikk 2020-05-23 09:42:29 UTC
Oh, I see; so hide actually does what I need. You confused me with the mention of not using --delete-excluded, as it actually seems to work just fine with a mixture of hide and exclude rules for different items.

Thanks!
Comment 6 Haravikk 2020-05-23 22:30:49 UTC
In fact, no, it doesn't: hide does not work as I'm requesting with --delete-excluded enabled; everything that is excluded is still destroyed on the receiving side:

    mkdir src
    mkdir dest
    touch src/file1
    touch src/file2
    touch dest/file2
    touch dest/file3
    rsync -ri --delete-excluded --filter 'H file3' --exclude 'file2' src/ dest/

Note that only file2 exists in both directories initially, meanwhile file1 exists only in the source, and file3 exists only in the destination. I want file1 to be transferred, file2 to be deleted in the destination, and file3 to be preserved in the destination.

In other words the result that I want to get is:

*deleting    file2
>f..T....... file1

The result I actually get is:

*deleting    file2
*deleting    file3
>f..T....... file1

So hide definitely doesn't do what I want.
Meanwhile protect alone doesn't either, because while it won't delete file3, it will still compare it between src/ and dest/ and overwrite it with changes, which I don't want either.

So I'm sorry but I'm re-opening this as it is not a case that appears to be covered by existing rules. I need rsync to NOT scan or send a pattern, but without deleting it either. As mentioned in the first post, this only seems to be possible through the use of two rules, and a shorthand form of this would still be beneficial.
Comment 7 Wayne Davison 2020-05-23 22:57:54 UTC
If you don't want something deleted on the receiving side, you need to protect it via either a protect rule or an exclude rule. Using --delete-excluded just turns all exclude rules into hide rules, which limits your options.
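
For comparison, here is a sketch of the test case from Comment 6 reworked along the lines suggested above: plain --delete instead of --delete-excluded, a hide rule for the file that should be removed from the destination, and an exclude rule for the file that should be left alone. The function name demo_hide_vs_exclude and the scratch directory argument are hypothetical.

```shell
#!/bin/sh
# Rebuilds the test tree from the report in a scratch directory, then syncs
# with --delete (not --delete-excluded) so that "-" keeps its receiver-side
# protection while "H" only hides on the sender.
# Usage: demo_hide_vs_exclude /some/empty/scratch/dir
demo_hide_vs_exclude() {
    work=$1
    mkdir -p "$work/src" "$work/dest"
    touch "$work/src/file1" "$work/src/file2"
    touch "$work/dest/file2" "$work/dest/file3"
    # H file2: hidden from the sender, unprotected on the receiver -> deleted
    # - file3: excluded on both sides -> left untouched on the receiver
    rsync -ri --delete --filter 'H file2' --filter '- file3' \
        "$work/src/" "$work/dest/"
}
```

Under these assumptions file1 is transferred, file2 is deleted from the destination, and file3 is preserved, which is the outcome the report asks for, with one rule per path.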