Bug 8445 - Add a non-trusted filter-file option that would limit the rules and ignore syntax errors
Status: ASSIGNED
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.0.7
Hardware: x64 Linux
Importance: P5 enhancement
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
Reported: 2011-09-08 18:03 UTC by Ruediger Meier
Modified: 2011-09-18 14:42 UTC

Description Ruediger Meier 2011-09-08 18:03:33 UTC
Hi,


I have set up a nightly backup pull via rsync like this:

$ rsync --version
rsync  version 3.0.7  protocol version 30
$ cd /backup/target
$ rsync  -axvSAHX  --delete --numeric-ids --relative --delete-excluded --bwlimit=10000 --filter='. /backup/conf/filter_glaukos_home'  root@glaukos:/exports/./data/   ./



Today at noon I wondered why it was still not finished.
After finding out that it was caused by a user's very large tmp dir, I told him to fix his local .rsync-filter.

While rsync was still pulling from the guilty "tmp-2011-09-05/" he changed his local filter from
...
+ /download*.xz
- /tmp
+ /tmp*.xz
...
to
...
+ /download*.xz
- /tmp*
+ /tmp*.xz
...

and moreover he deleted the whole directory "tmp-2011-09-05".

And rsync exited this way:

data/data-source/rolf/tmp-2011-09-05/99999_ICGE@wessa_2011-08-16.annot
data/data-source/rolf/tmp-2011-09-05/99999_ICGE@wessa_2011-08-16.rinse
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-07-28.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-07-28.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-07-29.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-07-29.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-07-30.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-07-30.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-07-31.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-07-31.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-01.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-01.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-04.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-04.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-05.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-05.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-06.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-06.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-07.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-07.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-08.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-08.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-10.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-10.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-11.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-11.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-12.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-12.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-18.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-18.rinse"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-20.annot"
file has vanished: "/exports/data/data-source/rolf/tmp-2011-09-05/999_IUDMIIF@boe_2011-08-20.rinse"
invalid modifier sequence at 't' in filter rule: -/tmp
rsync error: syntax or usage error (code 1) at exclude.c(829) [sender=3.0.7]
rsync: connection unexpectedly closed (130345973884 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [receiver=3.0.7]
rsync: writefd_unbuffered failed to write 5 bytes to socket [generator]: Broken pipe (32)
rsync error: error in rsync protocol data stream (code 12) at io.c(1530) [generator=3.0.7]
----------------------------------------------------------------------------
rsnapshot encountered an error! The program was invoked with these options:
/usr/bin/rsnapshot -c /backup/conf/rsnapshot_glaukos_home_data.conf sync
----------------------------------------------------------------------------
ERROR: /usr/bin/rsync returned 12 while processing root@glaukos:/exports/./data/



This is IMO a very critical issue, since users may crash my backup every day.


cu,
Rudi
Comment 1 Wayne Davison 2011-09-10 21:06:51 UTC
(In reply to comment #0)
> invalid modifier sequence at 't' in filter rule: -/tmp

You'll note that rule is missing a space, so it was a filter-rule syntax error.  Rsync treats a failure to parse filter rules as something that it should complain about with a fatal error so that you get a chance to fix it.

So, it seems to me that the issue here is that you're trusting user-generated filter rules in a backup situation, which may not be a good idea (e.g. consider the inclusion of a filter-rule import that references a secret file in order to try to sniff its contents).  What you could do instead is to do a pre-copy restrictive parse of all the filter files in the backup hierarchy and turn them into a single set of global rules, dropping any syntax error lines and ignoring any rules that shouldn't be trusted (you'd have to massage the paths and such).  Then, run rsync with that filtered global exclude list rather than the per-dir filter rules.
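
As a rough illustration of such a pre-copy restrictive parse, here is a minimal Python sketch. It is not part of rsync, and everything in it is an assumption made for illustration: the function name globalize_rules, the choice to trust only plain "+ PATTERN" / "- PATTERN" rules whose pattern starts with "/", and the way paths are massaged (prefixing the filter file's directory relative to the transfer root). Ordering between rules from different directories and directory names that contain wildcard characters are not handled.

```python
import posixpath
import re

# Conservative subset (an assumption of this sketch): only "+ PATTERN" and
# "- PATTERN" with no modifiers are trusted. Everything else is dropped.
SIMPLE_RULE = re.compile(r"^([+-]) (\S.*)$")


def globalize_rules(filter_text, rel_dir):
    """Turn one per-directory filter file into rules anchored at the transfer root.

    rel_dir is the directory holding the filter file, relative to the transfer
    root, e.g. "data/data-source/rolf". Returns (rules, dropped_lines).
    """
    rules, dropped = [], []
    for raw in filter_text.splitlines():
        line = raw.lstrip()
        if not line or line.startswith("#"):
            continue  # blank lines and whole-line comments carry no rule
        match = SIMPLE_RULE.match(line)
        # Only anchored patterns (leading "/") are rewritten; unanchored ones
        # would need extra wildcard rules, so this sketch simply drops them.
        if not match or not match.group(2).startswith("/"):
            dropped.append(raw)
            continue
        prefix, pattern = match.groups()
        anchored = "/" + posixpath.join(rel_dir, pattern.lstrip("/"))
        rules.append(f"{prefix} {anchored}")
    return rules, dropped


if __name__ == "__main__":
    sample = "+ /download*.xz\n-/tmp\n- /tmp*\n"
    kept, skipped = globalize_rules(sample, "data/data-source/rolf")
    print("\n".join(kept))
    print("dropped:", skipped)
```

The kept rules could then be written to a single file and passed to rsync as one global merge file, while the dropped lines are logged for the owner of the filter file to fix.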

Another option is to mark the rules in the filter files as only hide rules (aka a server-side-only exclude) which avoids an unwanted protect-from-deletion effect of a normal exclude (thus users can specify things not to copy, but not prevent things from being removed on the backup server).  This also avoids any prefix/option interpretation in the per-dir files, so rsync won't generate an error when it parses the files.  e.g.:

--filter=':-s .rsync-exclude'

You'd need to filter all the .rsync-filter files, changing the "- foo" rules into just "foo" in the per-dir .rsync-exclude files for this to work, and let your users know about the new filename (if you indeed change it) and the new syntax.
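
A minimal Python sketch of that conversion step, under stated assumptions: the helper name to_exclude_only is made up for illustration, only plain "- PATTERN" lines are carried over, and anything else (include rules, merge rules, malformed lines) is returned separately so it can be reported to the user rather than silently lost. Patterns that would start with "#" are also skipped, on the assumption that the exclude-only file would treat them as comments.

```python
def to_exclude_only(filter_text):
    """Convert .rsync-filter content into bare exclude patterns.

    Returns (exclude_lines, skipped_lines). Only "- PATTERN" rules survive;
    the "- " prefix is removed because the target file holds patterns only.
    """
    excludes, skipped = [], []
    for raw in filter_text.splitlines():
        line = raw.lstrip()
        if not line or line.startswith("#"):
            continue  # nothing to convert on blank or comment lines
        pattern = line[2:] if line.startswith("- ") else ""
        # Assumption: a pattern beginning with "#" would be read back as a
        # comment in the exclude-only file, so it is skipped instead.
        if pattern and not pattern.startswith("#"):
            excludes.append(pattern)
        else:
            skipped.append(raw)
    return excludes, skipped


if __name__ == "__main__":
    sample = "+ /download*.xz\n- /tmp*\n-/tmp\n- core\n"
    patterns, rest = to_exclude_only(sample)
    print("\n".join(patterns))
    print("not converted:", rest)
```

Include rules have no equivalent in an exclude-only file, so the skipped list is where a user would find out that a "+" rule no longer has any effect.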
Comment 2 Ruediger Meier 2011-09-12 10:45:12 UTC
Thanks for this detailed reply. After reading it, I think we have two different issues here.


(In reply to comment #1)
> (In reply to comment #0)
> > invalid modifier sequence at 't' in filter rule: -/tmp
> 
> You'll note that rule is missing a space, so it was a filter-rule syntax error.

1. I'm sure there was never a syntax error in .rsync-filter. Instead, the error occurred because the user effectively added a single character while rsync was reading it (the same reason bash scripts show syntax errors when they are edited during execution).
So I think it would be worth improving the way rsync reads filter files in general, because rsync is supposed to run for hours syncing directories while they are in use, and it is able to handle vanished files etc.
I even wondered why rsync read that particular .rsync-filter again after already spending 10 hours inside that directory.
I haven't looked at the source code, but I guess it would help simply to avoid file operations like fseek on the filter files.



>  Rsync treats a failure to parse filter rules as something that it should
> complain about in a fatal error so that you get a chance to fix it.

2. So I would put this on the wishlist:
 new option --ignore-broken-filters
 Behaves as in the case of vanished files: just print a warning, ignore the broken filter, and don't exit. When the sync is finished, exit 2.
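
To make the requested behaviour concrete, here is a minimal Python sketch of "warn, skip, and report at the end". It is only an illustration: --ignore-broken-filters does not exist in rsync, the function name load_filters_leniently is hypothetical, the regular expression is a simplified check for short-form include/exclude rules rather than rsync's real parser, and skipping single rules (rather than the whole file) is one possible reading of the wish. The status value 2 is simply the one proposed above.

```python
import re
import sys

# Simplified check (an assumption, not rsync's parser): "+" or "-", an
# optional comma, optional modifier characters, then a space or underscore
# followed by the pattern. "-/tmp" fails because "t" is not a modifier.
RULE = re.compile(r"^[+-],?[/!Csrp]*[ _].+$")


def load_filters_leniently(lines, source="<filter>"):
    """Return (good_rules, exit_status); broken rules are warned about and skipped."""
    good, status = [], 0
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n").lstrip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are not rules
        if RULE.match(line):
            good.append(line)
        else:
            print(f"warning: ignoring broken filter rule at {source}:{lineno}: {line}",
                  file=sys.stderr)
            status = 2  # remembered and returned once everything has been read
    return good, status


if __name__ == "__main__":
    rules, code = load_filters_leniently(["+ /download*.xz\n", "-/tmp\n", "- /tmp*\n"])
    print(rules)
    sys.exit(code)
```

The point of the design is that a single bad line costs one warning and a non-zero status at the end, instead of aborting a transfer that has already been running for hours.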
 


> So, it seems to me that the issue here is that you're trusting user-generated
> filter rules in a backup situation, which may not be a good idea

Because all our users deal with very large amounts of data, I want them to help me with the filter rules.


> (e.g. consider
> the inclusion of a filter-rule import that references a secret file in order to
> try to sniff its contents).

My users can only write the filter files into their own dirs. If they want to back up their own secrets, then this is not my problem.


> What you could do instead is to do a pre-copy
> restrictive parse of all the filter files in the backup hierarchy and turn them
> into a single set of global rules, dropping any syntax error lines and ignoring
> any rules that shouldn't be trusted

This would be possible, and I even thought about doing this in order to implement more intelligent filters than simple in/exclude lists. But in practice
find /home -name ".rsync-filter"
takes about 1-2 hours here, with high IO load on the file server, and it would slow down the whole backup process by about 20-30%.




> Another option is to mark the rules in the filter files as only hide rules 

A good idea regarding the security points above, but regarding point 1 it would not really help: rsync would not exit with a fatal error, but it would use a totally messed-up filter if the user changed it during the backup process.


cu,
Rudi
Comment 3 Ruediger Meier 2011-09-18 14:42:57 UTC
(In reply to comment #2)

I have to admit that I was wrong about my point 1.
The error was not caused by the user changing the filter. It was another filter file, in a different directory, that was broken. All of this happened by chance within seconds of each other, so I got the wrong impression.

Anyway, point 2 is still on my wishlist and matches the subject of this enhancement request.