Bug 7456 - exclude directory based on presence of a file
Summary: exclude directory based on presence of a file
Status: RESOLVED WONTFIX
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.0.7
Hardware: All Linux
Importance: P3 enhancement
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
Reported: 2010-05-26 11:48 UTC by Tim Ferrell
Modified: 2011-03-02 15:09 UTC
Description Tim Ferrell 2010-05-26 11:48:32 UTC
I think it would be handy to have the option of excluding directories when they contain a specially-named file, such as '.exclude'.  This would simplify our extensive exclude lists and would be a more explicit way to skip certain directories...

Thanks!
Comment 1 Wayne Davison 2010-05-29 09:30:29 UTC
If you put "- *" into that file and use the filter option "--filter=': .exclude'", then rsync will exclude all the files/dirs from within that directory (just not the directory itself).  If you want to exclude the directories themselves, you can put a rule into the parent dir.  For instance, create the file .rsync-filter and put "- /subdir" into it (that slash anchors the name in the current dir).  Then, run rsync with the -FF option and it will use the filter files and also exclude them from the transfer.
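As a rough illustration of the second suggestion, the Python sketch below appends an anchored exclude rule for a directory to the .rsync-filter file in its parent. The helper name add_parent_exclude is hypothetical and not part of rsync, and the sketch assumes the directory name contains no rsync pattern characters.

```python
import os

def add_parent_exclude(dir_path, filter_name=".rsync-filter"):
    """Append '- /<name>' for dir_path to the filter file in its parent.

    Hypothetical helper, not part of rsync. Assumes the directory name
    contains no rsync pattern characters such as '*', '?' or '['.
    """
    # abspath also strips any trailing slash, so split() yields the name.
    parent, name = os.path.split(os.path.abspath(dir_path))
    rule = "- /" + name  # leading slash anchors the rule in the parent dir
    filter_path = os.path.join(parent, filter_name)

    content = ""
    if os.path.exists(filter_path):
        with open(filter_path) as fh:
            content = fh.read()

    # Only append the rule if it is not already present.
    if rule not in content.splitlines():
        with open(filter_path, "a") as fh:
            if content and not content.endswith("\n"):
                fh.write("\n")
            fh.write(rule + "\n")
    return filter_path
```

After calling the helper for each directory to skip, running rsync with -FF as described above should pick up the generated filter files.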
Comment 2 Brian K. White 2011-03-01 17:42:34 UTC
That is a great work-around and I very much thank you for the idea.

But I am still adding my vote for this feature as originally requested. There are reasons why exactly this feature, just as described, exists elsewhere, like the "-F,FF ..." option in http://linux.about.com/library/cmd/blcmdl1_star.htm
and some other commercial and free backup systems.

Additionally, I would like to be able to specify the filename or pattern arbitrarily by command line option or environment variable, rather than only a hard-coded value like ".exclude".
Comment 3 Wayne Davison 2011-03-02 00:13:29 UTC
I don't see any need to add more than the per-dir filter files support for this.  You can always choose to build the filelist yourself, if you really want total control.
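One way to do this kind of scripting is sketched below in Python. It builds an exclude list rather than a full file list: it walks the source tree and emits an anchored pattern for every directory that contains a marker file. The function name marker_excludes and the default marker name are illustrative assumptions, not anything rsync provides.

```python
import os

def marker_excludes(src_root, marker=".exclude"):
    """Return anchored rsync exclude patterns, one per directory under
    src_root that contains a file named `marker`.

    Hypothetical helper, not part of rsync. Directory names containing
    rsync pattern characters ('*', '?', '[') are not escaped here.
    """
    patterns = []
    for dirpath, dirnames, filenames in os.walk(src_root):
        if marker in filenames and dirpath != src_root:
            rel = os.path.relpath(dirpath, src_root).replace(os.sep, "/")
            # Leading slash anchors at the transfer root; trailing slash
            # restricts the match to directories.
            patterns.append("/" + rel + "/")
            dirnames[:] = []  # the whole subtree is excluded, stop here
        else:
            dirnames.sort()  # deterministic traversal order
    return patterns
```

The patterns could be written one per line to a file and passed to rsync with --exclude-from, for example `rsync -a --exclude-from=excludes.txt src/ dest/`, where the trailing slash on the source keeps the anchored patterns relative to the root that was walked.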
Comment 4 Brian K. White 2011-03-02 15:09:00 UTC
Ok you can say you don't feel like doing it, or including it if someone else makes a patch, but you can't say the existing features actually provide the requested semantic unless there is some exotic regex that could be used with one of the existing filter options, nor can you say "you don't really need that".

The form submission chopped off a lot of my original note, which went into my actual use case and why other software has included exactly this semantic for years, but I decided to leave it since it happened to include the feature request and my addition to it. I didn't think this feature really required explicit examples to prove that it fills a hole in the current ability to express inclusions/exclusions, and I didn't think you would presume to know somehow that we don't actually need or want what we asked for, for a non-spurious reason.

People need it often enough to request it and for it to appear in at least star and rdiff-backup and surely others. In various situations this is just the most efficient, automatic, reliable, self-documenting, self-propagating, self-administrable semantic to express the really desired behavior. We make do without only because we must.

True, I can and do currently provide essentially this desired behavior myself by dint of scripting around rsync. But this is true for much of rsync's current functionality, so that's not much of an argument by itself.

And since I do in fact provide this behavior by scripting, it IS a low priority wish for me. I'm not suggesting otherwise.

But it's not a spurious request either just to cause you (or someone) work for no reason. It's not out of place, in that it's not asking rsync to do something other than help replicate files. It's also not breaking the tradition of keeping unix tools low level instead of all-singing-all-dancing, any more than the extensive list of features already in rsync does, or the -r flag to cp (when find and xargs exist), or the exec flag to find (when xargs exists) etc...

In my case the per-directory exclude file idea doesn't really work for me because, as it happens, I can't place arbitrary files with arbitrary names and arbitrary contents in the directories that I want to include or exclude. What I CAN do is tell developers "from now on, if you want a given database file to be excluded from the frequent daytime rsyncs, just create a blank screen named 'nosync' in that file". That's meaningless to you but it solves many problems for me at once.

To me it means:
* developers are not screwing up things by diddling with files manually at the filesystem level
* the exclude marker file they created will be recognized by the various special binaries that comprise the closed source commercial DB & 4GL we use (filepro)
* if an excluded file is copied, the copy will also be excluded automatically (not so any other way).

This would all be not only possible but simple if I could tell rsync "use the .exclude feature, and look for files named screen.nosync instead of .exclude"

Just like
http://wiki.rdiff-backup.org/wiki/index.php/ExcludeIfPresent,

I wouldn't have to worry about maintaining directory lists on all the growing number of production boxes. That's a problem because the list of directories is not the same on every box, and not the same from day to day, and it's not really me or any one person who decides what should be excluded. And yet, I could still do reasonably centralized management because I happen to have the means to create a file on one box and clone it to the same place on all others, or delete it. And the developers do most work on a single development server which pushes new code and db files out to all others.

And again, it's not just me. Star and rdiff-backup are just two examples I happened to already know right off the top of my head without googling. It IS a generically useful semantic whether you personally happen to have a use for it or not. The per-dir-filter is cool but what would come closer to what's really needed would be some exotic regex syntax that could be used in the existing filter options. That would possibly be even more useful than the requested feature since it would allow the user to express even more specialized rules.