--min-size is documented in 2.6.5 (manpage and --help) but segfaults in any command I've tried it in. It's vanished in 2.6.6, and seems to exist only as an unofficial patch. Why is that? I don't understand why --max-size is there but --min-size isn't, and the misleading documentation of --min-size in 2.6.5's --help just screwed me, since I've been mere hours from setting up a backup system that depended on its presence. (--max-size worked just fine in testing; imagine my surprise when --min-size cored and compiling the latest rsync caused it to vanish altogether!) I'm on the mailing list and see no discussion of this change from the April timeframe, which was when I thought it went into the mainline rsync. It's also apparently not mentioned in any NEWS file. And searching for it in the bugzilla hasn't turned up anything relevant. Can this please be committed to the mainline version? P.S. I -am- very pleased to see that --max-size, at least, worked when pulling files from a 2.6.3 to a 2.6.5; I wasn't sure a priori if that would work, since it wasn't clear who might be doing the filtering. (My guess is, "the sender if it supports it, otherwise the receiver", but that's just a guess.) I'm hoping that adding --min-size won't break this behavior, since the rsync I'm pulling from may have to stay at 2.6.3 for a while.
Obtw, when this gets reinstated, it would be -really nice- if one or the other (but NOT BOTH) of --max-size or --min-size was an "OR EQUAL". Right now, if I'm trying to copy all files under one size to one place, and all files over one size to another place, it's quite inconvenient to avoid missing the files that are exactly on the boundary---instead of being able to use (say) --min-size=50M on one run and --max-size=50M on the other, I have to say --min-size=49999999 and --max-size=50000000 to be sure of not getting hit by the fencepost. This is error-prone at the very least (because I have to count digits precisely), and worse if I want the powers-of-two behavior---and by the way, the manpage, even for 2.6.6, is sloppy in mentioning the K, M, or G multipliers but not specifying whether those are human-readable [10^3] or machine [2^10] multipliers---without reading the code, I have no idea. Thanks!
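To make the fencepost concrete, here is a minimal Python sketch of the two-run split described above. It only models the size comparisons and is not rsync code; the function name `split_at` and the sample sizes are made up for illustration. The 50M boundary is computed both ways because the manpage does not say which multiplier applies.

```python
# Illustrative only: models the size tests, not rsync's actual implementation.

def split_at(sizes, boundary):
    """Partition sizes so that each one lands in exactly one bucket.

    Only the 'large' side is "or equal". Making both sides inclusive would
    copy boundary-sized files twice; making neither inclusive drops them.
    """
    small = [s for s in sizes if s < boundary]
    large = [s for s in sizes if s >= boundary]
    return small, large

DECIMAL_50M = 50 * 10**6   # 50,000,000 if M means 10^6
BINARY_50M = 50 * 2**20    # 52,428,800 if M means 2^20

sizes = [DECIMAL_50M - 1, DECIMAL_50M, DECIMAL_50M + 1, BINARY_50M]
small, large = split_at(sizes, DECIMAL_50M)

# With neither side inclusive, the file of exactly 50,000,000 bytes is missed.
missed = [s for s in sizes if not (s < DECIMAL_50M or s > DECIMAL_50M)]
```

The two interpretations of "50M" differ by 2,428,800 bytes, which is why guessing the multiplier is risky when the boundary matters.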
--min-size has never been released in a version of rsync except in the patches dir. I know that Debian has released several versions with --min-size included (and has recently switched over to using the official patch, which does NOT dump core), but there may be other distributions that have decided to include the option. This option is being considered for a future release, but seeing how you've compiled your own version, just apply the patch from the patches dir and enjoy. As for who needs to know about the option, the receiver is the one that implements the filtering logic for what files get transferred, so as long as you're pulling files, the sending rsync doesn't ever see options like --max-size, --update, --existing, etc. As for the size comparison boundary, one solution would be to allow an easy way to specify +1 or -1, such as --min-size=50k+1 or --max-size=50m-1. Yes, the man page should be improved to mention exactly what the suffixes mean -- thanks for pointing that out. I'm also thinking about allowing the suffix to specify if the user wants a K to be 1000 instead of 1024, such as suffixing the K, M, or G with a T to indicate that a power of ten is desired.
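As a rough illustration of the +1/-1 idea, the following Python sketch parses size strings such as 50k+1 or 50m-1. It is not rsync's parser: the grammar, the name `parse_size`, and the multiplier table are assumptions, with K taken as 1024 as the comment above implies. The proposed T suffix for powers of ten is left out.

```python
import re

# Hypothetical grammar: digits, optional K/M/G (either case), optional +1 or -1.
_SIZE_RE = re.compile(r"(\d+)([KkMmGg]?)([+-]1)?")
# Assumes binary multipliers (K = 1024), per the comment above.
_MULT = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}

def parse_size(spec):
    """Return the byte count for a size string like '50k+1' or '50m-1'."""
    m = _SIZE_RE.fullmatch(spec)
    if m is None:
        raise ValueError(f"invalid size: {spec!r}")
    number, suffix, adjust = m.groups()
    value = int(number) * _MULT[suffix.lower()]
    if adjust:
        value += int(adjust)   # int("+1") == 1, int("-1") == -1
    return value
```

Under these assumptions, `parse_size("50m-1")` and `parse_size("50m")` give adjacent values, so one run can stop just below the boundary and the other can start at it without counting digits by hand.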
(In reply to comment #2)
> --min-size has never been released in a version of rsync except in the patches
> dir. I know that Debian has released several versions with --min-size included
> (and has recently switched over to using the official patch, which does NOT dump
> core), but there may be other distributions that may have decided to include the
> option. This option is being considered for a future release, but seeing how
> you've compiled your own version, just apply the patch from the patches dir and
> enjoy.

Well, that explains it---Ubuntu Breezy (officially released yesterday, and thus around for the next six months) picked up a (defective) Debian version of it. Unfortunately, this means that every Breezy user who tries this option will have rsync core on them. (Maybe Breezy will pick up a fixed Debian version, but that actually doesn't help a lot---see below.) Furthermore, many Breezy users will not be nearly sophisticated enough to ask, "did Debian make an incompatible change?" Instead, they'll send bug reports.

I've compiled my own version, but not having this in the non-Debian version is actually more of a big deal than you think (I think :). For one thing, it means I now have a quandary about -my- version. Do I blow away the Ubuntu one? Then updates will blow mine away. Do I nail the version so that doesn't happen? Then it becomes the one sore thumb that -won't- get updates, including security updates. Do I put it elsewhere instead? Then I have to worry about it being in the path of everything that uses it---including root, including random scripts, etc etc. It's a hassle and a waste of time---and risks being forgotten at an inopportune moment.

Furthermore, since Debian has a version but rsync mainline doesn't, script writers are in a total quandary, since there's this incompatible feature they can't depend on being there. Sure, that's the case for every new feature in rsync, but it's particularly weird that max is there but min ain't. It makes scripts that say "put all the big files -here-, and all the little files -here-" suddenly become a pain to write and/or maintain. Plus, since even the Debian version has the fencepost issue, I have to kluge around it. And if you ever -do- release a version without the fencepost issue (either by parsing +1 at the end, or by changing < in min-size to <= as I did), then scripts that others have built based on the Debian one will be subtly wrong.

This seems an enormous amount of pain and bookkeeping to handle a tiny change with, as far as I can see, no impact on the rest of rsync if it isn't being used. Not having it added in April seems senseless (if it had been there then, Breezy would certainly have it now, instead of an incompatible coredump), and seems doubly senseless now. I mean, it'd be one thing if it was a performance or stability issue, but it's clearly not, especially since --max-size made it in. (And no, saying "use find" isn't the solution, either---some uses of rsync, e.g., dirvish, make that really painful, again just to work around a tiny bit of non-orthogonality.)

> As for who needs to know about the option, the receiver is the one that
> implements the filtering logic for what files get transferred, so as long as
> you're pulling files, the sending rsync doesn't ever see options like
> --max-size, --update, --existing, etc.

Ah. Good to know. (But wouldn't it be somewhat more efficient to have the sending rsync be able to apply filters if it can? Less wire traffic and maybe faster in the filesystem.)

> As for the size comparison boundary, one solution would be to allow an easy way
> to specify +1 or -1, such as --min-size=50k+1 or --max-size=50m-1.

That's a cute idea. More work to code, but definitely cute.

> Yes, the man page should be improved to mention exactly what the suffixes mean
> -- thanks for pointing that out. I'm also thinking about allowing the suffix to
> specify if the user wants a K to be 1000 instead of 1024, such as suffixing the
> K, M, or G with a T to indicate that a power of ten is desired.

I think the last thing we need is yet -another- incompatible way to say "human or machine?" I don't recall seeing T anywhere else to do this, but maybe some popular piece of software does this and I haven't noticed. (It's bad enough that some accept only lowercase kmg and some accept only uppercase!) du has clearly been having these issues, having gone for -h and -H and then deprecating one in favor of --si (yuck!) and who knows where that's gonna end. Are there any other utilities out there that seem popular and have settled this one way or the other? What do they do? Do they use upper vs lower case to decide it? Something else? (I honestly don't know, but I'm hoping somebody's thought about this...) Thanks.

P.S. Would it have made sense to have opened this directly on the mailing list and not in the bug database? I guess it would have gotten it wider discussion; I can forward, or you can if you want, if you think it would help.
I saw the CVS checkin comments a couple days ago and just looked at them---thanks! (One tiny nit---you might want to mention in the manual that the +1/-1 are explicitly for avoiding fenceposts when using min/max-size. This -seems- obvious, but ya never know.) I'm not sure if I should open a bug report in Ubuntu Breezy to get them to apply pressure upstream to get either the current CVS or 2.6.7, when it's released (is there an estimate?). Ordinarily I'd just wait, but since Ubuntu shipped a -broken- Debianized (and soon-to-be-incompatible) version of this, it might be nice if Ubuntu and/or their upstream pushed out a newer version relatively soon. If you have any ideas (or would like to do the pushing yourself), let me know. [I actually just downloaded the latest nightly 'cause I needed, in addition to the +1/-1 logic, the fix to hardlinking and devices that was theoretically installed in April but was fixed again in late July---it just bit me and I spent an hour figuring out what was going on & working up a test case to send you, and -then- discovered from the NEWS file in CVS that you'd fixed it 10 weeks ago... :) Is there a regression test to make sure this doesn't get broken again? I'm currently doing a long test to see if there are still problems in the hardlinking code & will send mail or open a new bug report if I see anything.] P.S. Just for bookkeeping, should this bug be changed from "resolved invalid" to "closed"?