Problem: When a mounted dir(for example, "mnt") cannot be visited, rsync will halt there and the shell is halted, even "ctrl -c" can't quit it. If the mountd dir is filtered out, rsync still fail to work. dirA_________dir1 |__ mnt dirB_________dir1 $rsync -r -vvvvvvv -L --include-from=filter --exclude="*" dirA/ dirB/ Root cause: related brief process of rsync: 1 client_run(), called once the connection has been negotiated. 2 read the filter. 3 send_file_list(). 4 send_files(). In step 3, rsync will examine the file one by one, to decide whether it is a symblic, whether it is filtered, and so on. It revokes do_stat to check the file status. int do_stat(const char *fname, STRUCT_STAT *st) { #if HAVE_OFF64_T return stat64(fname, st); #else return stat(fname, st); #endif } It revokes system function "stat", which will halt when the mounted dir is crashed. The "stat" issue is not due to rsync. But if the mountd dir is filtered out, rsync should work well. Solution: Modified "flist.c", struct file_struct *make_file(char *fname, struct file_list *flist, STRUCT_STAT *stp, unsigned short flags, int filter_level) { ... memset(sum, 0, SUM_LENGTH); /* In case a mounted dir crashed, than stat will hang there even the dir has been filtered */ if (filter_level != NO_FILTERS && is_excluded(thisname, 1, filter_level)) { rprintf(FINFO, "Excluding %s\n", thisname); return NULL; } else if (stp && S_ISDIR(stp->st_mode)) { st = *stp; /* Needed for "symlink/." with --relative. */ *linkname = '\0'; /* make IBM code checker happy */ } else if (readlink_stat(thisname, &st, linkname) != 0) { ... } }
Created attachment 2873 [details] Suggested solution of the bug In fact, the root cause is "stat", I have reproduced it both on Linux and Solaris. A suggested solution is: when a mounted dir is filtered out, rsync should work well.
We need to stat() the file to know if it is a directory or not. Otherwise the directory-specific exclude rules won't work right. So, the only possible solution that would allow the stat() to be skipped would be to allow the exclude code to deal with an "unknown" file/dir status, and have it stat() a name that matched a dir-only exclusion rule to be sure that it was indeed a dir (at which point the file's dir-ness would be remembered).
Yes, I agree with you. The core is to skip stat() if the dir is filtered. In fact in my solution, I assumed the "thisname" is a dir to check whether it is filtered. 1) if "thisname" is a dir, is_excluded(thisname, 1, filter_level) return 1 and no problem occurs. 2) if "thisname" is a file, because it match the filter_list, here is_excluded(thisname, 1, filter_level) also return 1 and no problem occurs. Thus I prefer change the make_file() a little, rather than change the exclude code. (In reply to comment #2) > We need to stat() the file to know if it is a directory or not. Otherwise the > directory-specific exclude rules won't work right. > So, the only possible solution that would allow the stat() to be skipped would > be to allow the exclude code to deal with an "unknown" file/dir status, and > have it stat() a name that matched a dir-only exclusion rule to be sure that it > was indeed a dir (at which point the file's dir-ness would be remembered).
(In reply to comment #3) > In fact in my solution, I assumed the "thisname" is a dir to check whether it > is filtered. As Wayne tried to point out, while your solution does fix the hanging on the mounted dir, it is unacceptable because it breaks directory-specific exclude rules. If the source contains a regular file X and a rule --exclude='X/', rsync will assume X is a directory and exclude it, which is wrong.
Created attachment 2876 [details] Alternative solution: check exclusion both ways The attached patch gives an alternative way to fix the hanging without breaking directory-specific exclude rules. make_file calls is_exclude twice to determine whether the file would be excluded if it is a directory and whether it would be excluded if it isn't one. If the file would be excluded either way, make_file skips calling stat. This solution has two disadvantages: the filter list is scanned twice (hurting performance), and at -vv level, hypothetical "showing/hiding file/directory X because of pattern Y" messages are printed.