Hi, I use rsync for backup, and once in a while somebody replaces a symbolic link with the contents of the file. In this case, I find that dry-run reports errors while rsync without the -n flag works fine. A test case can be generated like this:

    mkdir old
    cd old
    mkdir data
    ln -s data link
    cd ..
    mkdir new
    mkdir new/link
    rsync -n -arlpogt --delete new/ old/

The rsync output will be:

    building file list ... done
    deleting data/
    deleting link
    link/

Now, if we delete data first, rsync fails:

    rmdir old/data
    rsync -n -arlpogt --delete new/ old/
    building file list ... done
    deleting link
    rsync: opendir "/data1/ppe/tmp/old/link" failed: No such file or directory (2)
    ./
    link/

Rsync fails with return code 23. However, if you remove the '-n' flag, rsync is able to sync and does not report errors. I think this is an error, because dry-run should just report that the symbolic link is updated, i.e.:

    building file list ... done
    ./
    link/

and return no error code. For me this is also a problem, because I build scripts around rsync and I use dry-run to test what will be done. The script then makes a backup of the files that would be deleted. But when dry-run fails, the script does not know whether it's safe to continue. Best, Peter.
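The workflow Peter describes (dry run first, back up whatever would be deleted, then sync) could be guarded roughly as in the sketch below. It is not Peter's script, which is not shown in this report. The function names `parse_deletions` and `planned_deletions` are hypothetical, and the parsing assumes the "deleting <path>" output lines look like the ones quoted above.

```python
import subprocess


def parse_deletions(output):
    """Return the paths named on 'deleting ' lines of rsync output."""
    prefix = "deleting "
    return [line[len(prefix):] for line in output.splitlines()
            if line.startswith(prefix)]


def planned_deletions(src, dst):
    """Dry-run rsync and return the paths it would delete.

    Raises RuntimeError on a non-zero exit code, so a caller never backs
    up or syncs on the basis of a dry run that failed (e.g. code 23).
    """
    cmd = ["rsync", "-n", "-arlpogt", "--delete", src, dst]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"dry run failed with code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    return parse_deletions(result.stdout)


# Output of the first test case, copied from the report above.
sample = """building file list ... done
deleting data/
deleting link
link/
"""
deletions = parse_deletions(sample)
```

With the bug described here, `planned_deletions("new/", "old/")` would raise on the second test case even though the real run succeeds, which is exactly why the script cannot tell whether to continue.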
I reproduced this bug with my custom rsync 2.6.6.matt.7. According to verbose output, delete_in_dir("link") seems to be getting called erroneously on the receiver. I'm trying to find the cause.
Created attachment 1702: Makes receiving rsync process deletions only in directories

I found the problem. The receiving rsync performs deletions by scanning the file list from the sender for directories. For each directory the sender sent, the receiver looks for something at the same path in the destination. If the receiver finds something, it assumes that something is a corresponding directory and processes deletions inside it; if the receiver finds nothing, it just moves on.

That assumption is OK most of the time. If one runs Peter's second test case without --dry-run, rsync deletes the symlink "link" on the grounds that it is about to be replaced by a directory; when it goes to process deletions in the directory "link" that the sender named, it finds nothing there on the destination. But with --dry-run, "link" is only fake-deleted, so the symlink is still present on the receiver. delete_in_dir("link") is called, sees something by the name of "link", incorrectly assumes it is a directory, and tries to process deletions.

That no error is produced when the symlink is valid is immaterial. When rsync is completely in link-aware mode (--links and not --keep-dirlinks), it should treat symlinks simply as files of a special kind that contain strings; it should never issue a system call that follows a symlink.

I wrote a very simple patch that makes delete_in_dir try to delete only in directories. That fixes the problem, but I don't know if it's the right way to fix the problem.
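The idea of "delete only in directories, and never follow a symlink" can be illustrated with a small sketch. This is not the attached patch (rsync is written in C and the attachment's contents are not reproduced here); it is a Python illustration, with a hypothetical helper `is_real_directory`, of checking a path with lstat so that a symlink is never mistaken for a directory, whether or not its target exists.

```python
import os
import stat
import tempfile


def is_real_directory(path):
    """True only if path itself is a directory; symlinks are not followed."""
    try:
        mode = os.lstat(path).st_mode  # lstat examines the link itself
    except (FileNotFoundError, NotADirectoryError):
        return False
    return stat.S_ISDIR(mode)


# Recreate the layout from the test case: a directory "data" and a
# symlink "link" pointing at it, then remove "data" so the link dangles.
with tempfile.TemporaryDirectory() as tmp:
    data = os.path.join(tmp, "data")
    link = os.path.join(tmp, "link")
    os.mkdir(data)
    os.symlink("data", link)

    results = {
        "directory": is_real_directory(data),
        "symlink_to_dir": is_real_directory(link),
    }

    os.rmdir(data)
    results["dangling_symlink"] = is_real_directory(link)
    results["missing"] = is_real_directory(os.path.join(tmp, "nothing"))
```

A deletion pass that consulted a check like this before descending would skip "link" in both of Peter's test cases, instead of calling opendir on it.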
(In reply to comment #2) This sounds like a good solution. Thanks, I didn't expect a response so quickly! Shall I test this out with the latest rsync? You already did some testing, but perhaps my test might catch some side effects. I have about 3 terabytes of data to 'rsync' overnight. For my test, I would take the latest official source RPM and upgrade it to 2.6.6 plus your patch. Or is there a test suite for rsync?
Yes, please do take a recent source RPM or source tarball and try the patch. If you use rpmbuild, you can get it to apply the patch automatically after unpacking the source tarball that comes with the SRPM. Put the patch in SOURCES/ of your RPM-building arena, and name it in rsync.spec like this:

    Patch0: delete-only-in-directories.diff

Then insert after the %setup line:

    %patch0 -p0

There's a test suite that you can run with "make test" in a built rsync source tree, either one you make manually or the one rpmbuild makes in BUILD/. That will check that the patch doesn't break other features. But this might be putting the cart before the horse, as I'm still hoping Wayne will weigh in as to whether my patch is the right way to fix the problem.

I think a "dry-run" test case should be added that tests the dry-run feature comprehensively at -vv verbosity and makes sure it produces the same output as a real run in a number of borderline cases, such as the one in this bug.
(In reply to comment #4) OK, I will do that (I have built RPMs before), but it will take some time (~2 weeks), as I'm rather busy at the moment. This way, Wayne also has some more time to respond. This bug certainly is not a critical one, but it should be fixed. Best, Peter.
As Matt discovered, the bug is simply that when rsync is in a dry-run scenario, it may not have deleted an in-the-way file/symlink/device by the time the delete code decides to check if a sender-side directory exists on the receiving side. Thanks for the patch! It's the right fix, and I've checked it into CVS for the upcoming 2.6.7.
(In reply to comment #6) Great, so I do not need to do the testing. Thanks to all for the effort.