$ rsync -v --delete --progress -a --exclude-from=/a/file some directories here /mnt/backup/some/ sending incremental file list There it stays. With older (working) rsync I get: sent 899850 bytes received 260 bytes 1800220.00 bytes/sec total size is 9547899853 speedup is 10607.48 f303b749f2843433c9acd8218a4b9096d0d1bb8d is first bad commit commit f303b749f2843433c9acd8218a4b9096d0d1bb8d Author: Wayne Davison <wayned@samba.org> Date: Sat Aug 2 09:14:36 2008 -0700 Added logic to the receiving side to ensure that the --delete-during code will not delete in a directory prior to receiving an I/O error for that directory (or not receiving it, as the case may be). $ gstack 22366 #0 0x0000000000a24ad3 in __select_nocancel () from /lib64/libc.so.6 #1 0x00007fa36509b660 in read_timeout () from /proc/22366/exe #2 0x00007fa36509bb86 in read_loop () from /proc/22366/exe #3 0x00007fa36509a241 in readfd_unbuffered () from /proc/22366/exe #4 0x00007fa36509bd33 in readfd () from /proc/22366/exe #5 0x00007fa36509c956 in read_ndx () from /proc/22366/exe #6 0x00007fa36507cc20 in read_ndx_and_attrs () from /proc/22366/exe #7 0x00007fa365085ede in send_files () from /proc/22366/exe #8 0x00007fa36508e738 in client_run () from /proc/22366/exe #9 0x00007fa36508f58c in main () from /proc/22366/exe $ gstack 22367 #0 0x0000000000a24ad3 in __select_nocancel () from /lib64/libc.so.6 #1 0x00007fa36509b660 in read_timeout () from /proc/22367/exe #2 0x00007fa36509a40a in readfd_unbuffered () from /proc/22367/exe #3 0x00007fa36509bd33 in readfd () from /proc/22367/exe #4 0x00007fa36509c0f6 in read_msg_fd () from /proc/22367/exe #5 0x00007fa3650834d5 in generate_files () from /proc/22367/exe #6 0x00007fa36508e29c in do_recv () from /proc/22367/exe #7 0x00007fa36508ed82 in start_server () from /proc/22367/exe #8 0x00007fa36508fc55 in child_main () from /proc/22367/exe #9 0x00007fa3650a943e in local_child () from /proc/22367/exe #10 0x00007fa36508cb45 in do_cmd () from /proc/22367/exe #11 0x00007fa36508f53f in main () from /proc/22367/exe $ gstack 22368 #0 0x0000000000a24ad3 in __select_nocancel () from /lib64/libc.so.6 #1 0x00007fa36509b660 in read_timeout () from /proc/22368/exe #2 0x00007fa36509bb86 in read_loop () from /proc/22368/exe #3 0x00007fa36509a241 in readfd_unbuffered () from /proc/22368/exe #4 0x00007fa36509bd33 in readfd () from /proc/22368/exe #5 0x00007fa36509c956 in read_ndx () from /proc/22368/exe #6 0x00007fa36507cc20 in read_ndx_and_attrs () from /proc/22368/exe #7 0x00007fa3650845ca in recv_files () from /proc/22368/exe #8 0x00007fa36508e17c in do_recv () from /proc/22368/exe #9 0x00007fa36508ed82 in start_server () from /proc/22368/exe #10 0x00007fa36508fc55 in child_main () from /proc/22368/exe #11 0x00007fa3650a943e in local_child () from /proc/22368/exe #12 0x00007fa36508cb45 in do_cmd () from /proc/22368/exe #13 0x00007fa36508f53f in main () from /proc/22368/exe
Thanks for the report. I'll see about reproducing this.
Created attachment 3504 [details] Test patch I haven't been able to get rsync to hang for me yet. If you're still seeing the problem, does this patch fix it?
Unfortunately the patch did not help... (gdb) bt #0 0x0000003000cdcad3 in __select_nocancel () from /lib64/libc.so.6 #1 0x000000000043620b in read_timeout (fd=5, buf=0x7fffd49a38d0 "\001", len=4) at io.c:692 #2 0x000000000043710a in read_loop (fd=5, buf=0x7fffd49a38d0 "\001", len=4) at io.c:1025 #3 0x000000000043728e in readfd_unbuffered (fd=5, buf=0x7fffd49a4f50 "", len=1) at io.c:1062 #4 0x0000000000437a46 in readfd (fd=5, buffer=0x7fffd49a4f50 "", N=1) at io.c:1191 #5 0x0000000000439459 in read_ndx (f=5) at io.c:1797 #6 0x0000000000411507 in read_ndx_and_attrs (f_in=5, iflag_ptr=0x7fffd49a5154, type_ptr=0x7fffd49a515f "", buf=0x7fffd49a5160 "\200��, len_ptr=0x7fffd49a5150) at rsync.c:255 #7 0x000000000041d02a in send_files (f_in=5, f_out=4) at sender.c:185 #8 0x000000000042714b in client_run (f_in=5, f_out=4, pid=12999, argc=5, argv=0x1ab1240) at main.c:1053 #9 0x0000000000427c85 in start_client (argc=5, argv=0x1ab1240) at main.c:1301 #10 0x0000000000428347 in main (argc=6, argv=0x1ab1240) at main.c:1528 (gdb) bt #0 0x0000003000cdcad3 in __select_nocancel () from /lib64/libc.so.6 #1 0x000000000043620b in read_timeout (fd=3, buf=0x1ab35c0 "", len=8184) at io.c:692 #2 0x00000000004371d5 in readfd_unbuffered (fd=3, buf=0x7fffd49a2790 "H", len=4) at io.c:1048 #3 0x0000000000437a46 in readfd (fd=3, buffer=0x7fffd49a2790 "H", N=4) at io.c:1191 #4 0x0000000000435235 in read_msg_fd () at io.c:375 #5 0x0000000000435e92 in wait_for_receiver () at io.c:578 #6 0x0000000000419768 in generate_files (f_out=1, local_name=0x0) at generator.c:2242 #7 0x0000000000426a32 in do_recv (f_in=0, f_out=1, local_name=0x0) at main.c:846 #8 0x0000000000426e7e in do_server_recv (f_in=0, f_out=1, argc=1, argv=0x7fffd49a5258) at main.c:960 #9 0x0000000000426f69 in start_server (f_in=0, f_out=1, argc=2, argv=0x7fffd49a5250) at main.c:990 #10 0x0000000000426eba in child_main (argc=2, argv=0x7fffd49a5250) at main.c:967 #11 0x0000000000448ac3 in local_child (argc=2, argv=0x7fffd49a5250, f_in=0x7fffd49a7288, f_out=0x7fffd49a7284, child_main=0x426e95 <child_main>) at pipe.c:169 #12 0x0000000000425b3b in do_cmd (cmd=0x0, machine=0x0, user=0x0, remote_argv=0x1ab1270, remote_argc=0, f_in_p=0x7fffd49a7288, f_out_p=0x7fffd49a7284) at main.c:472 #13 0x0000000000427c30 in start_client (argc=5, argv=0x1ab1240) at main.c:1289 #14 0x0000000000428347 in main (argc=6, argv=0x1ab1240) at main.c:1528 (gdb) bt #0 0x0000003000cdcad3 in __select_nocancel () from /lib64/libc.so.6 #1 0x000000000043620b in read_timeout (fd=0, buf=0x7fffd499e6d0 "\001", len=4) at io.c:692 #2 0x000000000043710a in read_loop (fd=0, buf=0x7fffd499e6d0 "\001", len=4) at io.c:1025 #3 0x000000000043728e in readfd_unbuffered (fd=0, buf=0x7fffd499fd50 "", len=1) at io.c:1062 #4 0x0000000000437a46 in readfd (fd=0, buffer=0x7fffd499fd50 "", N=1) at io.c:1191 #5 0x0000000000439459 in read_ndx (f=0) at io.c:1797 #6 0x0000000000411507 in read_ndx_and_attrs (f_in=0, iflag_ptr=0x7fffd49a003c, type_ptr=0x7fffd49a004f "", buf=0x7fffd49a2050 "", len_ptr=0x7fffd49a0038) at rsync.c:255 #7 0x000000000041aaab in recv_files (f_in=0, local_name=0x0) at receiver.c:416 #8 0x000000000042684e in do_recv (f_in=0, f_out=1, local_name=0x0) at main.c:785 #9 0x0000000000426e7e in do_server_recv (f_in=0, f_out=1, argc=1, argv=0x7fffd49a5258) at main.c:960 #10 0x0000000000426f69 in start_server (f_in=0, f_out=1, argc=2, argv=0x7fffd49a5250) at main.c:990 #11 0x0000000000426eba in child_main (argc=2, argv=0x7fffd49a5250) at main.c:967 #12 0x0000000000448ac3 in local_child (argc=2, argv=0x7fffd49a5250, f_in=0x7fffd49a7288, f_out=0x7fffd49a7284, child_main=0x426e95 <child_main>) at pipe.c:169 #13 0x0000000000425b3b in do_cmd (cmd=0x0, machine=0x0, user=0x0, remote_argv=0x1ab1270, remote_argc=0, f_in_p=0x7fffd49a7288, f_out_p=0x7fffd49a7284) at main.c:472 #14 0x0000000000427c30 in start_client (argc=5, argv=0x1ab1240) at main.c:1289 #15 0x0000000000428347 in main (argc=6, argv=0x1ab1240) at main.c:1528
With "--info 2 --debug 2", it does not hang, but spits out some errors, and exits abruptly: (I gave five dirs as parameters -- let's call them dir0, dir1, dir2, dir3, dir4) ... [sender] make_file(dir0/filexyz,*,2) rsync: writefd_unbuffered failed to write 4 bytes [sender]: Broken pipe (32) server_recv(2) starting pid=19852 uid 500(safari) maps to 500 process has 5 gids: 37 500 509 546 598 gid 500(safari) maps to 500 recv_file_name(dir0) recv_file_name(dir2) gid 0() maps to 0 recv_file_name(dir3) recv_file_name(dir4) received 4 names recv_file_list done get_local_name count=4 /dest/dir/ [receiver] change_dir(/dest/dir) generator starting pid=19852 delta-transmission disabled for local transfer or --whole-file recv_generator(dir0,1) recv_generator(dir4,2) recv_files(4) starting recv_generator(dir3,3) recv_generator(dir2,4) [receiver] receiving flist for dir 0 rsync: connection unexpectedly closed (9 bytes received so far) [sender] _exit_cleanup(code=12, file=io.c, line=639): entered rsync error: error in rsync protocol data stream (code 12) at io.c(639) [sender=3.1.0dev] _exit_cleanup(code=12, file=io.c, line=639): about to call exit(12) sending incremental file list
The abnormal exit was a logging issue in 3.1.0 that I just checked-in a fix for (see latest git and "nightly" tar file). I have checked in some flushing changes that I hope will take care of the issue. If not, I can code up a work-around to avoid the hang.
> I have checked in some flushing changes that I hope will take care of > the issue. It was probably not clear that the issue mentioned in the second paragraph was referring to the original issue of the hang.
git 1d891835e7933f2c82bf636b0f27a936dc432e5c still hangs without --info or --debug, but with "--info 2 --debug 2" it finishes OK without errors.
This is now fixed in the latest 3.1.0dev version in the git repository and "nightly" tar file. I also removed the code that could hang from the 3.0.x branch, but the fix for the I/O error notification issue (that the hanging code was targeting) will now be handled only in the 3.1.0 release.