client MS Win XP SP2 Cygwin 1.5.12 -> server Linux 2.4.27 both using rsync version 2.6.3 protocol version 28 clinet command: rsync.exe -r -t -v "/cygdrive/w/MuhlpachrM/My Documents/" michalm@emil::test client output: Password: building file list ... done ./ Bluetooth/ Default.rdp ... some directories and files ... Doc/Eurotel Internet.txt Doc/Exchange_2003_TechNet.ppt rsync: writefd_unbuffered failed to write 4 bytes: phase "unknown" [sender]: Connection reset by peer (104) rsync: read error: Connection reset by peer (104) rsync error: error in rsync protocol data stream (code 12) at /home/lapo/packaging/tmp/rsync-2.6.3/io.c(515) But server do not hangs connection, it waits for input and then timeout ocur. Server log: 2005/01/03 00:48:03 [23119] rsyncd version 2.6.3 starting, listening on port 873 2005/01/03 00:48:19 [23121] rsync to test from michalm@muhlpachr-n (192.168.0.3) 2005/01/03 00:48:20 [23121] ./ 2005/01/03 00:48:20 [23121] Bluetooth/ ... some directories and files ... 2005/01/03 00:48:21 [23121] recv muhlpachr-n [192.168.0.3] test (michalm) Doc/Eurotel Internet.txt 800 2005/01/03 00:58:21 [23121] rsync error: timeout in data send/receive (code 30) at io.c(153) 2005/01/03 00:58:21 [23121] rsync: connection unexpectedly closed (25858 bytes received so far) [generator] 2005/01/03 00:58:21 [23121] rsync error: error in rsync protocol data stream (code 12) at io.c(359) Server strace (diffrent session with same client hang): [pid 23247] write(3, "\6\0\0\tWRAP/\n", 10) = 10 [pid 23247] lstat64("WRAP/book.pdf", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] lstat64("WRAP/tb13.pdf", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] lstat64("WRAP/wrap1c.pdf", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] lstat64("assistance.pdf", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] lstat64("desktop.ini", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] lstat64("map.bat", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] lstat64("předání konsolidace ECH FUBAR.doc", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] lstat64("refman-5.0.pdf", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] lstat64("reg.reg", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] lstat64("reg.txt", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] lstat64("unison-2.10.2-manual.html", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] lstat64("unison-2.10.2-manual.pdf", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] lstat64("valc-4.asx", 0xbfffd810) = -1 ENOENT (No such file or directory) [pid 23247] select(4, NULL, [3], NULL, {60, 0}) = 1 (out [3], left {60, 0}) [pid 23247] write(3, "\10\1\0\7x\3\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0y\3\0\0"..., 268) = 268 [pid 23247] select(7, [6], [], NULL, {60, 0} <unfinished ...> [pid 23248] <... select resumed> ) = 0 (Timeout) [pid 23248] time(NULL) = 1104732886 [pid 23248] select(4, [3], [], NULL, {60, 0} <unfinished ...> [pid 23247] <... select resumed> ) = 0 (Timeout) [pid 23247] select(7, [6], [], NULL, {60, 0} <unfinished ...> [pid 23248] <... select resumed> ) = 0 (Timeout) [pid 23248] time(NULL) = 1104732946 ... same select/time calls repeated ... [pid 23248] select(4, [3], [], NULL, {60, 0} <unfinished ...> [pid 23247] <... select resumed> ) = 0 (Timeout) [pid 23247] select(7, [6], [], NULL, {60, 0} <unfinished ...> [pid 23248] <... select resumed> ) = 0 (Timeout) [pid 23248] time(NULL) = 1104733426 [pid 23248] rt_sigaction(SIGUSR1, {SIG_IGN}, {0x8052a70, [USR1], SA_RESTORER|SA_RESTART, 0x4006a678}, 8) = 0 [pid 23248] rt_sigaction(SIGUSR2, {SIG_IGN}, {0x8052aa0, [USR2], SA_RESTORER|SA_RESTART, 0x4006a678}, 8) = 0 [pid 23248] unlink("Directory/.ISE Czech organigramm.ppt.KiG34J") = 0 [pid 23248] getpid() = 23248 [pid 23248] write(7, "A\0\0\10rsync error: timeout in data"..., 69) = 69 Client exit approx 10 seconds after last file synchronized, server drop connection and exit after timeout (600 seconds). Same behaviour (same client error) ocurs while I am using ssh transport. The file at which transfer hangs is diffrent every time, but the error is same. Thanks for help and fix.
The error you cite just indicates that the other end of the connection went away. Your trace tells us that it went away because it timed out, so this looks like it just boils down to the old issue of the cygwin version of rsync sometimes hanging (though this is the first I've seen of someone encoutering the hang when using daemon mode). I've never been able to duplicate this hang, so I've never been able to investigate it. This is the first mention of this hang in bugzilla (previous discussion has been confined to the mailing list), so this will be the official spot to help out with the rsync-on-cygwin hanging problem.
Server side (other end of connection) went away more than 9 minutes later than client, so server do not close TCP connection (it remains open until server timeout - tested by netstat). But client reports TCP connection closed nearly immeditely after last file reported. There is problem at client side, correct me if I am wrong please. This client hang can be reproduced anytime on file set I use (cca 500 MB of files). I am not able to finish any single rsync transfer. At diffrent client PC which has same configuration - at least both clients PCs was istalled from same sourcs and I do not see any diffrences (Win XP SP2, same version of Cygwin) everything is OK and no hangs happend. How can I help to investigate bug ?
I am having a very similar problem. Possibly my comments can crack this issue. Using several different servers. Slackware with rsync 2.6.3 and FreeBSD 5.2.1 with rsync 2.6.3 and clients (several) running cygwin 1.5.12(0.116/4/2) on various versions of windows. My testbed systems are Windows XP Pro SP2 and a Windows 2003 Server System. I am trying to backup these systems to any one of the rsync servers. My command on the client is: C:\rsync\rsync.exe --compress --delete --ignore-errors --recursive --verbose -- times --modify-window=5 --stats -- progress '/cygdrive/C/' 'username@host.server.com::share_name/test_server/test_ set/C/' What seems to happen is that transfer will stop on the client at some point in time (say 2:00am) and then the client and the server will both get errors about an hour later. The client gets rsync: writefd_unbuffered failed to write 16385 bytes: phase "unknown" [sender]: Connection reset by peer (104) rsync: read error: Connection reset by peer (104) rsync error: error in rsync protocol data stream (code 12) at /home/lapo/packaging/tmp/rsync-2.6.3/io.c(515) The server gets... rsync: writefd_unbuffered failed to write 43 bytes: phase "unknown" [generator]: Broken pipe (32) rsync error: error in rsync protocol data stream (code 12) at io.c(915) Looking at these it would seem that at some point the client and server simply stop communicating even though they are still connected. Possibly a network hiccup happened in the middle of a transfer, didn't see anything obvious in minitoring the nextowrk. They then enter this "unknown" state and while they are there they hang out for 1 hour or so - during this time they appear to think they are still connected. After this time they both die with the above errors. This is not specific to any fileset or any specific file. It might happen after running for several days, or after a few minutes.
Well I'm not having exactly the same problem, since my win32 clients does rsync well, but sometimes (not always) hangs after it's finished. So it won't get back to console after the additional status information has been printed. I'm using the same rsync batch script on 8 different clients, and only at 2 of them this problem (sometimes) occur. Unfortunately it's not reproducible on demand.
I too am having this problem. I had been working fine for years. I just upgraded to all new hardware. I went from Win2k to WinXP and from what ever I had before to current version of cygwin/rsync. I use this in my edit/compile/test cycle. The source tree is on the WinXP box. I run a script to rsync(on ssh) the tree to the server and then compile with ssh. Since upgrading the rsync fails about 90% of the time with: rsync error: error in rsync protocol data stream (code 12) at /home/lapo/packaging/tmp/rsync-2.6.3/io.c(515) The client dies after a few seconds and the server sits there for a while. The server is running Fedora Core 1 with rsync 2.5.7.
I tried copying rsync.exe from an older but working installation (Win2k). This was rsync 2.6.0. That did not fix the problem. Tonight I will try copying in an entire working c:\cygwin tree and see if that helps. I have determined that the problem only seems to occur when pushing files up to the server. I have NOT seen problems doing: - retrieving new files from the server - dry runs to see what is different - upload runs where no files need to be sent.
Hi guys My rSync hangs too, it's a cywin-rsync.. It's a little different, there is no error message The following is all that is shown: receiving file list ... done zTransfer/html/thumbs/12592.jpg zTransfer/html/thumbs/12593.jpg zTransfer/html/thumbs/12594.jpg <blah blah blah.. > zTransfer/html/thumbs/12596.jpg zTransfer/html/thumbs/12597.jpg Then it just hangs like this, nothing else happens. The rsync lasts around 10seconds I don't know what is going on, rsyncing to other hosts works fine, just this host is acting up. The host is in a LAN with me and connection is fine. Firewall is off, What else can be the problem? Any reply will be appreciated Thanks!
Has anyone tried commenting out HAVE_SOCKETPAIR from config.h to see if that is the problem? I mentioned somewhere that this seems to be the common case of failure -- it fails for a local transfer and for a transfer going through a remote shell, both of which use a socketpair() to communicate (unless HAVE_SOCKETPAIR is undefined, in which case a normal pipe() would be used).
I'm marking the cygwin hang bugs as "LATER" because this is a bug is in the cygwin pipe code, so it is outside rsync's control. We'll revisit this issue later after we hear that the cygwin code has been fixed. I wonder if specifying a --bwlimit might work around the problem by ensuring that the pipes can't fill up enough to deadlock. While we're waiting for a cygwin fix, give that a try.
(In reply to comment #9) > I wonder if specifying a --bwlimit might work around the problem by ensuring > that the pipes can't fill up enough to deadlock. While we're waiting for a > cygwin fix, give that a try. --bwlimit workaround works like a charm! You can reproduce this problem by sending large file (like 1GB) over a WAN link (like 1Mbps) using rsync on cygwin.
I am curious about whether a tweak to the way --bwlimit is implemented (e.g., moving the sleep_for_bwlimit call *before* the write in writefd_unbuffered, or perhaps lowering bwlimit_writemax) would make it a more effective workaround. I encourage anyone who is running up against the hang to do the rest of us a favor by investigating this.
The --bwlimit work-around did not fix the problem for me. Instead, I found this page: http://marc-abramowitz.com/archives/2007/10/14/solving-rsync-hangs-with-cygwin/ which described running rsync on the client against a TCP daemon service (rather than over ssh) which solved the problem for me. After setting up an rsyncd.conf file and installing the rsyncd daemon as a service on the remote machine: cygrunsrv --install "rsyncd" --path /usr/bin/rsync --args "--daemon --no-detach" --desc "Starts a rsync daemon for accepting incoming rsync connections" --disp "Rsync Daemon" --type auto it worked without a hitch and is no longer stalling on large files.