I am trying to back up a Windows XP SP2 workstation to a Debian GNU/Linux server using ssh, cygwin and rsync. The XP rsync appears to lock up after transferring some files. It transfers a few more files on each attempt, but it hangs every time. The XP rsync process must be killed manually, even if I cancel the rsync on the Linux side.

The Linux account can ssh into XP using client keys. I can scp the entire source data without any trouble. There are no XP event log entries (except key authentication and service start-up).

The Linux-initiated rsync command:

    rsync -vrte ssh --stats --progress xpuser@xpsource:/cygdrive/c/Data/ /home/usershare/xpuser/

I can initiate the rsync from the XP machine and it runs smoothly:

    rsync -vrte ssh --stats --progress /cygdrive/c/Data/ debuser@destin:/home/usershare/xpuser/

Debian Sarge destination:

    Linux 2.4.26-1-686
    rsync version 2.6.3 protocol version 28
    OpenSSH_3.8.1p1 Debian-8.sarge.4, OpenSSL 0.9.7e 25 Oct 2004

Windows XP Pro SP2 source:

    Athlon 64 3200+
    cygwin 1.5.12-1
    cygrunsrv 1.0-1
    rsync 2.6.3-1
    openssh 3.9p1-2

The rsync-debug script dies immediately without transferring any files:

    protocol version mismatch - is your shell clean?
    (see the rsync man page for an explanation)
    rsync error: protocol incompatibility (code 2) at compat.c(60)

The log file on XP contains:

    rsync: writefd_unbuffered failed to write 4 bytes: phase "unknown" [sender]: Broken pipe (32)
    rsync error: error in rsync protocol data stream (code 12) at /home/lapo/packaging/tmp/rsync-2.6.3/io.c(909)

This created a zero-length out.dat file, so the remote shell itself looks clean:

    ssh xpuser@xpsource /bin/true > out.dat

Searching Google, this seemed to be the closest match: http://www.linuxquestions.org/questions/history/265520

Any help? Your time and expertise are greatly appreciated. Please forgive my newb-ness. Let me know if I should try anything else.

Chris
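For anyone repeating the "is your shell clean?" check from the rsync man page, the out.dat test above can be wrapped in a small helper. This is only a sketch: the function name check_clean_shell is made up for illustration, and it does nothing more than report whether the captured file contains any bytes.

```shell
#!/bin/sh
# Hypothetical helper: report whether a file captured from
#   ssh user@host /bin/true > out.dat
# is empty. Any bytes in it would corrupt the rsync protocol stream.
check_clean_shell() {
    capture="$1"
    if [ ! -e "$capture" ]; then
        echo "missing capture file"
        return 2
    fi
    if [ -s "$capture" ]; then
        # -s is true when the file exists and has a size greater than zero.
        echo "not clean: $(wc -c < "$capture" | tr -d ' ') stray bytes"
        return 1
    fi
    echo "clean"
    return 0
}

# Usage (host and user taken from the report above):
#   ssh xpuser@xpsource /bin/true > out.dat
#   check_clean_shell out.dat
```

A "clean" result only rules out stray output from login scripts; it says nothing about the hang itself.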
If you could try commenting out HAVE_SOCKETPAIR from config.h and re-compiling rsync, it would be nice to know if that makes rsync stop hanging. If you don't have the cygwin source, you should be able to use their setup.exe tool to grab it and build it using their patches (such as the one to open temp files in binary mode).
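A rough sketch of that edit, for reference. It assumes the generated config.h contains the line exactly as `#define HAVE_SOCKETPAIR 1` (the usual autoconf form, but check the file first); the function name disable_socketpair is made up for illustration.

```shell
#!/bin/sh
# Comment out the HAVE_SOCKETPAIR define in an autoconf-generated config.h.
# Assumption: the line reads exactly "#define HAVE_SOCKETPAIR 1".
disable_socketpair() {
    cfg="${1:-config.h}"
    # Keep a backup so the change is easy to revert.
    cp "$cfg" "$cfg.orig" || return 1
    sed 's|^#define HAVE_SOCKETPAIR 1$|/* #define HAVE_SOCKETPAIR 1 */|' \
        "$cfg.orig" > "$cfg"
}

# Usage, from the top of the rsync source tree after ./configure:
#   disable_socketpair config.h
#   make clean && make
```

Re-running ./configure regenerates config.h, so the edit has to be reapplied after that.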
(In reply to comment #0) See also http://www.cygwin.com/ml/cygwin/2003-10/msg00129.html
After commenting out HAVE_SOCKETPAIR, the behavior remained similar: rsync started remotely over ssh transferred a few files and then froze. It looks like it still transferred the file differences. runtests.sh failed on the daemon test while compiling. Attached is the compile output. I just tried initiating rsync using ssh.

(In reply to comment #1)
> If you could try commenting out HAVE_SOCKETPAIR from config.h and re-compiling [...snip...]
Created attachment 1024: rsync compile on cygwin with HAVE_SOCKETPAIR commented from config.h
What is working: I have rsyncd running from a startup batch file on Windows/cygwin:

    c:\cygwin\bin\rsync.exe --config=/cygdrive/c/cygwin/etc/rsyncd/rsyncd.conf --daemon --no-detach --address localhost

SSHD is running as a service using cygrunsrv. Trying to run rsyncd as a service gives me "event: rsyncd : PID 1468 : starting service `rsyncd' failed: signal 11 raised." Is there a way to get better information on the error?

rsync on the Linux backup server can use the SSH tunnel to the Windows machine, connect to the rsyncd daemon and successfully back up the "Module". It works on about 3GB of files (it fails on the Windows registry files, as expected).
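The tunnel setup described above can be sketched roughly as follows. The local port 8873 and the module name "Data" are placeholders rather than values from the actual rsyncd.conf, and 873 is only rsync's default daemon port, which applies if the config does not set a different one. The script prints the two commands instead of running them.

```shell
#!/bin/sh
# All values below are illustrative assumptions, not taken from a real config.
XP_LOGIN="xpuser@xpsource"      # ssh login from the original report
LOCAL_PORT=8873                 # hypothetical free port on the Linux server
MODULE="Data"                   # hypothetical module name from rsyncd.conf
DEST="/home/usershare/xpuser/"

# Print the ssh command that forwards LOCAL_PORT to rsyncd (port 873) on XP.
tunnel_command() {
    echo "ssh -f -N -L ${LOCAL_PORT}:localhost:873 ${XP_LOGIN}"
}

# Print the rsync command that pulls the module through the tunnel.
pull_command() {
    echo "rsync -vrt --stats --progress --port=${LOCAL_PORT} localhost::${MODULE}/ ${DEST}"
}

tunnel_command
pull_command
```

Because rsync talks to the daemon over a forwarded TCP socket here, the XP rsync is not started by sshd with ssh pipes as its stdin/stdout, which may be why this path avoids the hang.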
My belief is that rsync over ssh is tickling a deadlock race condition in cygwin. See this message and trace it backwards for more context: http://cygwin.com/ml/cygwin-patches/2005-q1/msg00015.html I have recently re-volunteered to the author to help out getting his patches tested but have heard nothing yet. If this is the cause, then it is deep in the cygwin interaction with some ill-defined system calls for queues. Note that one possible workaround is to "push" from the Windows system rather than to "pull" it from another system, although this is not always possible because of firewalls.
I certainly suspected that this was a problem in the cygwin pipe/socketpair handling. Thanks for the extra testing Chris, and for the work on getting this fixed in cygwin Jim!
I'm marking the cygwin hang bugs as "LATER" because this is a bug in the cygwin pipe code, so it is outside rsync's control. We'll revisit this issue after we hear that the cygwin code has been fixed. I wonder if specifying a --bwlimit might work around the problem by ensuring that the pipes can't fill up enough to deadlock. While we're waiting for a cygwin fix, give that a try.
I tried the bandwidth limit option with the same lock-up behavior. I think I had the setting down to 8 (8KB/s?). Is there a thread or bug reference with CYGWIN?
(In reply to comment #9) > Is there a thread or bug reference with CYGWIN? The previously mentioned link is the best reference: http://cygwin.com/ml/cygwin-patches/2005-q1/msg00015.html Cygwin seems to only sort of use bugzilla. The community prefers to use the various mailing lists to track and work things out.
The legendary cygwin/rsync/ssh hang problem. I have been tracking this for a while now and can say that the latest cygwin install appears to have fixed the problem on one of the setups that has consistently failed in the past. I have not put the update on any production boxes yet, but it looks promising. From the threads that I have read on the cygwin mailing lists, it would seem that a pipe problem in cygwin1.dll has been resolved (non-blocking pipes that blocked?).

The relevant cygcheck -s info:

    $ cygcheck -s
    Cygwin Configuration Diagnostics
    Current System Time: Wed Apr 13 14:30:19 2005
    Windows XP Home Edition Ver 5.1 Build 2600 Service Pack 2
    .
    .
    Cygwin DLL version info:
        DLL version: 1.5.14
        DLL epoch: 19
        DLL bad signal mask: 19005
        DLL old termios: 5
        DLL malloc env: 28
        API major: 0
        API minor: 126
        Shared data: 4
        DLL identifier: cygwin1
        Mount registry: 2
        Cygnus registry name: Cygnus Solutions
        Cygwin registry name: Cygwin
        Program options name: Program Options
        Cygwin mount registry name: mounts v2
        Cygdrive flags: cygdrive flags
        Cygdrive prefix: cygdrive prefix
        Cygdrive default prefix:
        Build date: Fri Apr 1 13:40:00 EST 2005
        Shared id: cygwin1S4
    .
    .
    rsync 2.6.3-1
    .
    .
    openssh 4.0p1-1

Regards,
Mike
(In reply to comment #11)
Have you had a chance to further test rsync with cygwin? I just updated cygwin on a Windows XP Pro machine and tried several times to initiate rsync over SSH from Debian Sarge; unfortunately, the transfer still hangs. I can use an SSH tunnel initiated from Debian and then start a sync using the rsyncd (daemon) running on WinXP.

    Cygwin DLL version info:
        DLL version: 1.5.14
        DLL epoch: 19
        DLL bad signal mask: 19005
        DLL old termios: 5
        DLL malloc env: 28
        API major: 0
        API minor: 126
        Shared data: 4
        DLL identifier: cygwin1
        Mount registry: 2
        Cygnus registry name: Cygnus Solutions
        Cygwin registry name: Cygwin
        Program options name: Program Options
        Cygwin mount registry name: mounts v2
        Cygdrive flags: cygdrive flags
        Cygdrive prefix: cygdrive prefix
        Cygdrive default prefix:
        Build date: Fri Apr 1 13:40:00 EST 2005
        Shared id: cygwin1S4

    openssh 4.0p1-1
    rsync 2.6.3-1

--
Chris Finley
(In reply to comment #0)
I have Windows XP Professional installed (new installation last week) with rsync 2.6.3. Before the rebuild I was using rsync OK (sorry, I don't know which version) from XP to a Solaris rsync daemon (version 2.5.5) with no problems. I now get the same lock-up/hang that Chris describes, but the only way out is to power off the XP machine; killing rsync doesn't help, since it appears that the network connection has been trashed. The command is a simple:

    rsync -va <filename> <machine>::Dir/.

rsync works OK when I mount the remote drive locally (destination is /cygdrive/h/Dir/.) or if I use <username>@<machine>:/home/<machine>/Dir/. as the destination, which goes through ssh with no problems.