Bug 1937 - timeout in data send/receive
Summary: timeout in data send/receive
Status: CLOSED FIXED
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 2.6.3
Hardware: Sparc Solaris
Importance: P3 normal
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
 
Reported: 2004-10-15 03:23 UTC by Ade Rixon
Modified: 2005-04-01 11:21 UTC

Description Ade Rixon 2004-10-15 03:23:38 UTC
We use rsync over SSH 3.9p1 on Solaris 8 in a script that transfers files from a
series of hosts. Since updating from 2.6.0 to 2.6.2, we see random timeout errors
on every run:
rsync error: timeout in data send/receive (code 30) at io.c(143)
rsync: connection unexpectedly closed (147189 bytes read so far)
rsync error: error in rsync protocol data stream (code 12) at io.c(342)
rsync: connection unexpectedly closed (701 bytes read so far)
rsync error: error in rsync protocol data stream (code 12) at io.c(342)

Running rsync with:
SSH='ssh -oCipher=blowfish-cbc'
rsync -azq -e "${SSH}" --timeout=60 host:/dir/ /localdir/
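
For anyone scripting the same transfer across several hosts, the invocation above could be wrapped roughly as follows. This is only a sketch: the flags and the two exit-code descriptions are copied from this report, while the function names and the idea of returning a description string are illustrative assumptions, not part of the reporter's script.

```python
import subprocess

# Exit codes and messages as quoted in the error output of this report.
RSYNC_EXIT_MEANINGS = {
    12: "error in rsync protocol data stream",
    30: "timeout in data send/receive",
}


def build_rsync_command(host, remote_dir, local_dir, timeout=60):
    """Build an argv list equivalent to the shell invocation in the report."""
    ssh = "ssh -oCipher=blowfish-cbc"
    return [
        "rsync", "-azq",
        "-e", ssh,
        f"--timeout={timeout}",
        f"{host}:{remote_dir}",
        local_dir,
    ]


def run_transfer(host, remote_dir, local_dir):
    """Run one transfer (hypothetical helper) and return (exit_code, description)."""
    result = subprocess.run(build_rsync_command(host, remote_dir, local_dir))
    code = result.returncode
    if code == 0:
        return code, "ok"
    return code, RSYNC_EXIT_MEANINGS.get(code, "other error")
```

Logging the exit code per host this way would make it easier to tell which hosts hit code 30 and which only saw the follow-on code 12.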

Here's the end of a truss trace from the remote side:
8050:   write(1, "FC0F\007E3 ~ { z z sFADD".., 4096)    = 4096
8050:   time()                                          = 1097834603
8050:   poll(0xFFBE6058, 1, 60000)      (sleeping...)
8050:   poll(0xFFBE6058, 1, 60000)                      = 0
8050:   time()                                          = 1097834663
8050:   sigaction(SIGUSR1, 0xFFBE5F50, 0xFFBE5FD0)      = 0
8050:   sigaction(SIGUSR2, 0xFFBE5F50, 0xFFBE5FD0)      = 0
8050:   poll(0xFFBE48A0, 1, 60000)      (sleeping...)
8050:   poll(0xFFBE48A0, 1, 60000)                      = 0
8050:   time()                                          = 1097834723
8050:   sigaction(SIGUSR1, 0xFFBE4798, 0xFFBE4818)      = 0
8050:   sigaction(SIGUSR2, 0xFFBE4798, 0xFFBE4818)      = 0
8050:   poll(0xFFBE30E8, 1, 60000)      (sleeping...)
8050:   poll(0xFFBE30E8, 1, 60000)                      = 1
8050:   write(1, " A\0\0\b r s y n c   e r".., 69)      = 69
8050:   time()                                          = 1097834738
8050:   llseek(0, 0, SEEK_CUR)                          Err#29 ESPIPE
8050:   _exit(30)
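
The gaps between the time() calls in the trace can be worked out as below. This is plain arithmetic on the values shown, not rsync code; the variable and function names are made up for illustration.

```python
# time() return values copied from the truss output (seconds since the epoch).
timestamps = [1097834603, 1097834663, 1097834723, 1097834738]


def intervals(values):
    """Return the difference between each consecutive pair of values."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


gaps = intervals(timestamps)
print(gaps)       # [60, 60, 15]
print(sum(gaps))  # 135
```

The two 60-second gaps line up with the poll() calls that returned 0 after their 60000 ms timeout, which matches --timeout=60. The process exited about 135 seconds after the time() call that followed the last 4096-byte write.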

In this case, the transfer failed only a few files in.

This looks similar to bug #1476.
Comment 1 Wayne Davison 2005-02-25 16:21:54 UTC
Hopefully this is the bug I just checked in a fix for.  In the bug I just
fixed, the generator was taking too long to send data to the receiver, causing
it to time out.  If you can still reproduce this with the latest CVS source (when
talking via protocol 29), please reopen this bug.