Bug 5795 - look into an improved tear-down processing during fatal errors
Status: RESOLVED FIXED
Product: rsync
Classification: Unclassified
Component: core
Version: 3.1.0
Hardware: x86 Mac OS X
Importance: P3 enhancement
Target Milestone: ---
Assigned To: Wayne Davison
QA Contact: Rsync QA Contact
Depends on:
Blocks:
Reported: 2008-09-25 23:40 UTC by Dave Yost
Modified: 2011-12-24 22:18 UTC (History)
Description Dave Yost 2008-09-25 23:40:33 UTC
0 424 Z% rsync -av --sparse --progress --partial  /Users/yost/Documents/VMWare/winxp.vmwarevm  /Volumes/x/vmware
sending incremental file list
winxp.vmwarevm/
winxp.vmwarevm/winxp.vmdk
  1578172416  17%   11.08MB/s    0:10:35
rsync: writefd_unbuffered failed to write 4 bytes [sender]: Broken pipe (32)
rsync: connection unexpectedly closed (32 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(632) [sender=3.0.4]
12 425 Z% rsync --version
rsync  version 3.0.4  protocol version 30
Copyright (C) 1996-2008 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
    64-bit files, 32-bit inums, 32-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
    append, ACLs, xattrs, iconv, symtimes

rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details.
0 426 Z%
Comment 1 Dave Yost 2008-09-27 18:23:52 UTC
It turns out there was clearly not enough space on the destination. The copy was going to take 20GB, and after the crash there was still 1GB free on the destination, so I thought this was probably a protocol bug rather than a problem with running out of space.

Perhaps this is a situation where the remote end should have sent back an indication that it was out of space and then shut down gracefully.

But there is another problem here: the two ends should start out by negotiating whether there is enough space for the copy. In a better world, the destination OS would give the rsync process the ability to atomically grab disk resources up front to be used for the files and folders it creates, and if that fails, the remote rsync would tell the UI rsync no dice. Or how about this: writing the destination files could be a transaction in the OS file system! Nah.
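
For illustration only, the "is there enough space up front" idea can be approximated on a single machine with a pre-flight check. This is a rough sketch, not something rsync does: has_room and its reserve parameter are hypothetical names, the check is not atomic (another process can consume the space afterwards), and with --sparse or delta transfers the number of bytes actually needed is only an estimate.

```python
import shutil


def has_room(dest_dir, bytes_needed, reserve=0):
    """Return True if the filesystem holding dest_dir reports enough free space.

    Hypothetical helper: this is a snapshot, not a reservation, so the
    space can still run out while the copy is in progress.
    """
    free = shutil.disk_usage(dest_dir).free
    return free >= bytes_needed + reserve
```

A caller could refuse to start a 20GB copy when has_room(dest, 20 * 10**9) is false, which would have flagged the situation above before any data was sent.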
Comment 2 Wayne Davison 2008-10-02 13:53:49 UTC
Yes, the error reporting coming back from some errors can indeed be lacking.  However, the pipe-lined nature of the protocol can make this hard to overcome (the error can be behind so much checksum data that it can't make it back prior to the connection getting torn down).  In 3.1.0, I have a new option, --msgs2stderr, that can often be used to debug such situations (for non-daemon transfers).

It would be good to investigate a reliable way to drain (and discard) the pending data so that all the relevant messages get through more reliably. For instance, if a new "fatal exit in progress" message were added, it could be sent and circle the 3 processes before the connection is torn down.  E.g., a write error on the receiver sends the error message (text) to the generator, sends the fatal message too, and then just discards file-change data until it gets the fatal message back from the sender.
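
The circling "fatal exit in progress" idea can be sketched as a toy simulation. This is only a model under stated assumptions: the DATA/ERROR/FATAL tags, the queue names, and simulate_teardown are all made up for illustration and are not rsync's wire protocol. Each link in the receiver -> generator -> sender -> receiver ring is modelled as a FIFO, so the error text and the fatal marker queue up behind whatever pipelined data is already in flight.

```python
from collections import deque

# Hypothetical message tags; not rsync's real message codes.
DATA, ERROR, FATAL = "data", "error", "fatal"


def simulate_teardown(queued_checksums, queued_file_data, error_text):
    """Model one fatal write error on the receiver.

    Returns (error texts seen by the sender, data items the receiver
    discarded before the fatal marker came back around the ring).
    """
    # Pipelined data already in flight when the error happens.
    gen_to_sender = deque((DATA, None) for _ in range(queued_checksums))
    sender_to_recv = deque((DATA, None) for _ in range(queued_file_data))

    # Receiver: report the error, then announce the fatal exit.
    recv_to_gen = deque([(ERROR, error_text), (FATAL, None)])

    # Generator: forward both; they land behind the queued checksum data.
    gen_to_sender.extend(recv_to_gen)

    # Sender: handle its input strictly in order.
    seen_by_sender = []
    for kind, payload in gen_to_sender:
        if kind == DATA:
            sender_to_recv.append((DATA, None))   # file data for those checksums
        elif kind == ERROR:
            seen_by_sender.append(payload)        # the message made it through
        else:
            sender_to_recv.append((FATAL, None))  # pass the marker along

    # Receiver: discard file data until the marker comes back, then exit.
    discarded = 0
    for kind, _ in sender_to_recv:
        if kind == FATAL:
            return seen_by_sender, discarded
        discarded += 1
    raise RuntimeError("fatal marker never came back")
```

With 3 checksum batches and 2 file-data items in flight, the receiver throws away 5 items before it sees the marker again, and the sender has seen the error text by then; the point of the design is that nobody closes the connection until the marker has made the full loop.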
Comment 3 Dave Yost 2008-10-02 15:47:21 UTC
Why not have the process generating the error delay exit until it gets an ack of the error message? A separate connection would be best so the pipelined data doesn't get in the way.
Comment 4 Wayne Davison 2008-10-03 00:35:00 UTC
That's what I just said.
Comment 5 Matt McCutchen 2009-11-27 20:45:47 UTC
This is fixed in the current development rsync, right?  When it gets an error writing to the socket, it continues to read for messages from the other side.
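
A minimal sketch of the behaviour described here (keep reading after a failed write), assuming a plain stream socket rather than rsync's multiplexed I/O. write_then_drain is a hypothetical helper, not a function from the rsync source, and the exact errno and the availability of buffered data after the peer closes can vary by platform.

```python
import socket


def write_then_drain(sock, payload):
    """Try to send payload; if the write fails, read what the peer left behind.

    Returns (write_error_or_None, bytes received after the failure).
    """
    try:
        sock.sendall(payload)
        return None, b""
    except OSError as exc:          # e.g. broken pipe or connection reset
        write_error = exc

    chunks = []
    while True:
        try:
            chunk = sock.recv(4096)
        except OSError:             # nothing more can be read
            break
        if not chunk:               # orderly end of stream
            break
        chunks.append(chunk)
    return write_error, b"".join(chunks)
```

For example, with ours, theirs = socket.socketpair(), if the peer writes an explanatory message and closes before we write, the failed write is reported together with that message instead of the message being lost.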
Comment 6 Wayne Davison 2011-12-24 22:18:04 UTC
The upcoming 3.1.0 tries to ensure that it drains the errors from the transfer, so hopefully this is fixed for you.