The Samba-Bugzilla – Bug 5795
look into improved tear-down processing during fatal errors
Last modified: 2011-12-24 22:18:04 UTC
0 424 Z% rsync -av --sparse --progress --partial /Users/yost/Documents/VMWare/winxp.vmwarevm /Volumes/x/vmware
sending incremental file list
1578172416 17% 11.08MB/s 0:10:35
rsync: writefd_unbuffered failed to write 4 bytes [sender]: Broken pipe (32)
rsync: connection unexpectedly closed (32 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(632) [sender=3.0.4]
12 425 Z% rsync --version
rsync version 3.0.4 protocol version 30
Copyright (C) 1996-2008 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
64-bit files, 32-bit inums, 32-bit timestamps, 64-bit long ints,
socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
append, ACLs, xattrs, iconv, symtimes
rsync comes with ABSOLUTELY NO WARRANTY. This is free software, and you
are welcome to redistribute it under certain conditions. See the GNU
General Public Licence for details.
0 426 Z%
It turns out there was simply not enough space on the destination. The copy needed about 20GB, yet after the crash there was still 1GB free on the destination, so at first I thought this was a protocol bug rather than a problem with running out of space.
Perhaps this is a situation where the remote end should have sent back an indication that it was out of space, and then shut down gracefully.
But there is another problem here: the two ends should start out by negotiating whether there is enough space for the copy. In a better world, the destination OS would give the rsync process the ability to atomically grab disk resources up front for the files and folders it will create, and if that failed, the remote rsync would tell the UI rsync no dice. Or how about this: writing the destination files could be transactional in the OS file system! Nah.
Yes, the error reporting coming back from some errors can indeed be lacking. However, the pipelined nature of the protocol can make this hard to overcome (the error can be queued behind so much checksum data that it can't make it back before the connection gets torn down). In 3.1.0, I have a new option, --msgs2stderr, that can often be used to debug such situations (for non-daemon transfers).
It would be good to investigate a reliable way to drain (and discard) the pending data so that all the relevant messages get through. For instance, if a new message were added, "fatal exit in progress", it could be sent and circulate among the three processes before the connection is torn down. E.g. a write error on the receiver sends the error message (text) to the generator, sends the fatal message too, and then just discards file-change data until it gets the fatal message back from the sender.
Why not have the process generating the error delay its exit until it receives an acknowledgment of the error message? A separate connection would be best, so the pipelined data doesn't get in the way.
That's what I just said.
This is fixed in the current development rsync, right? When it gets an error writing to the socket, it continues to read for messages from the other side.
The upcoming 3.1.0 tries to ensure that it drains the errors from the transfer, so hopefully this is fixed for you.