receiving machine: rsync-2.6.8-1.FC5.1 (Fedora Core 5)
sending machine: rsync-2.6.3-1 (Fedora Core 1)

The following command line parameters are used on the receiving machine:

    rsync --rsh=ssh \
        --archive \
        --compress \
        --update \
        --recursive \
        --sparse \
        --progress \
        --exclude-from=excludes.txt \
        --partial \
        --delete \
        --delete-excluded \
        user@sender:'/dest1 /dest2' dest-dir

rsync complains about two files. They are both sparse files and approximately 4G in size. When rsyncing, the following messages are produced:

    test1/cow
      4296024064 100% 5.55MB/s 0:12:18 (xfer#22, to-check=56572/57050)
    WARNING: test1/cow failed verification -- update retained (will try again).
    test3/cow
      4296028161 100% 5.50MB/s 0:12:24 (xfer#24, to-check=56569/57050)
    WARNING: test3/cow failed verification -- update retained (will try again).

then later in the backup:

    test1/cow
      4296024064 100% 5.97MB/s 0:11:26 (xfer#30, to-check=56572/57050)
    ERROR: test1/cow failed verification -- update retained.
    test3/cow
      4296028161 100% 5.75MB/s 0:11:52 (xfer#31, to-check=56569/57050)
    ERROR: test3/cow failed verification -- update retained.

Checking the md5sums of the files in question shows that they are not the same.

sending machine:

    093426c81424183de9162cc412e46eaf test1/cow
    04f4c4f4cd49160ba696423dd37fe7d2 test3/cow

receiving machine:

    5e1cef93f8e5542a097c178bab6b3688 test1/cow
    8573a3d344fadd61a38aefdc94e027f5 test3/cow

A second run of rsync does not even attempt to synchronize the files:

    user@sender's password:
    receiving file list ... 57050 files to consider
    sent 145 bytes received 952528 bytes 20937.87 bytes/sec
    total size is 63950448489 speedup is 67127.39

If I remove the files in question from the receiver and rsync again, the rsync completes normally. The md5sums also match.

I see in the changes to 2.6.7 that --inplace and --sparse can't be used together because "the sparse-output algorithm doesn't work when overwriting existing data". I'm not using --inplace, so I don't think this affects me. Also, the receiver is 2.6.8, which I believe makes this irrelevant.
I noticed something really similar. Using the rsync daemon to transfer a bunch of movie files, now and then a file fails with this error. Rsync is the same version on both sides, 2.6.8, from backports.debian.org.

Command line parameters:

    -a --stats --delete-excluded --delete-after --partial --password-file=/etc/rsync/aaa.passwd -W --size-only --exclude "/*/*.zip" rsync://syncuser@$HOST:/content/ /home/www/${HOST}.com/

    ERROR: marie/movies/marie_320.mpg failed verification -- update retained.

The resulting files' checksums are different. Sizes are the same.
On two systems, removal and retransfer helped. On one it does not.
Another example (with slight differences):

receiving machine: cwRsync 2.0.10 (W2003)
sending machine: rsync 2.5.7 (RHEL 3)

The bug occurs only with the -z flag, but does occur consistently with large files. (Also using the -e "ssh" and -r options.)
Hmmm... I'm having the same problems here: rsync 2.6.9 (Debian Sarge) server, version 2.6.3 at the sender. You should take a look at bug #2187, which describes exactly the same problem. However, Wayne Davison states that the bug was fixed in CVS sometime in February 2005. I don't know which version of rsync is the first to include this patch, but I think my version (2.6.3) is too old and still has bug #2187.
In file "fileio.c", line 31:

    static size_t sparse_seek = 0;

size_t is 4 bytes (so it tops out at 4GB), at least on my six-year-old P4; I don't know about 64-bit systems. So if there are more than 4GB of consecutive zeros, the "sparse_seek" variable overflows, and things go wrong... but the size of the file changes, so I'm not sure if this is related to this thread.

Just change it to:

    static off_t sparse_seek = 0;

and it works, or at least it looks like it works :). Maybe a more in-depth look should be taken at the "sparse" code (this also happens with l1 and l2, but I don't think anyone is going to have a +4GB buffer...)

Oh! This is in, at least, rsync 3.0.4 and in 3.0.3 on my gentoo... maybe much earlier too...
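To make the wrap-around concrete, here is a minimal sketch of the overflow described above. It is not rsync's code: the function names are hypothetical, and uint32_t is used to simulate a 4-byte size_t so the effect is also visible on 64-bit hosts, with int64_t standing in for OFF_T.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration, not rsync's code: add up the lengths of
 * consecutive all-zero chunks in a 32-bit unsigned counter, the way a
 * 4-byte size_t would behave on a 32-bit system. */
static uint32_t pending_seek_32(const uint32_t *chunks, int n)
{
    uint32_t seek = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        seek += chunks[i];      /* silently wraps modulo 2^32 */
    return seek;
}

/* The same accumulation with a 64-bit signed counter (stand-in for OFF_T). */
static int64_t pending_seek_64(const uint32_t *chunks, int n)
{
    int64_t seek = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        seek += chunks[i];
    return seek;
}
```

Fed five zero chunks of 1 GiB each, pending_seek_32() ends up holding 1 GiB while pending_seek_64() holds the full 5 GiB, so a seek based on the 32-bit value would land 4 GiB short.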
Created attachment 3728
Tweak spare_seek and a few other size_t vars

Thanks, Pedro! I agree that the sparse_seek variable should be an off_t (which is an OFF_T in rsync, to support some systems where off_t isn't as big as it should be). In looking at the size_t args of write_file() and write_sparse() and how they interact with int vars, it looks to me like some of the size_t values should also be ints. This patch changes the sparse_seek definition and makes some size_t vars ints.
Hello again! I'm not sure that the "nice" way to define them is as ints. I would check what exactly they mean, and either leave them as size_t or make them OFF_T. If they relate to a buffer position or to a function handling size_ts, leave them as size_t. However, if they relate to file positions, I think they should be OFF_Ts. Anyway, I think they definitely should not be ints, as they are never going to take a negative value and you are effectively halving the range in exchange for nothing. However, this is just my point of view, and I'm quite sure nobody needs a programming course here, so if you think ints are the way to go I'm not going to cry :)
The vars need to be consistent. Since the callers are passing int length values, and the return must be able to either return the input length or go negative (for an error), I think int is the right choice. The chunk size of a write will not overflow 31 bits (or indeed, even get close to that), so we should be fine.
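The convention argued for here (an int length going in, and either that same length or a negative value coming back) can be sketched as follows. This is a simplified stand-in and not rsync's write_file(): the name write_chunk and the in-memory sink are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sink: a small in-memory buffer standing in for a file. */
struct sink {
    char data[64];
    int used;
};

/* Copies len bytes into the sink. Returns len on success or -1 on error,
 * so the return type has to be signed and able to hold any valid len. */
static int write_chunk(struct sink *s, const char *buf, int len)
{
    int room = (int)sizeof(s->data) - s->used;

    assert(len >= 0);
    if (len > room)
        return -1;              /* not enough room: report an error */
    memcpy(s->data + s->used, buf, (size_t)len);
    s->used += len;
    return len;
}
```

With a size_t return type the -1 would turn into a huge positive value, which is why the parameter and the return value are kept as the same signed type in this sketch.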
Using 3.0.4 here to copy a huge (40GB) sparse file on the same machine, it stops at exactly 32G. I'm using rsync -S src dst. Without the -S option it completes at the correct size. I get neither a warning nor an error, even though the destination file has an incorrect size!

Timing with cp on the same hardware and same conditions:

    time rsync -S /Storage1/mail1.diskimg /Storage2/mail1.diskimg.backup
    real 15m12.434s
    user 5m42.237s
    sys 7m8.176s
    (with wrong file size)

    time cp --sparse=always /Storage1/mail1.diskimg /Storage2/mail1.diskimg.backup
    real 6m32.121s
    user 0m8.414s
    sys 2m45.551s

This is a KVM disk image, so the file should have quite long sparse zones.
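Since no warning or error is printed in this case, comparing the apparent sizes of source and destination after the copy is one way to catch a truncated file. The following is a minimal sketch assuming a POSIX system; the helper names apparent_size and same_apparent_size are hypothetical, and it only compares sizes, not contents.

```c
/* Ask for a 64-bit off_t on 32-bit systems so files over 2GB can be stat'ed. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Returns the apparent size in bytes (st_size), or -1 if stat() fails. */
static long long apparent_size(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}

/* Returns 1 if both files report the same apparent size, 0 if the sizes
 * differ, and -1 if either file cannot be stat'ed. */
static int same_apparent_size(const char *src, const char *dst)
{
    long long a = apparent_size(src);
    long long b = apparent_size(dst);

    if (a < 0 || b < 0)
        return -1;
    return a == b;
}
```

For the copy above, same_apparent_size("/Storage1/mail1.diskimg", "/Storage2/mail1.diskimg.backup") would be expected to return 0 when the destination stops at 32G.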
You need to use 3.0.5 for the fixed sparse variable.