it seems Bug# 1529 has not been resolved totally. When I rsync a large file greater than 4G with -z, I keep getting verification failed. If I remove -z option, everything becomes ok. My envirnment is windows 2000 adv server with latest cygwin.
Created attachment 869 [details] error log
by the way, I'm using rsync in cygwin 1.5.12-1.
I'd suggest adding "--partial" to the options and comparing the copied file (which will not be discarded) with a valid version of the file. If the corruption is minor, seeing what is different might clue us in as to what is going wrong. If the corruption is major, try adding a couple more files to the command-line after the failing file -- if rsync experiences some kind of a protocol failure, then there is probably a problem with the bytes making it over the socket. If it transfers the following files OK, we'll need to do more debugging. Is this problem unique to the one file you mentioned? Or does it fail for all files?
I think it's a common problem. In fact I have been transfered a list of files which become larger everyday for a long time using rsync. When XYSHAGLS.BAK was not larger than 4GB, everything was ok. But when the size beyond 4GB, I found my script got the verification failed error, which I considered as Bug# 1529. After Bug# 1529 was resolved, I found I still got that error. So I try to remove -z in the parameters. It works. Now, I transfer other files which are all small than 4GB with -z, transfer this file without -z. If without -z, Transfering speed is very very slow.
I added --partial to the options and got two files same in size,but not same in md5 checksum. And then I did the following test. First, I made two test files, test and test3.dat. Test is bigger than 4GB. Test3.dat is smaller than 4GB. Second, I copy the two files to destinction. Then in the source I added a file at the end of the two files by this command: copy /B test+somefile test Then I rsync them to the destinction: $ rsync -az -vvv backup@jf-deptest::backup/test* ./ opening tcp connection to jf-deptest port 873 Password: receiving file list ... recv_file_name(test) recv_file_name(test3.dat) received 2 names done recv_file_list done get_local_name count=2 ./ generator starting pid=1784 count=2 delta transmission enabled recv_generator(test,0) generating and sending sums for 0 count=75314 rem=8792 blength=75304 s2length=4 flength=5671378944 recv_files(2) starting recv_generator(test3.dat,1) generating and sending sums for 1 count=53255 rem=20480 blength=53248 s2length=4 flength=2835689472 recv_files(test) test recv mapped test of size 5671378944 got file_sum WARNING: test failed verification -- update discarded (will try again). generate_files phase=1 recv_generator(test,0) generating and sending sums for 0 count=75314 rem=8792 blength=75304 s2length=16 flength=5671378944 recv_files(test3.dat) test3.dat recv mapped test3.dat of size 2835689472 got file_sum set modtime of .test3.dat.002916 to (1104284751) Wed Dec 29 09:45:51 2004 renaming .test3.dat.002916 to test3.dat recv_files phase=1 generate_files phase=2 generate_files finished recv_files(test) test recv mapped test of size 5671378944 got file_sum ERROR: test failed verification -- update discarded. recv_files finished sent 2535000 bytes received 4094667 bytes 1429.89 bytes/sec total size is 8526412800 speedup is 1286.10 rsync error: some files could not be transferred (code 23) at /home/lapo/packagi ng/tmp/rsync-2.6.3/main.c(1146) _exit_cleanup(code=23, file=/home/lapo/packaging/tmp/rsync-2.6.3/main.c, line=11 46): about to call exit(23) I got the same error. Hope this can help.
I did another test. I got a file which is about 180M. See the following test log: $ rsync -az -vvv -B 65535 backup@jf-deptest::backup/cscnc1.dmp ./ opening tcp connection to jf-deptest port 873 Password: receiving file list ... recv_file_name(cscnc1.dmp) received 1 names done recv_file_list done get_local_name count=1 ./ generator starting pid=2720 count=1 delta transmission enabled recv_generator(cscnc1.dmp,0) generating and sending sums for 0 count=2846 rem=47953 blength=65535 s2length=3 flength=186495028 recv_files(1) starting generate_files phase=1 recv_files(cscnc1.dmp) cscnc1.dmp recv mapped cscnc1.dmp of size 186495028 got file_sum set modtime of .cscnc1.dmp.002512 to (1104401967) Thu Dec 30 18:19:27 2004 renaming .cscnc1.dmp.002512 to cscnc1.dmp recv_files phase=1 generate_files phase=2 generate_files finished recv_files finished sent 20063 bytes received 6491 bytes 689.71 bytes/sec total size is 186495054 speedup is 7023.24 _exit_cleanup(code=0, file=/home/lapo/packaging/tmp/rsync-2.6.3/main.c, line=114 6): about to call exit(0) administrator@jf-deptest ~ $ rsync -az -vvv -B 65536 backup@jf-deptest::backup/cscnc1.dmp ./ opening tcp connection to jf-deptest port 873 Password: receiving file list ... recv_file_name(cscnc1.dmp) received 1 names done recv_file_list done get_local_name count=1 ./ generator starting pid=2408 count=1 delta transmission enabled recv_generator(cscnc1.dmp,0) generating and sending sums for 0 count=2846 rem=45134 blength=65536 s2length=3 flength=186495054 recv_files(1) starting generate_files phase=1 recv_files(cscnc1.dmp) cscnc1.dmp recv mapped cscnc1.dmp of size 186495054 got file_sum WARNING: cscnc1.dmp failed verification -- update discarded (will try again). recv_generator(cscnc1.dmp,0) generating and sending sums for 0 count=2846 rem=45134 blength=65536 s2length=16 flength=186495054 recv_files phase=1 generate_files phase=2 generate_files finished recv_files(cscnc1.dmp) cscnc1.dmp recv mapped cscnc1.dmp of size 186495054 got file_sum ERROR: cscnc1.dmp failed verification -- update discarded. recv_files finished sent 77003 bytes received 12189 bytes 1382.82 bytes/sec total size is 186495067 speedup is 2090.94 rsync error: some files could not be transferred (code 23) at /home/lapo/packagi ng/tmp/rsync-2.6.3/main.c(1146) _exit_cleanup(code=23, file=/home/lapo/packaging/tmp/rsync-2.6.3/main.c, line=11 46): about to call exit(23) Please notise when block size is set beyond to 64K, it failed. I don't know whether it's the cause of the problem. I also did the other experience. see the log: $ rsync -a -vvv -B 65536 backup@jf-deptest::backup/cscnc1.dmp ./ opening tcp connection to jf-deptest port 873 Password: receiving file list ... recv_file_name(cscnc1.dmp) received 1 names done recv_file_list done get_local_name count=1 ./ generator starting pid=864 count=1 delta transmission enabled recv_generator(cscnc1.dmp,0) generating and sending sums for 0 count=2846 rem=45134 blength=65536 s2length=3 flength=186495054 recv_files(1) starting generate_files phase=1 recv_files(cscnc1.dmp) cscnc1.dmp recv mapped cscnc1.dmp of size 186495054 got file_sum set modtime of .cscnc1.dmp.002716 to (1104402055) Thu Dec 30 18:20:55 2004 renaming .cscnc1.dmp.002716 to cscnc1.dmp recv_files phase=1 generate_files phase=2 generate_files finished recv_files finished sent 20062 bytes received 56714 bytes 642.48 bytes/sec total size is 186495067 speedup is 2429.08 _exit_cleanup(code=0, file=/home/lapo/packaging/tmp/rsync-2.6.3/main.c, line=114 6): about to call exit(0) I removed the -z option and used 65536 as block size, it passed this time.
Thanks for the additional details -- it has been helpful. I have found a problem with the sending of compressed tokens when the blocksize is larger than the CHUNK_SIZE, and am working on an appropriate fix. When that is done, we can see if it fixes the original verification problem you reported (which I haven't been able to duplicate).
I can confirm this bug under both AIX 5.2 and NetBSD 2.0, with 2.6.3 and CVS-HEAD as of 2005- 01-21. I also have a somewhat manageable test case which results in "verification failed" - a pair of 10 MB files available at: ftp://stix.homeunix.net/pub/rsyncbug2187/file1.gz ftp://stix.homeunix.net/pub/rsyncbug2187/file2.gz Test blocksize is as reported by rsync on an actual transfer of a file a little over 4 GB. Test command was: ksh$ rsync -B 66552 -acvvz --stats --inplace -e /usr/bin/rsh /tmp/z/file1 localhost:/tmp/z/file2 opening connection using /usr/bin/rsh localhost rsync --server -vvlogDtprcz -B66552 --inplace . /tmp/z/file2 building file list ... [sender] expand file_list to 131072 bytes, did move done delta transmission enabled file1 WARNING: /tmp/z/file2 failed verification -- update retained (will try again). file1 total: matches=306 tag_hits=1898 false_alarms=0 data=665520 ERROR: /tmp/z/file2 failed verification -- update retained. Number of files: 1 Number of files transferred: 2 Total file size: 10485760 bytes Total transferred file size: 20971520 bytes Literal data: 665520 bytes Matched data: 20306000 bytes File list size: 61 Total bytes sent: 290611 Total bytes received: 4168 sent 290611 bytes received 4168 bytes 196519.33 bytes/sec total size is 10485760 speedup is 35.57 rsync error: some files could not be transferred (code 23) at main.c(702) ksh$ cmp /tmp/z/file? /tmp/z/file1 /tmp/z/file2 differ: char 3726913, line 11513
Created attachment 965 [details] Fix problem with implicit (unsent) data I was able to reproduce your test case (much appreciated!) and the result is the fix in the attached patch. Apparently the zlib data handling is sensitive to how the implicit (unsent) data is broken up. The receiving side is kluging up a literal-data zlib header for unsent data, and this must be 65535 bytes or less in length. The sending side was throwing a whole block of data at the deflate() call, which apparently creates a potential mismatch in the data streams if the blocksize is larger than 65535 bytes. I changed the code to break up the blocks on the sending side in the same increments that the receiving side uses, and it fixed the problem. I hope the fix works for all large block sizes that were having a problem.
Checked fix into CVS.