Running local to local, hard-link support has exhibited no obvious problems.
Running local to remote, a file with, say, 153 hard links is sent 153 times and stored on the remote as 153 separate copies. Both ends are running 3.0.2.
This seems to be independent of the recursion options: it shows up without -r, with -r, and with -r --no-i-r.
- interaction between 3.* and 2.*
- running remote to remote, or remote to local
Running local 2.6.9 to remote 3.0.2 works.
Running local 3.0.2 to remote 3.0.2 works for _some_ hardlinks. I haven't isolated which conditions cause failure.
Any more info on this?
Created attachment 3431
output from 2.6.9 to 3.0.3 showing hardlink success
Created attachment 3432
output from 3.0.3 to 3.0.3 showing hardlink failure (e.g. no => lines)
Things to try:
Use --protocol=29 on the 3.0.3 -> 3.0.3 transfer and see if that makes the hard links work.
Use 3.1.0dev on both systems (either from the git repository or the latest nightly tar file) with the --debug=hlink4 option (there is no need for so much general verbosity, nor for the --protocol=29 option); that might help to illuminate what is happening.
--protocol=29 with 3.0.3 -> 3.0.3 didn't work
Using --debug=hlink4 with rsync-HEAD-20080727-2332GMT both ends didn't work either.
# rsync -e rsh -avvRHW --debug=hlink4 --delete-after /rescue remotemachine:/
opening connection using: ...
total: matches=0 hash_hits=0 false_alarms=0 data=652232880
deleting in rescue
sent 652320746 bytes received 2922 bytes 17871881.32 bytes/sec
total size is 652232880 speedup is 1.00
Were there some changes for the rsync to update?
I'd like to see the debug output that was generated by the --debug=hlink4 run. And know what files should have been hard-linked that weren't.
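One way to produce the list of files that should have been hard-linked is to group them by device and inode number on each side and compare the results. The following is only a rough sketch, not part of rsync: `hard_link_groups` is a made-up helper name, and it looks at regular files only.

```python
import os
import stat
import sys
from collections import defaultdict


def hard_link_groups(root):
    """Return {(st_dev, st_ino): [paths]} for regular files under root
    that share an inode with at least one other file under root."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # lstat: do not follow symlinks
            # A link count of 1 means the file has no other names at all.
            if stat.S_ISREG(info.st_mode) and info.st_nlink > 1:
                groups[(info.st_dev, info.st_ino)].append(path)
    # Drop files whose other links live outside the tree being scanned.
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    # Usage: python hlgroups.py DIR [DIR ...]  (the script name is arbitrary)
    for root in sys.argv[1:]:
        for (dev, ino), paths in sorted(hard_link_groups(root).items()):
            print(f"dev={dev} ino={ino} links_in_tree={len(paths)}")
            for path in paths:
                print(f"  {path}")
```

Running it on the source tree and again on the destination after the transfer should show the same groupings if -H did its job; a group that exists only on the source side is a set of files that was sent as separate copies.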
I'm emptying out the destination directory before each run (though there are the same results leaving one or two files in there).
There's no difference in the output shown with --debug=hlink4 (except the first line redisplaying the running options and the stats line at the end)
Am I missing some compile time thing?
Running with --debug=all4 gives no '=>' outputs. Is that still supposed to be occurring near the end of phase 1?
Created attachment 3438
Adding 1 to the dev number to avoid a 0
Interesting. I think that can mean only one thing: you have a device that has a number of 0. Try running this perl command after you cd into the source dir:
perl -e 'print "dev: ", (stat("."))[0], "\n"'
If it prints "dev: 0", that is the problem and this patch should solve it.
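If perl is not handy, the same check can be done with Python's `os.stat`. This is just an equivalent sketch; `device_number` and `has_zero_device` are illustrative names, not anything rsync provides.

```python
import os


def device_number(path="."):
    """Return the st_dev value of the filesystem object at path."""
    return os.stat(path).st_dev


def has_zero_device(path="."):
    """True if path lives on a device whose number is 0."""
    return device_number(path) == 0


if __name__ == "__main__":
    # Run this after cd'ing into the source directory.
    print(f"dev: {device_number('.')}")
    if has_zero_device("."):
        print("device number is 0, so this is the case the patch addresses")
```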
Fantastic. That is indeed the situation. "/" mount on NetBSD seems to always have device as 0 (at least on all my machines). The patch did work for me.
I do have a device 0x80000002. I guess it's conceivable that someone could have a device 0xffffffff with 32bit device numbers - seems pretty unlikely though. If it's only used internally I'm wondering why bother overloading it.
Still, I'm quite happy with it :-)
I have also added a cast to int64 (which is the internal type used for all device and inode numbers) so that a 32-bit device number with all bits on will still be non-zero in the hard-link processing (as long as a dev_t is unsigned, which it should be). Of course, a 64-bit device number with all bits on will overflow to zero, but the current hash code must reserve one number (out of the 18,446,744,073,709,551,616 values available) to indicate that a hash position is empty, and so choosing an st_dev of 0xffff_ffff_ffff_ffff as the odd-man-out seems like the best choice.
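To make the arithmetic concrete, here is a small model of the offset in Python. It is not rsync's code: the function names and the `EMPTY` constant are invented for illustration, and the masks only simulate fixed-width integer wrap-around.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1
EMPTY = 0  # assumed sentinel: the key value that marks an unused hash slot


def key_with_int64_cast(st_dev):
    """Add 1 to the device number in 64-bit arithmetic (the widened form)."""
    return (st_dev + 1) & MASK64


def key_without_cast(st_dev):
    """Add 1 in 32-bit arithmetic, as would happen with no widening cast."""
    return (st_dev + 1) & MASK32


if __name__ == "__main__":
    for dev in (0, 0x80000002, 0xFFFFFFFF, MASK64):
        key = key_with_int64_cast(dev)
        note = "collides with EMPTY" if key == EMPTY else "ok"
        print(f"st_dev={dev:#x} -> key={key:#x} ({note})")
```

With the widening, device 0 maps to key 1 and a 32-bit device of 0xffffffff maps to 0x100000000, so neither is mistaken for an empty slot. Only a 64-bit device number with every bit set wraps around to the reserved value. Without the cast, 0xffffffff would wrap to 0 in the same way.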
This fix will go out in the next 3.0.4 pre-release (which will probably be the final pre-release for 3.0.4).