The Samba-Bugzilla – Bug 3461
rsync is not atomic when temp-dir is on different device
Last modified: 2006-03-12 02:56:59 UTC
I have a problem when rsyncing files with --temp-dir on a different device than the destination.
# rsync --temp-dir /disk2 remote::file.dat file.dat
When the transfer finishes, the temporary file on /disk2 is copied directly to /disk1/file.dat, so /disk1/file.dat is truncated and then gradually filled. When temp-dir is on the same device, the new file is created with a temporary name and then renamed, so /disk1/file.dat is never garbled.
The problem could be alleviated by letting robust_rename() in util.c optionally copy the file to a temporary name on the target device, and then rename it.
This is quite a problem for us: we want the temporary file on a different device for performance. When it's on the same device, rsync generates heavy read and heavy write traffic on the same disk (which is otherwise quite idle), and performance is much poorer than with the extra copy.
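To make the proposal concrete, here is a minimal Python sketch of the idea (not rsync's C code): try a plain rename first, and if that fails with EXDEV, copy to a temporary name next to the destination and rename from there. The function names are made up for illustration, and file permissions and ownership are not handled.

```python
import errno
import os
import shutil
import tempfile


def copy_then_rename(src, dst):
    """Copy src to a temporary name next to dst, then rename it over dst."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # The staging file lives in the destination directory, so the final
    # rename stays within one filesystem. mkstemp creates it with mode 0600;
    # restoring the real permissions is left out of this sketch.
    fd, staged = tempfile.mkstemp(prefix=".tmp.", dir=dst_dir)
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
            out.flush()
            os.fsync(out.fileno())
        # Readers see either the old or the new file, never a partial one.
        os.replace(staged, dst)
    except BaseException:
        try:
            os.unlink(staged)
        except FileNotFoundError:
            pass
        raise
    os.unlink(src)


def rename_or_copy(src, dst):
    """Hypothetical stand-in for the behaviour proposed for robust_rename()."""
    try:
        os.rename(src, dst)
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise
        # src and dst are on different filesystems: fall back to the copy.
        copy_then_rename(src, dst)
```

The cost is that the destination filesystem briefly needs room for both the old and the new copy of the file.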
I don't see how using a --temp-dir on a different device could make the transfer faster, if indeed it does. At some point, all of the data of file.dat must go from memory to disk 1. Using a --temp-dir on a different device, with or without your proposed atomicity fix, does not avoid this bottleneck; after the additional round trip to and from disk 2, the data must be written to disk 1 as before.
This bug brings up an interesting point. I wasn't aware that robust_rename copied files between filesystems when necessary, and I don't think it should. Copying data into an existing destination file has the same potentially undesirable characteristics as --inplace, not least of which is non-atomicity, but the user is never warned.
If the temp dir is on a different filesystem from the eventual location of a destination file, what should rsync do? Ignoring the temp dir and using a temporary file next to the destination could be hazardous because some users use a secure temp dir to avoid races in an insecure destination directory. Using two temporary files, one in the temp dir and one next to the destination, is strictly worse than using a single file next to the destination, unless the disk performance is anomalous like yours.
The current behavior of copying into the existing destination file should only be used if the user is aware of the consequences. To specify that she is, she could give --inplace in addition to --temp-dir=XXX; the sole effect of --inplace would be to allow this behavior. If --temp-dir is given without --inplace and this situation arises, I think rsync should produce an error, skip the file, and say "some files could not be transferred".
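The policy suggested above could look roughly like the following Python sketch. It is only an illustration of the decision, not rsync code; the function names and the returned labels are invented here. Comparing st_dev is a convenient pre-check, but the authoritative signal is still rename() failing with EXDEV.

```python
import os


def same_device(temp_dir, dest_path):
    """Return True if temp_dir and the destination's directory share a device."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    return os.stat(temp_dir).st_dev == os.stat(dest_dir).st_dev


def finish_action(on_same_device, inplace_given):
    """Pick what to do with a finished temp file (labels are illustrative)."""
    if on_same_device:
        return "rename"            # atomic, nothing to warn about
    if inplace_given:
        return "copy-in-place"     # user accepted a non-atomic update
    return "skip-with-error"       # report that the file was not transferred
```

For example, finish_action(same_device("/disk2", "/disk1/file.dat"), inplace_given=False) would give "skip-with-error" when the two paths are on different devices.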
I was a bit unclear: it's not just a different device, it's a different physical disk.
For a dataset of about 45GB we see an rsync time of about 2 hrs. When we use a temp-dir on another disk it drops to about 30 minutes.
We have measured it and traced the performance problem down to the following:
When the temporary file is on the same disk as the target, rsync must read the old file and the network stream while also writing the temporary file to the same disk it is reading from. Both the read and write speed of the disk drop substantially compared to separate disks, probably because the disk head has to move between reads and writes and cannot do so infinitely fast.
With separate disks, there's a single-stream read from one disk and a single-stream write to the other disk. When copying afterwards, it's still single-streams to each disk, and we get close to 40MB/s. When we do intertwined reads and writes to the same disk, we get far less, in some cases only 5MB/s. Even with the extra copy, the dual-disk solution runs faster, the only problem being non-atomicity.
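As a sanity check on these figures, the reported run times can be turned into effective rates. This is plain arithmetic in Python, assuming "GB" means 10^9 bytes and "MB" means 10^6 bytes; the rates are averages over the whole dataset, not measured disk throughput.

```python
DATASET_BYTES = 45 * 10**9  # assumption: 45 GB with GB = 10^9 bytes


def rate_mb_per_s(total_bytes, seconds):
    """Average rate in MB/s (MB = 10^6 bytes) for a run of the given length."""
    return total_bytes / seconds / 10**6


same_disk = rate_mb_per_s(DATASET_BYTES, 2 * 3600)  # 2 hours   -> 6.25 MB/s
two_disks = rate_mb_per_s(DATASET_BYTES, 30 * 60)   # 30 minutes -> 25.0 MB/s
speedup = two_disks / same_disk                     # 4.0

# Upper bound on the extra copy: every byte copied once at 40 MB/s.
extra_copy_seconds = DATASET_BYTES / (40 * 10**6)   # 1125.0 s, under 19 minutes
```

The 6.25 MB/s average for the single-disk case is in the same range as the 5 MB/s seen for interleaved reads and writes, and even a full extra copy at 40 MB/s fits inside the 30-minute run.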
Hmmm. It's true that rsync interleaves memory-mapped reads from the basis file and writes to the temporary file. A smart disk scheduler should avoid making the head dart back and forth by saving up all the writes to be done later as a single stream. However, your data set is probably so large that the disk scheduler is reluctant to keep it all in write-behind cache. You might be able to get the same improved performance by tuning disk parameters. If you find a way to set the write-behind cache size on disk 1, you could set it to several gigabytes and then allocate a several-gigabyte swap partition on disk 2, in which case the fancy maneuver involving disk 2 would be done at the kernel level instead of by rsync.
Some other options to consider:
* Hack your own copy of rsync to use a second temporary file.
* Create a batch file on disk 2 with --only-write-batch. Then apply it with --read-batch. That may or may not help.
* Run two passes of rsync. Consider this:
rsync -rt remote::files/ /disk2/tmp/ --compare-dest=/disk1/files/
rsync -a remote::files/ /disk1/files/ --copy-dest=/disk2/tmp/
I know of some users who have used --temp-dir in the past to tell rsync never to create a temporary file in the dest-dir itself, so I think that the user should have to ask for a temp file to ever be placed in a destination directory when --temp-dir is in effect.
One way to do this that already works is to use the --delay-updates option. This makes all the updates happen in rapid succession at the end of the transfer, so any copying that is done from the --temp-dir is done into the active partial-file subdir (default: .~tmp~).
Another possibility is to use the --partial-dir option that can already be used to tell rsync that a partial file should be placed in a subdir of the destination dir. If we enhance robust_rename() a little, it could be made to perform the copy to the file inside the partial-dir, and the code could rename it from there to the parent dir.
Created attachment 1712
Make the --partial-dir get used when copying
As mentioned, if --temp-dir is combined with a --partial-dir=.~sub~, rsync will not copy a file over the destination file directly, but instead copy the file into the partial-dir, and then rename it over the destination.
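The behaviour described here can be sketched in Python as follows. This is an illustration of the copy-then-rename sequence only, not the attached patch; the function name is invented, and it assumes partial_dir is a relative name such as .~tmp~.

```python
import os
import shutil


def copy_via_partial_dir(tmp_file, dest_path, partial_dir=".~tmp~"):
    """Copy tmp_file into dest_dir/partial_dir, then rename it over dest_path."""
    dest_dir, name = os.path.split(os.path.abspath(dest_path))
    stage_dir = os.path.join(dest_dir, partial_dir)
    os.makedirs(stage_dir, exist_ok=True)
    staged = os.path.join(stage_dir, name)

    # The slow, non-atomic part happens on the staged copy, so the
    # destination file stays intact while the data is being written.
    shutil.copyfile(tmp_file, staged)

    # stage_dir sits inside dest_dir, hence on the same filesystem in the
    # usual case, so this rename replaces the destination atomically.
    os.replace(staged, dest_path)
    os.unlink(tmp_file)
```

This sketch leaves the (now empty) staging directory in place; cleaning it up is a separate decision.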
The traditional use of --temp-dir (I think) is for situations when the receiving partition does not have enough free space to accommodate a temporary copy of a large file. Thus, --temp-dir is usually on a different partition, and the ensuing copy, by necessity, truncates the target file before copying the updated content.
I would suggest supplementing the --temp-dir documentation with an explanation of the traditional usage and the above behavior in the case when --temp-dir is on a different partition. When the receiver is an active system with live processes using the files being rsynced, this is an especially important consideration.
(In reply to comment #6)
> The traditional use of --temp-dir (I think) is for situations when the
> receiving partition does not have enough free space to accommodate a temporary
> copy of a large file.
Quite so. Even removing the destination file before doing a copy to a temp-file name in the destination dir might duplicate the disk space if someone had the destination file open.
A possible improvement for this option would be to set a minimum file-size that should use the temp-dir setting. This way rsync could update smaller files atomically, and only resort to using the temp-dir on another drive for really large files.
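A sketch of how such a threshold might pick the staging location, in Python. No such rsync option exists in this thread; the min_temp_dir_size parameter and the function name are hypothetical.

```python
import os


def staging_dir(dest_path, file_size, temp_dir, min_temp_dir_size):
    """Choose where the temporary copy of a file should be written."""
    if file_size >= min_temp_dir_size:
        # Large file: use the temp dir on the other drive to save space.
        return temp_dir
    # Small file: stage next to the destination so the final rename is atomic.
    return os.path.dirname(os.path.abspath(dest_path))
```

With a 1 GiB threshold, a 2 GiB file bound for /disk1/files/big.dat would be staged in the temp dir, while a 4 KiB file would be staged in /disk1/files.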
Re: the patch in comment #5: it doesn't work for a space-conscious user who has set an absolute --partial-dir path (such as using the same directory as the --temp-dir setting), so the patch needs extra logic to ensure that the copy-into-a-partial-dir logic only triggers for relative paths (which is a simple, one- or two-line change).
FYI, I checked in some doc improvements for the --temp-dir option.
I've checked-in an improved version of the change in comment #5 that allows someone to use --temp-dir in combination with a relative --partial-dir to indicate that space is not that tight on the destination drive, so any copies from the temp-dir are made atomically through the partial-dir.