Bug 9813 - --resume parameter to improve speed of dropped/partial transfers
Status: NEW
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.1.0
Hardware: All All
Importance: P5 enhancement
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
Reported: 2013-04-18 17:40 UTC by Haravikk
Modified: 2013-04-18 17:40 UTC

Description Haravikk 2013-04-18 17:40:06 UTC
Okay, so I know rsync has the --partial flag and --partial-dir for better handling of long file transfers that may be interrupted. However, this does nothing to remove the overhead of resuming a dropped rsync transfer; i.e. if you have 100,000 files and the transfer stalls on the last one, then running the command again will cause the 99,999 successful files to be rechecked before the partial one is actually resumed.


What I'd like to propose is a set of --resume parameters. When set, these will cause rsync (or rather the rsync receiver) to attempt to write a file before closing that describes where in the transfer it got to. If a new rsync transfer begins and such a file exists, then rsync will check to see if the settings are similar (i.e. nothing conflicts) and use the information to resume where it left off. By "resume" I mean that it would ignore all files that are earlier (alphabetically) than the last file being transferred, as well as directories in the tree that likewise came "before" the current file.

The implication of this is that an rsync of a hierarchy that got interrupted half-way through would resume where it was and then finish the final half of the structure; ideal for static transfers (i.e. nothing should have drastically changed between then and now).
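
To make the skipping rule concrete, here is a rough Python sketch of what I have in mind. It is only an illustration, not rsync code: the function names are made up, and the ordering (comparing paths one component at a time) is an assumption rather than rsync's actual file-list sort.

```python
from bisect import bisect_left


def sort_key(path):
    # Compare paths one component at a time (assumed ordering, not
    # necessarily the one rsync uses for its file list).
    return tuple(path.split("/"))


def files_to_process(paths, resume_point):
    # Return the files at or after the resume point, in transfer order.
    # Everything sorting earlier is assumed to be done already.
    ordered = sorted(paths, key=sort_key)
    keys = [sort_key(p) for p in ordered]
    start = bisect_left(keys, sort_key(resume_point))
    return ordered[start:]


def can_skip_dir(dir_path, resume_point):
    # A directory can be skipped whole if it sorts before the resume
    # point and is not one of the resume point's parent directories.
    dir_key = sort_key(dir_path)
    return dir_key < sort_key(resume_point)[:len(dir_key)]
```

So with a resume point of "b/1", the whole of directory "a" is skipped without being scanned, while "b" still has to be entered because it contains the file being resumed.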

In order to control this there would be some other parameters:

--resume    enables resuming of a previous transfer
--resume-nocheck   resumes the previous transfer even if the parameters don't match (the resume will still fail if the "current" file is not matched by the new parameters)
--resume-recheck   resumes as normal, but rechecks all "earlier" files upon completion, e.g. if files 1-100 are skipped when resuming, then they will still be rechecked, but only at the end of the transfer
--resume-threshold   if the previous transfer was stopped more than X seconds ago then it will not be resumed; a crude safeguard against resuming from a "stale" starting point
--resume-file   specifies the location of the resume file (defaults to a hidden file in the same location as the source and target)
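
As a rough idea of how the resume file and the checks above could fit together, here is a hedged Python sketch. Everything in it is hypothetical: the JSON layout, the field names and the function names are my own, and "settings are similar" is simplified to "the option lists are identical". The extra --resume-nocheck rule (failing when the "current" file is not matched by the new parameters) is not modelled.

```python
import json
import time


def save_state(path, current_file, options, now=None):
    # Record where the transfer got to (hypothetical file format).
    state = {
        "current_file": current_file,
        "options": sorted(options),
        "stopped_at": time.time() if now is None else now,
    }
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(state, fh)


def load_state(path):
    # A missing or unreadable resume file just means "start from scratch".
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except (OSError, ValueError):
        return None


def resume_point(state, options, now, threshold=None, nocheck=False):
    # Return the file to resume from, or None for a full run.
    if state is None:
        return None
    # --resume-threshold: ignore state older than `threshold` seconds.
    if threshold is not None and now - state["stopped_at"] > threshold:
        return None
    # --resume: options must match unless --resume-nocheck is given.
    if not nocheck and sorted(options) != state["options"]:
        return None
    return state["current_file"]
```

A None result simply means a normal full run, so a stale or mismatched resume file never makes things worse than they are today.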


The basic aim is to allow rsync to jump to a point in the previous transfer which is known to have changed/new files, so that it can immediately continue from that point, rather than having to process everything up to that point first. Both ends of the transfer will attempt to store a resume file, and if either side has one then they will use this to skip ahead in the transfer (if possible).