Bug 10380 - Non-Nested Folder Optimisation
Summary: Non-Nested Folder Optimisation
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.1.0
Hardware: All All
Importance: P5 enhancement
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
Depends on:
Reported: 2014-01-14 17:22 UTC by Haravikk
Modified: 2014-01-20 14:03 UTC
Description Haravikk 2014-01-14 17:22:12 UTC
One handy feature of most (all?) unix file systems is that if the contents of a folder are changed, then the folder's modified time will be updated as well, which presents an opportunity for optimised comparisons.

Quite simply, if a folder's modified time is the same as the corresponding folder in the destination, then no comparison of its contents needs to be performed.
At least, that is true if the folder doesn't contain any nested folders, as modifications to these do not affect the parent folder's modified time.

What I would like to propose is that when rsync begins generating a file list for a destination directory, it should attempt to mark any directories that do not contain nested directories. When it comes time to compare such a directory, the comparison can be done using only the modified times (if rsync is operating in that mode), and file lists only need to be compared if these times differ. Provided rsync is able to generate enough of the destination file list in advance, this should allow many unchanged folders to be skipped in their entirety, without having to delve further into their contents.
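
As a rough illustration of the check proposed above, here is a minimal Python sketch. It is not rsync code: the helper names `is_leaf_dir` and `can_skip` are hypothetical, and requiring both the source and the destination to be free of subdirectories is an assumption made for the example. As the comments further down this report point out, directory mtimes do not reflect in-place file edits on most file systems, so this shows only the shape of the idea, not a safe optimisation.

```python
import os


def is_leaf_dir(path):
    """Return True if `path` contains no subdirectories (symlinks are not followed)."""
    with os.scandir(path) as entries:
        return not any(e.is_dir(follow_symlinks=False) for e in entries)


def can_skip(src, dst):
    """Hypothetical check: True if the proposal would skip comparing src against dst.

    Assumption for this sketch: both directories must be free of nested
    directories, and their modification times must match exactly.
    """
    if not os.path.isdir(dst):
        return False  # nothing to compare against, a full transfer is needed
    if not (is_leaf_dir(src) and is_leaf_dir(dst)):
        return False  # nested directories do not update the parent's mtime
    return os.stat(src).st_mtime_ns == os.stat(dst).st_mtime_ns
```

A caller would fall back to the normal per-file comparison whenever `can_skip` returns False.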

This feature could be enhanced by the use of a metadata file (see bug 10379) by storing a flag in a destination folder if it contains no nested directories. This way, so long as the metadata file is valid, there is no need to process the directory's file-list before an optimised comparison is performed.
Comment 1 Wayne Davison 2014-01-19 22:10:43 UTC
Changed files don't affect a directory's mtime, only new files. Overall it's not a very reliable way to try to optimize file transfers.
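
The behaviour described in this comment can be checked locally with a small Python sketch. The function name `dir_mtime_changes` is made up for the example, and the fixed timestamp is an arbitrary value chosen so that the result does not depend on clock granularity. On a typical POSIX file system I would expect the in-place edit to leave the directory mtime alone and the file creation to change it, but results may differ on other file systems.

```python
import os
import tempfile


def dir_mtime_changes():
    """Return (changed_after_edit, changed_after_create) for a scratch directory."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "existing.txt")
        with open(path, "w") as f:
            f.write("v1")

        # Pin the directory mtime to a fixed past value (in nanoseconds).
        old = 1_000_000_000 * 10**9
        os.utime(d, ns=(old, old))
        baseline = os.stat(d).st_mtime_ns

        # Overwrite the file's contents in place: no directory entry changes.
        with open(path, "r+") as f:
            f.write("v2")
        changed_after_edit = os.stat(d).st_mtime_ns != baseline

        # Create a new file: this adds a directory entry.
        with open(os.path.join(d, "new.txt"), "w") as f:
            f.write("x")
        changed_after_create = os.stat(d).st_mtime_ns != baseline

        return changed_after_edit, changed_after_create
```

Deleting or renaming an entry also modifies the directory itself, so those operations update its mtime in the same way as creating a file.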
Comment 2 Haravikk 2014-01-20 12:40:05 UTC
Are you sure? It seems to update the folder mtime on HFS+ at least, but if it doesn't work on other file systems then yeah you're right, maybe not worth it.
Comment 3 Kevin Korb 2014-01-20 14:03:40 UTC
That is an HFS+ "feature".  It is why Apple's Time Machine backup system works faster than rsync.  They do utilize this optimization but it would only work on HFS+.