The Samba-Bugzilla – Bug 10338
Start deletion from the top of the hierarchy
Last modified: 2015-05-24 18:50:24 UTC
I make my production backups with Rsync.
Here is an example of my backup tree on the destination server:
At the end of the backup process, I upload a logfile to the backup directory and delete the oldest backups.
For this, I use an include/exclude file, for example this inclexcl.txt:
I also use this empty directory, which holds only my logfile:
And I run this rsync command:
rsync -a --delete-after --exclude-from=inclexcl.txt /tmp/path/ server::backups/
This works: my logfile is uploaded and the oldest backups are deleted (in this example, all backups from March).
However, what I can see in the daemon's log is that Rsync walks from the top down through all files of the backup directories to delete, and deletes them one by one.
Rsync's behavior is to build a list of the files and check whether any are "protected" from deletion before it removes each file.
As each of my backup directories contains hundreds of thousands of files, deletion takes a very long time.
It would be useful to be able to skip this check and ask Rsync to start deletion from the top of the hierarchy.
Perhaps by adding a new option (--delete-from-top)?
It would certainly speed up the deletion of huge directories, for instance common daily backup directories (made for example with --link-dest), by avoiding the Rsync overhead.
Thank you very much for this improvement!
The files have to be deleted one by one anyway, so I'm not sure how much this could be improved.
Have you measured how long a simple rm -r $TOPDIR takes compared to rsync? Make sure to flush any disk buffers / cache before running your tests (echo 3 > /proc/sys/vm/drop_caches, if you're running Linux).
Paul, yes, you're right, files have to be deleted one by one, but perhaps the Rsync overhead could be skipped.
I ran some tests: I created a directory with 300k files in it.
I deleted it with Rsync, and with the rm command.
I repeated this test several times.
On my server, it took about 20 seconds with rm and about 40 seconds with Rsync.
Of course this is just a test example. Each of the top directories of my production backups is much bigger, so the impact is larger there.
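For anyone who wants to repeat this comparison, a minimal sketch of such a test could look like the script below. It is not the exact procedure used above: it assumes a local rsync run instead of the daemon setup, it omits the exclude file, and N, WORK, DROP_CACHES, make_tree, flush_caches and timed are names made up for illustration. The cache flush is opt-in because it needs root on Linux.

```shell
#!/bin/sh
# Sketch only: N, WORK, DROP_CACHES and the helper names are illustrative assumptions.
N="${N:-1000}"                 # raise to 300000 to match the test described above
WORK="$(mktemp -d)"            # scratch area; should live on the filesystem under test
trap 'rm -rf "$WORK"' EXIT     # remove the scratch area when the script ends

make_tree() {
    # Create directory $1 and fill it with N empty files.
    mkdir -p "$1"
    i=0
    while [ "$i" -lt "$N" ]; do
        : > "$1/f$i"
        i=$((i + 1))
    done
}

flush_caches() {
    # Opt-in with DROP_CACHES=1; Linux only and requires root.
    if [ "${DROP_CACHES:-0}" = 1 ]; then
        sync
        echo 3 > /proc/sys/vm/drop_caches
    fi
}

timed() {
    # Run the command that follows the label and print its wall-clock seconds.
    label="$1"; shift
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    echo "$label: $((end - start))s"
}

# Case 1: plain rm -r on the directory.
make_tree "$WORK/rm_case/victim"
flush_caches
timed "rm -r" rm -r "$WORK/rm_case/victim"

# Case 2: rsync removes an equivalent directory because the source does not contain it.
if command -v rsync >/dev/null 2>&1; then
    mkdir -p "$WORK/src"
    : > "$WORK/src/backup.log"
    make_tree "$WORK/dst/victim"
    flush_caches
    timed "rsync --delete-after" rsync -a --delete-after "$WORK/src/" "$WORK/dst/"
else
    echo "rsync not found; skipping the rsync case"
fi
```

Run it as root with N=300000 DROP_CACHES=1 to approximate the conditions discussed here. The one-second resolution of date +%s is coarse, but it should be enough at that scale.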
Any news regarding this enhancement request?
Thank you very much!