There are compelling reasons to use rsync as a backup tool in the following way: back up to a destination with rsync, snapshot the destination filesystem to preserve the current backup, and then save the next backup to the same destination, again using rsync. In this scenario, the data in the backup filesystem is only ever changed by rsync.

If there are many files, a backup run takes a very long time, and most of the I/O is spent reading file metadata to see whether the source differs from the destination:

    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     45.85    0.627125          31     20222           lstat
     30.61    0.418682          20     20222           lgetxattr
     13.54    0.185181          79      2338           getdents64
      3.23    0.044241          22      1982      1982 getxattr
      2.34    0.032001          16      1982           stat
      1.78    0.024293          20      1169           openat
      1.25    0.017112          14      1169           close
      1.05    0.014389          12      1169           fstat
      0.27    0.003737          19       187           brk
      0.04    0.000503          45        11           write
      0.02    0.000306          27        11           read
      0.01    0.000159          14        11           select
    ------ ----------- ----------- --------- --------- ----------------
    100.00    1.367729                 50473      1982 total

If rsync could be told to save all metadata to some "database" in addition to the filesystem, the load on the backup server during subsequent backups of the same source data to the same destination could be much lower. The "database" could be read into RAM, perhaps in chunks if it's very large, and checking metadata for changes would be almost free.

Of course, if data in the actual filesystem is changed by a tool other than rsync (which would keep the "database" updated), the "database" gets out of sync, but that can't be helped.

This could also be an enhancement of "fake super": instead of saving metadata in an xattr for each file separately, all metadata could be saved in a single file, in a location outside the root of the rsync module (or, to support chroot, inside it, but hidden from rsync transfers).

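
To make the idea concrete, here is a minimal Python sketch of such a metadata "database". It is not how rsync works today, and it is not a proposed implementation: the function names (`scan_metadata`, `save_db`, `load_db`, `plan_transfer`), the JSON format, and the choice of size, mtime and mode as the recorded attributes are all assumptions made for illustration. The point is only that, once the destination's metadata sits in one file, deciding what to transfer needs no `lstat` or `lgetxattr` calls on the destination.

```python
import json
import os


def scan_metadata(root):
    """Walk root and return {relative_path: [size, mtime_ns, mode]} for files.

    On a real backup server this scan would run on the source side only.
    """
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            rel = os.path.relpath(full, root)
            # Lists (not tuples) so values compare equal after a JSON round trip.
            entries[rel] = [st.st_size, st.st_mtime_ns, st.st_mode]
    return entries


def save_db(db_path, entries):
    """Write the whole database to a single file, replacing it atomically."""
    tmp_path = f"{db_path}.tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(entries, fh)
    os.replace(tmp_path, db_path)


def load_db(db_path):
    """Read the database into RAM; a missing file means an empty destination."""
    try:
        with open(db_path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def plan_transfer(source_entries, db_entries):
    """Compare in-memory metadata only; never touches the destination tree."""
    to_copy = sorted(
        path for path, meta in source_entries.items()
        if db_entries.get(path) != meta
    )
    to_delete = sorted(path for path in db_entries if path not in source_entries)
    return to_copy, to_delete
```

After a successful run the receiver would call `save_db` with the metadata it just wrote, so the next run starts from `load_db` instead of walking the destination. As noted above, anything that modifies the destination behind rsync's back would leave this file stale.
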
It's completely fine if using this "database" in writable modules implies or requires `max connections = 1` to avoid concurrency/locking issues.
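
For reference, a writable module restricted to a single connection might look roughly like the following `rsyncd.conf` fragment. The module name and paths are hypothetical; `fake super`, `max connections` and `lock file` are existing parameters, while no parameter for the metadata "database" exists yet, so none is shown.

```ini
# Global section: lock file used to enforce "max connections"
lock file = /var/run/rsyncd.lock

# Hypothetical backup module
[backup]
    path = /srv/backup
    read only = no
    fake super = yes
    # Only one client at a time, so the "database" has a single writer
    max connections = 1
```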