I run a mirror server for various projects, and rsync generating directory listings for incoming connections consumes a good deal of the available resources (mainly disk I/O). Increasing the directory and inode cache is a no-go with ~3 million inodes/directory entries.
What I'd like to see is an option for rsync, when run as a client and updating local files, to generate a directory listing file which an rsync daemon can parse and serve out instead of having to generate the directory listing itself.
There would undoubtedly be time windows where the data in the generated directory listing is out of date (when a child process is updating the directory tree in question), but that's also true when the directory gets modified after the daemon has sent the directory listing. There is an additional risk of the dirlisting file getting corrupted and/or outdated, but I consider that negligible compared to the speed gain.
Rsync run as client:
rsync -a --delete-after --generate-dirlist=/home/foo/rsync-foo.lst foo.org::foo /home/foo/rsyncdata
Rsync run as daemon (module section in rsyncd.conf):
path = /home/foo/rsyncdata/
list = yes
dirlisting file = /home/foo/rsync-foo.lst
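To make the idea a bit more concrete, here is a rough sketch in Python of what the client-side listing generation could look like. This is not rsync code: the function name generate_dirlist and the file format (one tab-separated "path, size, mtime" line per file) are assumptions of mine for illustration only, and file names containing tabs or newlines would need escaping in a real format. The listing is written to a temporary file and renamed into place, so a daemon reading it never sees a half-written file.

```python
import os
import tempfile


def generate_dirlist(root, listfile):
    """Walk `root` and write one 'relpath<TAB>size<TAB>mtime' line per file.

    Hypothetical format, not anything rsync defines. Returns the number
    of entries written.
    """
    root = os.path.abspath(root)
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # do not follow symlinks
            rel = os.path.relpath(full, root)
            entries.append((rel, st.st_size, int(st.st_mtime)))

    # Write to a temporary file in the target directory, then rename it
    # into place so readers never see a partially written listing.
    target_dir = os.path.dirname(os.path.abspath(listfile))
    fd, tmp = tempfile.mkstemp(dir=target_dir, prefix=".dirlist-")
    with os.fdopen(fd, "w", encoding="utf-8") as out:
        for rel, size, mtime in entries:
            out.write(f"{rel}\t{size}\t{mtime}\n")
    os.replace(tmp, listfile)
    return len(entries)
```

With the paths from the example above, the equivalent call would be generate_dirlist("/home/foo/rsyncdata", "/home/foo/rsync-foo.lst"), run once after the client has finished updating the tree.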
Please give this request some serious thought. It might not be "nice" or "the correct way" to tackle this problem, and there are probably not many setups where this feature makes sense, but it shouldn't be too hard to implement and would help many mirror admins around the world, especially those who carry many projects (including rsync and samba :P). If you need more opinions/voices on this, I can run this feature request past the mirror-admin mailing lists of a few larger projects.
On second thought, we can probably reduce the chance of corrupted dirlist files even further by marking them as "bad" (or just deleting them) when rsync notices that a file requested by a client doesn't exist anymore (or has changed since the dirlist file was generated).
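The "mark as bad" idea could look roughly like the Python sketch below. Again, this is only an illustration under my own assumptions: the function names, the idea of comparing the recorded size and mtime against a fresh lstat(), and the ".bad" suffix are all made up here and are not rsync behaviour. The check costs a single stat call per requested file, which is far cheaper than walking the whole tree.

```python
import os


def entry_is_current(root, rel, size, mtime):
    """Return True if the file still matches what the listing recorded."""
    try:
        st = os.lstat(os.path.join(root, rel))
    except FileNotFoundError:
        return False  # file vanished since the listing was generated
    return st.st_size == size and int(st.st_mtime) == mtime


def check_or_invalidate(root, listfile, rel, size, mtime):
    """Verify one requested entry; mark the listing as bad on a mismatch.

    Marking is done by renaming the listing to '<listfile>.bad'
    (a hypothetical convention), so the daemon would fall back to
    generating the directory listing itself.
    """
    if entry_is_current(root, rel, size, mtime):
        return True
    try:
        os.replace(listfile, listfile + ".bad")
    except FileNotFoundError:
        pass  # listing was already invalidated or removed
    return False
```

Deleting the listing instead of renaming it would work just as well; keeping the ".bad" copy merely makes it easier to see afterwards why the daemon stopped using it.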