We have run into an issue where the rsync daemon's socket listen queue (the accept queue) fills up before the daemon is able to accept() the incoming connection requests. It appears that the rsync source code currently hard-codes this backlog limit to 5.
In our deployment, we push our changes via dsh and sync them to hundreds of Wikipedia servers as quickly as possible. The resulting connection requests pile up in the listen backlog, and multiple servers fail because their requests are ignored by the rsync server in question until they eventually time out. The socket.c source shows this hard limit (presently set to 5). We were able to remove the bottleneck by raising the hard-coded value to 255. We propose making this a configuration variable that defaults to the old value of 5.
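To illustrate the proposal, here is a minimal sketch of a listener whose backlog comes from a setting rather than a constant. It is not the code in rsync's socket.c; the names `DEFAULT_LISTEN_BACKLOG`, `effective_backlog`, and `open_listener` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical name for the old hard-coded value. */
#define DEFAULT_LISTEN_BACKLOG 5

/* Hypothetical helper: use the configured backlog when it is positive,
 * otherwise fall back to the old default of 5. */
static int effective_backlog(int configured)
{
    return configured > 0 ? configured : DEFAULT_LISTEN_BACKLOG;
}

/* Hypothetical helper: open a TCP listener on `port` using the configured
 * backlog. Returns the listening descriptor, or -1 on error. */
static int open_listener(unsigned short port, int configured_backlog)
{
    int backlog = effective_backlog(configured_backlog);
    int one = 1;
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* Allow quick restarts of the daemon on the same port. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    assert(backlog > 0);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Note that on Linux the kernel silently caps the backlog passed to listen() at net.core.somaxconn, so that sysctl may also need raising before a value such as 255 takes full effect.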
I have added the "listen backlog" global daemon parameter to the rsyncd.conf file (defaults to 5).
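For example, assuming a build of rsync that includes this parameter, the value that fixed the reporter's bottleneck could be set like this. Global parameters go at the top of rsyncd.conf, before any module definitions.

```
# rsyncd.conf, global section
# Raise the accept queue from the default of 5.
listen backlog = 255
```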
Something else you can try is to use xinetd as the listener that spawns rsyncd instances instead of having rsync --daemon do that.
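A minimal sketch of such an xinetd service entry follows. The file location and the path to the rsync binary are assumptions and will vary by distribution; the entry also relies on the rsync service (873/tcp) being listed in /etc/services.

```
# /etc/xinetd.d/rsync (assumed location)
service rsync
{
    disable     = no
    socket_type = stream
    wait        = no
    user        = root
    server      = /usr/bin/rsync
    server_args = --daemon
}
```

With this in place xinetd owns the listening socket and starts one rsync process per incoming connection, so the stand-alone rsync --daemon listener should not be running on the same port.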