On kernel.org, we have an upload system which consists of one server rsyncing to another, using an rsync daemon with a very long, random password. Since upgrading the server to an x86-64 machine running Fedora Core 3 and rsync 2.6.3 (also tried rsync 2.6.4), we have been getting the following errors. The client machine is an i386, tested with rsync 2.5.8 from RHEL3 as well as rsync 2.6.3 and 2.6.4:

[Note: this message is from rsync-2.6.4-2 from Fedora Development.]

    @ERROR: auth failed on module pubupload
    rsync: connection unexpectedly closed (0 bytes received so far) [sender]
    rsync error: error in rsync protocol data stream (code 12) at io.c(420)

There are a total of four upload modules, of which "pubupload" is the largest one by several orders of magnitude. These errors *only* happen on the "pubupload" module, and even then not consistently (fortunately).

The command used on the client host is:

    rsync -rlHtSpq --delete --timeout=3600 --exclude /mirrors/ --exclude lost+found/ --exclude '.#*' /pub/ korg@zeus2.kernel.org::pubupload/

Please contact me directly for a copy of the rsyncd.conf file, since I do not want to post it publicly.
My first thought was that perhaps the combined length of the password and the challenge string might be 64 characters, which is an MD4 length that used to have a problem in older rsync versions. However, since the password exchange happens after we've negotiated a protocol_version, this should always be handled in a compatible manner.

Here's what I would recommend: edit the code in authenticate.c to add some fprintf(stderr, ...) calls to the auth_server() function that will mention what data is being received and compared. If you output the "line" read from the client after the read_line() call (it needs a newline):

    fprintf(stderr, "%s\n", line);

That will contain the username, a space, and the MD4 hash of the challenge string combined with the password from the client. Then, output the pass2 variable after the generate_hash() call:

    fprintf(stderr, "%s\n", pass2);

That value should match the MD4 hash from the "line" output. You'll need to stop the daemon and run the freshly-compiled debug version using --no-detach to see the messages on stderr:

    ./rsync --daemon --no-detach

That should help you to figure out where the failure is occurring in the authorization code. You can feel free to email me with what you discover (or summarize to this bug-report -- whatever you prefer).
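To make the comparison described above easier to read on stderr, here is a minimal standalone sketch. It is not rsync code: `split_auth_line()` and `debug_compare()` are hypothetical helpers, and the only thing taken from the discussion is that the client's response line holds the username, a space, and the hash. The values are printed inside brackets so that a stray character or a misspelled username stands out.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of rsync): split a "username hash" response
 * line in place at the first space. Returns 0 on success, or -1 if the line
 * contains no space separator. */
int split_auth_line(char *line, char **user, char **hash)
{
    char *sp = strchr(line, ' ');
    if (sp == NULL)
        return -1;
    *sp = '\0';        /* terminate the username */
    *user = line;
    *hash = sp + 1;    /* the hash starts right after the space */
    return 0;
}

/* Hypothetical helper: print the username and both hashes to stderr, then
 * return 1 if the client's hash equals the expected one, 0 otherwise. */
int debug_compare(const char *user, const char *client_hash,
                  const char *expected_hash)
{
    int ok = strcmp(client_hash, expected_hash) == 0;
    fprintf(stderr, "user=[%s] client=[%s] expected=[%s] %s\n",
            user, client_hash, expected_hash, ok ? "match" : "MISMATCH");
    return ok;
}
```

In a debug build, the two helpers would be called with a copy of the received line and the locally generated hash; where exactly to call them inside auth_server() is left as an assumption.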
I'm embarrassed to say this turns out to be due to user error. In particular, I had a typo in the *username* -- not in the password -- in one of several places in the script. It might be sensible to include the (failed) username in the error/log message for authentication failures.
Serendipitously, you'll be glad to know that I just finished checking in some changes to the authorization code that make it log the reason why the authorization failed (e.g. unauthorized user, missing secret for user, password mismatch).
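As an illustration of the three failure reasons listed above, here is a rough standalone sketch of how a check could tell them apart and name the user in the message. It is not the code that was checked in: the enum, the struct, the function names, and the message format are all hypothetical, the hash is assumed to be precomputed, and a plain string comparison stands in for the real MD4 exchange.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes mirroring the reasons named in the comment. */
enum auth_result {
    AUTH_OK,
    AUTH_UNAUTHORIZED_USER,
    AUTH_MISSING_SECRET,
    AUTH_PASSWORD_MISMATCH
};

/* Hypothetical secrets entry: a user and the hash expected from that user. */
struct secret {
    const char *user;
    const char *expected_hash;
};

const char *auth_result_text(enum auth_result r)
{
    switch (r) {
    case AUTH_OK:                return "ok";
    case AUTH_UNAUTHORIZED_USER: return "unauthorized user";
    case AUTH_MISSING_SECRET:    return "missing secret for user";
    case AUTH_PASSWORD_MISMATCH: return "password mismatch";
    }
    return "unknown";
}

/* Check the user against the allowed list first, then look up the secret,
 * then compare hashes, so each failure maps to exactly one reason. */
enum auth_result check_auth(const char *user, const char *client_hash,
                            const char *const *allowed, size_t n_allowed,
                            const struct secret *secrets, size_t n_secrets)
{
    size_t i;
    int user_allowed = 0;

    for (i = 0; i < n_allowed; i++) {
        if (strcmp(allowed[i], user) == 0) {
            user_allowed = 1;
            break;
        }
    }
    if (!user_allowed)
        return AUTH_UNAUTHORIZED_USER;

    for (i = 0; i < n_secrets; i++) {
        if (strcmp(secrets[i].user, user) == 0)
            return strcmp(secrets[i].expected_hash, client_hash) == 0
                       ? AUTH_OK : AUTH_PASSWORD_MISMATCH;
    }
    return AUTH_MISSING_SECRET;
}

/* Illustrative message format (an assumption, not rsync's actual wording). */
void format_auth_failure(char *buf, size_t len, const char *module,
                         const char *user, enum auth_result r)
{
    snprintf(buf, len, "auth failed on module %s for user [%s]: %s",
             module, user, auth_result_text(r));
}
```

With a message of this shape, a mistyped username such as a hypothetical "krog" in place of "korg" would show up as an unauthorized user rather than as a generic auth failure.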