On kernel.org, we have an upload system that consists of one server rsyncing to
another, using an rsync daemon with a very long, random password.
Since upgrading the server to an x86-64 machine running Fedora Core 3 and rsync
2.6.3 (also tried rsync 2.6.4) we have been getting the following errors. The
client machine is an i386 tested with rsync 2.5.8 from RHEL3 as well as rsync
2.6.3 and 2.6.4:
[Note: this message is from rsync-2.6.4-2 from Fedora Development.]
@ERROR: auth failed on module pubupload
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(420)
There are a total of four upload modules, of which "pubupload" is the largest
one by several orders of magnitude. These errors *only* happen on the
"pubupload" module, and even then not consistently (fortunately).
The command used on the client host is:
rsync -rlHtSpq --delete --timeout=3600 --exclude /mirrors/ --exclude lost+found/
--exclude '.#*' /pub/ firstname.lastname@example.org::pubupload/
Please contact me directly for a copy of the rsyncd.conf file, since I do not
want to post it publicly.
My first thought was that perhaps the combined length of the password and the
challenge string might be 64 characters, which is an MD4 length that used to
have a problem in older rsync versions. However, since the password exchange
happens after we've negotiated a protocol_version, this should always be handled
in a compatible manner.
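For anyone who wants to rule that length theory out anyway, here is a minimal sketch of the check in C. It only assumes that the hashed input is the password plus the challenge string, as described above; the function name hits_md4_block_length() and the MD4_BLOCK_LEN macro are made up for this illustration and are not part of rsync.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* MD4 processes its input in 64-byte blocks. */
#define MD4_BLOCK_LEN 64

/*
 * Hypothetical helper (not an rsync function): returns 1 when the
 * password and the challenge together are exactly 64 bytes long,
 * which is the case suspected above, and 0 otherwise.
 */
int hits_md4_block_length(const char *password, const char *challenge)
{
    assert(password != NULL && challenge != NULL);
    return strlen(password) + strlen(challenge) == MD4_BLOCK_LEN;
}
```

Calling it with the real secret and a challenge captured from a failing run would show whether the 64-byte case is even in play.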
Here's what I would recommend: edit the code in authenticate.c to add some
fprintf(stderr, ...) calls to the auth_server() function that report what
data is being received and compared. First, output the "line" read from the
client right after the read_line() call (the output needs a newline added):
fprintf(stderr, "%s\n", line);
That will contain the username, a space, and the MD4 hash of the challenge
string combined with the password from the client.
Then, output the pass2 variable after the generate_hash() call:
fprintf(stderr, "%s\n", pass2);
That value should match the MD4 hash from the "line" output. You'll need to
stop the daemon and run the freshly-compiled debug version using --no-detach to
see the messages on stderr:
./rsync --daemon --no-detach
That should help you figure out where the failure is occurring in the
authorization code. Feel free to email me with what you discover (or
summarize it in this bug report -- whatever you prefer).
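To make it clearer what the two debug lines should be compared for, here is a rough standalone sketch of the check in C. It is not the code in authenticate.c: it only assumes the "username, a space, hash" layout described above, and the names check_auth_line(), auth_check_name() and the enum values are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch; rsync does not define these. */
enum auth_check { AUTH_OK, AUTH_MALFORMED, AUTH_BAD_USER, AUTH_BAD_HASH };

/*
 * Split a "username hash" line at the first space and compare each half
 * with the value the server expects. The username is checked first, so a
 * typo in the username is reported as such rather than as a bad hash.
 */
enum auth_check check_auth_line(const char *line,
                                const char *expected_user,
                                const char *expected_hash)
{
    const char *space;
    size_t user_len;

    assert(line != NULL && expected_user != NULL && expected_hash != NULL);

    space = strchr(line, ' ');
    if (space == NULL)
        return AUTH_MALFORMED;          /* no separator between the fields */

    user_len = (size_t)(space - line);
    if (strlen(expected_user) != user_len ||
        strncmp(line, expected_user, user_len) != 0)
        return AUTH_BAD_USER;           /* username differs */

    if (strcmp(space + 1, expected_hash) != 0)
        return AUTH_BAD_HASH;           /* hash differs from the pass2 value */

    return AUTH_OK;
}

/* Map a result code to a short description suitable for a debug message. */
const char *auth_check_name(enum auth_check result)
{
    switch (result) {
    case AUTH_OK:        return "ok";
    case AUTH_MALFORMED: return "malformed line";
    case AUTH_BAD_USER:  return "unknown username";
    case AUTH_BAD_HASH:  return "hash mismatch";
    }
    return "unknown result";
}
```

Feeding it the "line" output together with the username you expect and the pass2 output tells you which half of the exchange disagrees; if it reports the username, the password is not the problem.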
I'm embarrassed to say this turns out to be due to user error. In particular, I
had a typo in the *username* -- not in the password -- in one of several places
in the script. Perhaps it might be a sensible idea to add the (failed) username
into the error/log message for authentication failures.
Serendipitously, you'll be glad to know that I just finished checking in some
changes to the authorization code that make it log the reason why the
authorization failed (e.g. unauthorized user, missing secret for user, password