Bug 9882 - Incorrect exit code when sender over SSH is killed with SIGTERM
Status: RESOLVED FIXED
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.1.0
Hardware: All Linux
Importance: P5 normal
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
Reported: 2013-05-14 08:57 UTC by Ivan Zahariev
Modified: 2013-05-26 23:19 UTC

Description Ivan Zahariev 2013-05-14 08:57:53 UTC
For the tests I've compiled "rsync" from the latest GIT repository sources.
This happens only when SRC is an SSH remote location. Example:

famzah@vbox:~$ rsync famzah@127.0.0.1:/boot/initrd.img-3.2.0-36-generic-pae /tmp/test ; echo "Exit code: $?"
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(610) [sender=3.1.0dev]
Exit code: 0

Here is what the process list looks like:

famzah@vbox:~$ ps xfww -o pid,user,tty,command
  PID USER     TT       COMMAND
 5076 famzah   ?        sshd: famzah@notty  
 5077 famzah   ?         \_ rsync --server --sender -e.Lsf . /boot/initrd.img-3.2.0-36-generic-pae
 ...
 2996 famzah   pts/1     \_ /bin/bash
 4960 famzah   pts/1     |   \_ rsync famzah@127.0.0.1:/boot/initrd.img-3.2.0-36-generic-pae /tmp/test
 4961 famzah   pts/1     |       \_ ssh -l famzah 127.0.0.1 rsync --server --sender -e.Lsf . /boot/initrd.img-3.2.
 5078 famzah   pts/1     |       \_ rsync famzah@127.0.0.1:/boot/initrd.img-3.2.0-36-generic-pae /tmp/test

I have killed the "rsync" sender process at the _remote_ location with SIGTERM (PID=5077 in the example above).

Rsync properly indicates that the problem occurred in [sender=3.1.0dev] and also reports that the exit code should be 20. But in the end, the rsync process terminates with exit code 0 (success), which is incorrect.

While trying this a few times, I noticed that "rsync" sometimes terminates with the proper exit code when the error occurs in the local "receiver" or "generator" process. In that case I get "error in rsync protocol data stream (code 12) at io.c", and the final exit code of "rsync" is 12, as expected.

I also tried all other combinations:
  * sync between local SRC and local DST
  * sync between local SRC and remote DST over SSH
  * killing any of the "rsync" processes (receiver, generator, sender, or SSH tunnel)

The only case where I got an incorrect final exit code is a transfer from a remote SRC over SSH to a local DST, as demonstrated. Furthermore, the "sender" must be killed with SIGTERM; the resulting error seems to be propagated back to the "receiver" process by the rsync protocol. Killing the "sender" with SIGKILL terminates it immediately, which causes an rsync protocol error on the "receiver" side, and the "receiver" then exits properly with a non-zero exit code.
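
The difference between the two signals can be illustrated outside rsync. The following is a minimal sketch, not rsync code: the worker is a hypothetical stand-in for the "sender" that traps SIGTERM and exits with code 20, and run_worker and worker_status are names made up for this example. A trapped SIGTERM lets the process exit with its own code, while SIGKILL cannot be trapped, so the waiting shell only sees a signal death (reported as 128 + 9 by bash and dash).

```shell
#!/bin/sh
# Hypothetical stand-in for the remote sender: it traps SIGTERM and exits
# with code 20, the code shown in the rsync error message above.
run_worker() {
    sh -c 'trap "exit 20" TERM; while :; do sleep 1; done' &
    worker_pid=$!
    sleep 1                      # give the worker time to install its trap
    kill "-$1" "$worker_pid"     # $1 is the signal name, e.g. TERM or KILL
    worker_status=0
    wait "$worker_pid" || worker_status=$?
}

run_worker TERM
term_status=$worker_status      # handler ran, so the worker chose its exit code

run_worker KILL
kill_status=$worker_status      # no handler can run, the shell reports the signal

echo "exit status after SIGTERM: $term_status"
echo "exit status after SIGKILL: $kill_status"
```

In the SIGTERM case the error code exists only because the killed process had a chance to report it, which matches the observation that only this case depends on the code being relayed correctly.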
Comment 1 Wayne Davison 2013-05-26 23:19:46 UTC
The exit code gets passed from the sender to the receiver, but the generator doesn't get notified about it, so it exits with the wrong code. I have checked a fix for the issue into git. Thanks for the report!
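
The class of bug described in this comment can be sketched with plain shell processes. This is an illustration under stated assumptions, not rsync's code or the actual fix: child, buggy_parent, and fixed_parent are hypothetical names, and a child exit status stands in for the notification that rsync passes between its processes. The point is only that the process whose exit code the user sees must adopt the error code known to its peer.

```shell
#!/bin/sh
# Hypothetical child: it has learned of a remote failure (code 20)
# and exits with that code.
child() { exit 20; }

# Buggy parent: waits for the child but never looks at how it ended,
# so the caller sees success.
buggy_parent() (
    child &
    wait $! || true
    exit 0
)

# Fixed parent: adopts the child's non-zero status as its own exit code.
fixed_parent() (
    child &
    child_status=0
    wait $! || child_status=$?
    exit "$child_status"
)

buggy_status=0
buggy_parent || buggy_status=$?
fixed_status=0
fixed_parent || fixed_status=$?

echo "buggy parent exit code: $buggy_status"
echo "fixed parent exit code: $fixed_status"
```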