Bug 13463 - Please consider using the IP_FREEBIND socket option
Status: NEW
Alias: None
Product: rsync
Classification: Unclassified
Component: core
Version: 3.1.3
Hardware: All All
Importance: P5 enhancement
Target Milestone: ---
Assignee: Wayne Davison
QA Contact: Rsync QA Contact
Reported: 2018-06-04 21:45 UTC by Andreas Hasenack
Modified: 2022-04-11 16:35 UTC
Description Andreas Hasenack 2018-06-04 21:45:53 UTC
Ubuntu got a bug report (https://bugs.launchpad.net/ubuntu/+source/rsync/+bug/1774788) showing that rsync can fail to start during boot if a listen address is specified in the rsyncd.conf config file.

Since the systemd service file does not have a dependency on network-online.target, rsync tries to bind to the specific IP address from the "address" option and fails if it's not available.

A workaround is to add "After=network-online.target" to the systemd service file, but that is unnecessary for the common case where there is no specific address configuration. It's also a bit frowned upon in upstream systemd (see https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/).

For such cases, the suggestion is to use the IP_FREEBIND socket option on Linux. From the Linux ip(7) man page:
IP_FREEBIND (since Linux 2.4)
If enabled, this boolean option allows binding to an IP
address that is nonlocal or does not (yet) exist. This permits
listening on a socket, without requiring the underlying
network interface or the specified dynamic IP address to be up
at the time that the application is trying to bind to it.
This option is the per-socket equivalent of the
ip_nonlocal_bind /proc interface described below.
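
For illustration, here is a minimal Python sketch of what setting this option looks like at the socket level. It is not rsync code (rsync is written in C and would make the equivalent setsockopt() call); the function name bind_listener, its freebind switch, and the 192.0.2.10 documentation address are made up for the example. The numeric fallback 15 is the Linux value of IP_FREEBIND, used in case the Python build does not export the constant.

```python
import socket

# Linux-only option. 15 is the value from <linux/in.h>, used as a fallback
# when this Python build does not expose socket.IP_FREEBIND.
IP_FREEBIND = getattr(socket, "IP_FREEBIND", 15)


def bind_listener(address, port, freebind=False):
    """Return a listening TCP socket bound to (address, port).

    bind_listener and its freebind flag are hypothetical names for this
    sketch. With freebind=True the bind succeeds even if the address is
    not (yet) configured on any interface.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if freebind:
            sock.setsockopt(socket.IPPROTO_IP, IP_FREEBIND, 1)
        sock.bind((address, port))
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    # 192.0.2.10 is from the TEST-NET-1 documentation range and is normally
    # not assigned locally; port 0 lets the kernel pick a free port.
    listener = bind_listener("192.0.2.10", 0, freebind=True)
    print("bound to", listener.getsockname())
    listener.close()
```

Keeping the option behind a switch that defaults to off means a mistyped address would still make the bind fail unless the administrator opts in.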
Comment 1 Carson Gaspar 2018-06-08 19:53:34 UTC
If this is done, please make it optional. I want my daemon to break when given an invalid config, e.g. a typo'd IP address. The fact that systemd folks are crazy and people don't want to have proper startup dependencies isn't a good reason to make the rest of us suffer unconditionally.
Comment 2 Kevin Korb 2018-06-12 20:52:24 UTC
I agree with Carson.  If rsync is told to do the impossible it should fail with an appropriate error and exit code.

Unfortunately I would also have to argue that the current behaviour is wrong because it does not exit with an error code and because I think the error message should also go to stderr instead of just syslog.  A non-0 exit code would allow the launching script to simply keep trying until it works.

I think even the systemd people would prefer an explicit failure over an exit 0 with no output.  A program that fails to do what it was told to do shouldn't be effectively the same as /bin/true.
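
As a rough sketch of the "keep trying until it works" idea, assuming the daemon command returns promptly with an exit status that reflects whether startup succeeded. The helper name start_with_retry and its parameters are invented for this example and are not part of rsync or any init system.

```python
import subprocess
import time


def start_with_retry(cmd, attempts=5, delay=2.0):
    """Run cmd until it exits 0.

    Returns the 1-based attempt number that succeeded, or 0 if every
    attempt returned a non-zero exit status.
    """
    for attempt in range(1, attempts + 1):
        if subprocess.run(cmd).returncode == 0:
            return attempt
        if attempt < attempts:
            time.sleep(delay)  # give the network time to come up
    return 0
```

A launcher could call start_with_retry(["rsync", "--daemon"]) and report a failure once the attempts run out. This only helps if a failed bind really produces a non-zero exit status.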
Comment 3 Andreas Hasenack 2022-03-30 20:06:48 UTC
Thanks for all the opinions. I have one remaining issue, and that is with "systemctl start rsync.service" not detecting the failure right away.

The systemd unit file calls rsync like this:

[Service]
ExecStart=/usr/bin/rsync --daemon --no-detach

This is correct, especially the --no-detach option. systemd should be able to tell immediately whether the service started, but according to the systemd.service man page, it signals success in any case for Type=simple services. It's not much better for Type=exec.

If I run rsync with an invalid config, it exits non-zero right away:

root@j1-rsyncd:~# rsync --daemon --no-detach ; echo $?
10


But when started via systemctl, the start command itself exits 0:
root@j1-rsyncd:~# systemctl start rsync; echo $?
0

root@j1-rsyncd:~# systemctl status rsync
× rsync.service - fast remote file copy program daemon
     Loaded: loaded (/lib/systemd/system/rsync.service; disabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2022-03-30 19:10:03 UTC; 3s ago
       Docs: man:rsync(1)
             man:rsyncd.conf(5)
    Process: 4305 ExecStart=/usr/bin/rsync --daemon --no-detach (code=exited, status=10)
   Main PID: 4305 (code=exited, status=10)
        CPU: 3ms

Mar 30 19:10:03 j1-rsyncd rsyncd[4305]: bind() failed: Cannot assign requested address (address-family 2)
Mar 30 19:10:03 j1-rsyncd systemd[1]: Started fast remote file copy program daemon.
Mar 30 19:10:03 j1-rsyncd rsyncd[4305]: unable to bind any inbound sockets on port 873
Mar 30 19:10:03 j1-rsyncd systemd[1]: rsync.service: Main process exited, code=exited, status=10/n/a
Mar 30 19:10:03 j1-rsyncd rsyncd[4305]: rsync error: error in socket IO (code 10) at socket.c(545) [Receiver=3.2.3]
Mar 30 19:10:03 j1-rsyncd systemd[1]: rsync.service: Failed with result 'exit-code'.


This seems to be the norm for these systemd service types (Type=simple, Type=exec). The most reliable way to have "systemctl start" detect immediately whether the service failed looks to be implementing systemd's notify[1] mechanism in rsync.
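
To show roughly what the notify mechanism involves, here is a minimal Python sketch of the client side of the protocol: the service sends a datagram such as "READY=1" to the Unix socket named in the NOTIFY_SOCKET environment variable, which systemd sets for Type=notify units. This is only an illustration. rsync itself is written in C and would either call sd_notify() from libsystemd or do the equivalent by hand, and the Python function below is not an existing rsync or systemd API.

```python
import os
import socket


def sd_notify(state):
    """Send a notification string (e.g. "READY=1") to the service manager.

    Returns False when NOTIFY_SOCKET is unset, i.e. the process was not
    started by a manager that expects notifications.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket, which
        # is addressed with a leading NUL byte instead.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True
```

A daemon would call sd_notify("READY=1") only after bind() and listen() have succeeded, so that with Type=notify a failed bind makes "systemctl start" report the failure instead of returning 0.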

Type=forking might be an alternative, but the start timeout would have to be tuned:

root@j1-rsyncd:~# time systemctl start rsync
Job for rsync.service failed because a timeout was exceeded.
See "systemctl status rsync.service" and "journalctl -xeu rsync.service" for details.

real    1m30.246s


With TimeoutStartSec=5 in the unit file:

root@j1-rsyncd:~# time systemctl start rsync
Job for rsync.service failed because a timeout was exceeded.
See "systemctl status rsync.service" and "journalctl -xeu rsync.service" for details.

real    0m5.287s




[1] https://www.freedesktop.org/software/systemd/man/sd_notify.html
Comment 4 Simon Deziel 2022-03-30 20:26:46 UTC
Since rsyncd exits with error code 10 ("Error in socket I/O") there are two possible ways to improve the systemd unit:

[Service]
...
RestartForceExitStatus=10

Or:

[Service]
...
Restart=on-failure


Both should help with IPv6 addresses that show up late because Duplicate Address Detection (DAD) takes time.
Comment 5 Simon Deziel 2022-04-11 16:35:37 UTC
The `Restart=on-failure` option was added in https://github.com/WayneD/rsync/commit/d41bb98c09bf0b999c4eee4e2125c7e5d0747ec4

This should paper over the problem of IPv6 addresses showing up late because DAD takes time.