Ubuntu got a bug report (https://bugs.launchpad.net/ubuntu/+source/rsync/+bug/1774788) showing that rsync can fail to start during boot if a listen address is specified in the rsyncd.conf config file. Since the systemd service file does not have a dependency on network-online.target, rsync tries to bind to the specific IP address from the "address" option and fails if it's not available.

A workaround is to add "After=network-online.target" to the systemd service file, but that is unnecessary for the common case where there is no specific address configuration. It's also a bit frowned upon in upstream systemd (see https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/). For such cases, it is suggested to use the IP_FREEBIND socket option in Linux.

From the Linux ip(7) manpage:

    IP_FREEBIND (since Linux 2.4)
        If enabled, this boolean option allows binding to an IP
        address that is nonlocal or does not (yet) exist. This
        permits listening on a socket, without requiring the
        underlying network interface or the specified dynamic IP
        address to be up at the time that the application is trying
        to bind to it. This option is the per-socket equivalent of
        the ip_nonlocal_bind /proc interface described below.
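To make the proposal concrete, here is a minimal sketch of what setting IP_FREEBIND before bind() looks like. It is written in Python only for brevity and is not rsync's code; the function name `bind_listener` and the `freebind` flag are hypothetical, and the fallback value 15 is the Linux constant from `<linux/in.h>`, used because not every Python version exports `socket.IP_FREEBIND`. Linux only.

```python
import socket

# Linux value from <linux/in.h>; assumed fallback when Python does not export it.
IP_FREEBIND = getattr(socket, "IP_FREEBIND", 15)


def bind_listener(address: str, port: int, freebind: bool = False) -> socket.socket:
    """Return a listening TCP socket bound to (address, port).

    With freebind=True the address does not have to be configured on any
    interface yet; without it, bind() raises OSError for such an address.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        if freebind:
            # Must be set before bind() to have any effect.
            sock.setsockopt(socket.IPPROTO_IP, IP_FREEBIND, 1)
        sock.bind((address, port))
        sock.listen(5)
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address (TEST-NET-1), assumed not to be local.
    listener = bind_listener("192.0.2.10", 8873, freebind=True)
    print("bound to", listener.getsockname())
    listener.close()
```

Keeping the option behind a flag that defaults to off means a mistyped address still fails at bind() unless the admin explicitly opts in.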
If this is done, please make it optional. I want my daemon to break when given an invalid config, e.g. a typo'd IP address. The fact that systemd folks are crazy and people don't want to have proper startup dependencies isn't a good reason to make the rest of us suffer unconditionally.
I agree with Carson. If rsync is told to do the impossible it should fail with an appropriate error and exit code. Unfortunately I would also have to argue that the current behaviour is wrong because it does not exit with an error code and because I think the error message should also go to stderr instead of just syslog. A non-0 exit code would allow the launching script to simply keep trying until it works. I think even the systemd people would prefer an explicit failure over an exit 0 with no output. A program that fails to do what it was told to do shouldn't be effectively the same as /bin/true.
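As a rough illustration of the "keep trying until it works" idea, a launcher could look like the sketch below. It assumes the behaviour argued for above (the command returns promptly, with 0 on success and non-zero on failure), which is not necessarily what rsync does today. The name `start_with_retry` and its parameters are made up for this example.

```python
import subprocess
import sys
import time


def start_with_retry(cmd, attempts=5, delay=1.0):
    """Run cmd until it exits 0 or attempts run out; return the last exit code."""
    code = 1
    for attempt in range(1, attempts + 1):
        code = subprocess.run(cmd).returncode
        if code == 0:
            break
        print(f"attempt {attempt}: exit code {code}", file=sys.stderr)
        if attempt < attempts:
            time.sleep(delay)
    return code


if __name__ == "__main__":
    # Assumes a detaching "rsync --daemon" that reports bind failures via its exit code.
    sys.exit(start_with_retry(["rsync", "--daemon"], attempts=10, delay=2.0))
```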
Thanks for all the opinions.

I have one remaining issue, and that is with "systemctl start rsync.service" not detecting the failure right away. The systemd unit file calls rsync like this:

    [Service]
    ExecStart=/usr/bin/rsync --daemon --no-detach

This is correct, especially the --no-detach option. systemd should be able to tell immediately if the service started or not, but according to the systemd.service manpage, it will signal success in any case for type=simple services. It's not much better for type=exec.

If I run rsync with an invalid config, it exits non-zero right away:

    root@j1-rsyncd:~# rsync --daemon --no-detach ; echo $?
    10

But via systemctl, it exits 0:

    root@j1-rsyncd:~# systemctl start rsync; echo $?
    0
    root@j1-rsyncd:~# systemctl status rsync
    × rsync.service - fast remote file copy program daemon
         Loaded: loaded (/lib/systemd/system/rsync.service; disabled; vendor preset: enabled)
         Active: failed (Result: exit-code) since Wed 2022-03-30 19:10:03 UTC; 3s ago
           Docs: man:rsync(1)
                 man:rsyncd.conf(5)
        Process: 4305 ExecStart=/usr/bin/rsync --daemon --no-detach (code=exited, status=10)
       Main PID: 4305 (code=exited, status=10)
            CPU: 3ms

    Mar 30 19:10:03 j1-rsyncd rsyncd[4305]: bind() failed: Cannot assign requested address (address-family 2)
    Mar 30 19:10:03 j1-rsyncd systemd[1]: Started fast remote file copy program daemon.
    Mar 30 19:10:03 j1-rsyncd rsyncd[4305]: unable to bind any inbound sockets on port 873
    Mar 30 19:10:03 j1-rsyncd systemd[1]: rsync.service: Main process exited, code=exited, status=10/n/a
    Mar 30 19:10:03 j1-rsyncd rsyncd[4305]: rsync error: error in socket IO (code 10) at socket.c(545) [Receiver=3.2.3]
    Mar 30 19:10:03 j1-rsyncd systemd[1]: rsync.service: Failed with result 'exit-code'.

This seems to be the norm for this type of systemd service (type=simple, type=exec), and it looks like the most reliable way to have systemctl start detect immediately whether the service failed would be to implement systemd's notify[1] mechanism in rsync.
Type=forking might be an alternative, but this timeout would have to be tuned:

    root@j1-rsyncd:~# time systemctl start rsync
    Job for rsync.service failed because a timeout was exceeded.
    See "systemctl status rsync.service" and "journalctl -xeu rsync.service" for details.

    real    1m30.246s

With TimeoutStartSec=5 in the unit file:

    root@j1-rsyncd:~# time systemctl start rsync
    Job for rsync.service failed because a timeout was exceeded.
    See "systemctl status rsync.service" and "journalctl -xeu rsync.service" for details.

    real    0m5.287s

[1] https://www.freedesktop.org/software/systemd/man/sd_notify.html
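For reference, the notify mechanism itself is small: with Type=notify in the unit, systemd passes a Unix datagram socket path in $NOTIFY_SOCKET, and the service sends "READY=1" to it once it is actually listening. A minimal sketch of that protocol follows, in Python for brevity. This is not existing rsync code, and the helper name `sd_notify` here is just borrowed from the C API for readability.

```python
import os
import socket


def sd_notify(state: str = "READY=1") -> bool:
    """Send state to the socket named in $NOTIFY_SOCKET.

    Returns False when the variable is unset, i.e. when not started by a
    service manager that expects notifications.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True


if __name__ == "__main__":
    # A daemon would call this only after bind() and listen() have succeeded,
    # so a bind failure makes "systemctl start" fail instead of reporting success.
    sd_notify("READY=1")
```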
Since rsyncd exits with error code 10 ("Error in socket I/O"), there are two possible ways to improve the systemd unit:

    [Service]
    ...
    RestartForceExitStatus=10

Or:

    [Service]
    ...
    Restart=on-failure

Both should help with IPv6 addresses showing up late because DAD (Duplicate Address Detection) takes time.
The `Restart=on-failure` option was added in https://github.com/WayneD/rsync/commit/d41bb98c09bf0b999c4eee4e2125c7e5d0747ec4, which should paper over the problem of IPv6 addresses showing up late because DAD takes time.