enable supervision by the systemd watchdog, if building against libsystemd
The tight loop that kresd hit in #271 (closed) didn't need to cause a permanent system hang. If kresd had committed to checking in with a supervising watchdog regularly, the tight loop would have caused it to miss a watchdog check-in, and the supervisor could have killed and restarted it automatically.
systemd makes this process relatively painless. At startup, kresd would use sd_watchdog_enabled(3) to check whether it is expected to use the watchdog and, if so, how frequently it needs to check in. If the watchdog is enabled, kresd would add a timer to its event loop that invokes sd_notify(0, "WATCHDOG=1"). To enable the use of the watchdog, we'd set WatchdogSec= in kresd.service.
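A rough sketch of what the daemon side could look like, not a proposed patch. The `HAVE_SYSTEMD` build flag and the `watchdog_*` function names are hypothetical; only `sd_watchdog_enabled()` and `sd_notify()` are real libsystemd calls. The interval is half the configured timeout, which is what sd_watchdog_enabled(3) recommends.

```c
#include <stdint.h>

#ifdef HAVE_SYSTEMD /* hypothetical flag, set when building against libsystemd */
#include <systemd/sd-daemon.h>
#endif

/* Convert the watchdog timeout (microseconds) into a check-in interval
 * (milliseconds): half the timeout, as sd_watchdog_enabled(3) suggests. */
uint64_t watchdog_interval_ms(uint64_t timeout_usec)
{
	return timeout_usec / 2 / 1000;
}

/* Return the check-in interval in milliseconds, or 0 if the watchdog is
 * disabled, unavailable, or could not be queried. */
uint64_t watchdog_setup(void)
{
#ifdef HAVE_SYSTEMD
	uint64_t timeout_usec = 0;
	/* > 0: watchdog enabled, 0: not enabled, < 0: error */
	if (sd_watchdog_enabled(0, &timeout_usec) <= 0)
		return 0;
	return watchdog_interval_ms(timeout_usec);
#else
	return 0;
#endif
}

/* Meant to be called from the event-loop timer on every tick. */
void watchdog_ping(void)
{
#ifdef HAVE_SYSTEMD
	sd_notify(0, "WATCHDOG=1");
#endif
}
```

At startup, kresd would call `watchdog_setup()` once and, if the result is non-zero, arm a repeating timer in its event loop with that period and have the timer callback call `watchdog_ping()`. Because the ping runs from the event loop itself, a tight loop that starves the loop also stops the pings, which is the condition the supervisor is supposed to catch.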
See sd_watchdog_enabled(3), sd_notify(3), and systemd.service(5) for more details.
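On the unit side, the change to kresd.service might look roughly like the fragment below. The 10 second timeout is an arbitrary placeholder, not a tested value, and `Restart=on-failure` is an assumption about how we'd want restarts handled (a watchdog timeout counts as a failure for that setting).

```ini
[Service]
# Placeholder timeout; kresd would need to check in at least this often.
WatchdogSec=10
# Restart the service when it fails, including on a watchdog timeout.
Restart=on-failure
```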
Ideally, of course, there would be no hangs. But if the goal is a robust service with minimum downtime, this kind of supervision can be a way to work around any future bugs that pop up.