Simplify use of Linux capabilities?
Hi,
I've reviewed Knot DNS's use of Linux capabilities, and I found it somewhat confusing. I'll make a suggestion to simplify this code in a follow-up comment.
In the configuration reference, the only mention of capabilities is in the description of the server: user config variable:
A system user with an optional system group (user:group) under which the server is run after starting and binding to interfaces. Linux capabilities are employed if supported.
At first glance this implies that capabilities are only used if server: user is set, but capabilities are actually used for much more than changing user/group, and knotd makes calls to change capabilities regardless of whether server: user is set in the config.
In the README, the libcap-ng library is mentioned, but only as an optional dependency. Capabilities are only used if this dependency is present at build time; otherwise Knot behaves like a traditional uid/gid privilege-dropping daemon. When libcap-ng support is compiled in, here is how knotd behaves:
- First, knotd needs to be started as root with traditional privileges, or as a non-root user that has had its capabilities elevated (e.g. with systemd's User=, Group=, CapabilityBoundingSet= and AmbientCapabilities= options).
- When knotd is started, the main thread calls setup_capabilities() early in daemon startup, in particular before any logging is set up, configuration is loaded, sockets are created, etc. This function has a comment saying that it should "Drop all capabilities", but in fact it doesn't. Instead, if the process has the CAP_SETPCAP capability, it drops most of its capabilities, but retains the following:
CAP_SETPCAP
CAP_DAC_OVERRIDE
CAP_CHOWN
CAP_NET_BIND_SERVICE
CAP_SETUID
CAP_SETGID
CAP_SYS_NICE
This is still quite a lot of privileges that traditionally only the root user should possess. My assumption is that this code was written on the premise that knotd is started as the unconstrained root user (so knotd is giving up a bunch of capabilities that it doesn't need), rather than as a non-root user that has been given elevated capabilities. A sysadmin or packager setting up a constrained environment for knotd would realize that a non-root process with these capabilities is still quite privileged and wonder what is going on.
- Once knotd has given up some of its capabilities but retained those listed above, it proceeds with the rest of the daemon startup. In particular, sockets are bound, uid/gid privileges are dropped, and threads are started.
The uid/gid privilege dropping in proc_update_privileges() appears to be performed only if server: user is set in the config to a different user/group than the one that starts the knotd process. (That's another thing that makes me think this code assumes a root → non-root transition; I can't think of a daemon that does setuid/setgid from a non-root user with the CAP_SETUID + CAP_SETGID capabilities.)
- Assuming that knotd changed uid/gid to a non-root user, it should have now lost its remaining capabilities automatically. See "Effect of user ID changes on capabilities" in the capabilities(7) manpage:
If one or more of the real, effective or saved set user IDs was previously 0, and as a result of the UID changes all of these IDs have a nonzero value, then all capabilities are cleared from the permitted and effective capability sets.
If the effective user ID is changed from 0 to nonzero, then all capabilities are cleared from the effective set.
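The startup sequence described above can be checked empirically rather than by reading the code. The following is a minimal sketch, not taken from Knot DNS: it builds the bitmask for the seven retained capabilities and parses the CapEff line that the kernel exposes in /proc/<pid>/status (or /proc/<pid>/task/<tid>/status for an individual thread). The helper names retained_mask(), parse_cap_eff() and read_self_cap_eff() are hypothetical. Assuming the capability numbering in <linux/capability.h>, a main thread that still holds exactly the retained set would report 00000000008005c3, and a process that has dropped from root to non-root would be expected to report all zeroes.

```c
#include <inttypes.h>
#include <linux/capability.h>
#include <stdint.h>
#include <stdio.h>

/* Capabilities that setup_capabilities() retains, per the list above. */
static const int retained_caps[] = {
	CAP_SETPCAP, CAP_DAC_OVERRIDE, CAP_CHOWN, CAP_NET_BIND_SERVICE,
	CAP_SETUID, CAP_SETGID, CAP_SYS_NICE,
};

/* Build the bitmask the kernel would report for the retained set. */
static uint64_t retained_mask(void)
{
	uint64_t mask = 0;
	size_t count = sizeof(retained_caps) / sizeof(retained_caps[0]);
	for (size_t i = 0; i < count; i++) {
		mask |= UINT64_C(1) << retained_caps[i];
	}
	return mask;
}

/* Parse a "CapEff:\t<hex>" line. Returns 0 on success, -1 otherwise. */
static int parse_cap_eff(const char *line, uint64_t *mask)
{
	return sscanf(line, "CapEff: %" SCNx64, mask) == 1 ? 0 : -1;
}

/* Read the effective capability mask of the calling process (hypothetical helper). */
static int read_self_cap_eff(uint64_t *mask)
{
	FILE *f = fopen("/proc/self/status", "r");
	if (f == NULL) {
		return -1;
	}
	char line[256];
	int ret = -1;
	while (fgets(line, sizeof(line), f) != NULL) {
		if (parse_cap_eff(line, mask) == 0) {
			ret = 0;
			break;
		}
	}
	fclose(f);
	return ret;
}
```

Comparing the parsed mask of each knotd thread against retained_mask(), or against zero, would show directly which of the code paths discussed here actually ran.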
Here are the confusing parts, though.
- In the thread entry point function thread_ep(), which is called when a new thread is created, the following code is executed:
/* Drop capabilities except FS access. */
#ifdef HAVE_CAP_NG_H
if (capng_have_capability(CAPNG_EFFECTIVE, CAP_SETPCAP)) {
	capng_type_t tp = CAPNG_EFFECTIVE|CAPNG_PERMITTED;
	capng_clear(CAPNG_SELECT_BOTH);
	capng_update(CAPNG_ADD, tp, CAP_DAC_OVERRIDE);
	capng_apply(CAPNG_SELECT_BOTH);
}
#endif /* HAVE_CAP_NG_H */
This seems weird, because we should have lost the CAP_SETPCAP capability if we transitioned from root to a non-root user. That is, I think this code only runs if knotd was started as a non-root user with at least the CAP_SETPCAP capability. But that seems unlikely (see parenthetical remarks above). It's also not clear to me why the non-main threads started by knotd need to retain the CAP_DAC_OVERRIDE capability, which is traditionally a root privilege. (I would think that knotd should set up any permissions it needs to operate correctly before dropping privileges.)
- In the function udp_master(), which is called for UDP worker threads, the following code is executed:
/* Drop all capabilities on all workers. */
#ifdef HAVE_CAP_NG_H
if (capng_have_capability(CAPNG_EFFECTIVE, CAP_SETPCAP)) {
	capng_clear(CAPNG_SELECT_BOTH);
	capng_apply(CAPNG_SELECT_BOTH);
}
#endif /* HAVE_CAP_NG_H */
This would drop all capabilities according to the example given on https://people.redhat.com/sgrubb/libcap-ng/, but it only occurs if we have the CAP_SETPCAP capability, which we would have already lost (along with all other capabilities) if knotd dropped from root to non-root. So this code seems redundant, or at least it will only execute if knotd was started as a non-root user with the CAP_SETPCAP capability.
- The function tcp_master() doesn't have any calls to libcap-ng, but it does have this suggestive, unused #include:
#ifdef HAVE_CAP_NG_H
#include <cap-ng.h>
#endif /* HAVE_CAP_NG_H */
- It's not clear to me why setup_capabilities() wants to retain the CAP_SYS_NICE capability. This capability allows changing niceness, scheduling/affinity properties, etc. for arbitrary processes. I see some calls to pthread_setaffinity_np() and a commented-out call to pthread_attr_setschedpolicy() in the code base, but it seems surprising that CAP_SYS_NICE would actually be required.
pthread_setaffinity_np() is implemented with sched_setaffinity(2), and according to its manpage CAP_SYS_NICE is only required if the caller and the target thread are running as different users:
The caller needs an effective user ID equal to the real user ID or effective user ID of the thread identified by pid, or it must possess the CAP_SYS_NICE capability in the user namespace of the thread pid.
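The manpage statement about affinity can be tried out with a small test. The sketch below is not Knot DNS code, and the function name pin_and_restore() is hypothetical. It pins the calling thread to the first CPU in its current affinity mask and then restores the original mask, using sched_getaffinity(2) and sched_setaffinity(2) directly. If the manpage holds, this should succeed for an ordinary unprivileged process without CAP_SYS_NICE, since the caller and the target are the same thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for sched_getaffinity(), sched_setaffinity(), CPU_* macros */
#endif
#include <errno.h>
#include <sched.h>

/*
 * Pin the calling thread to the first CPU it is currently allowed to run on,
 * then restore the original affinity mask.
 * Returns 0 on success or an errno value on failure.
 */
static int pin_and_restore(void)
{
	cpu_set_t orig, one;

	/* pid 0 means "the calling thread". */
	if (sched_getaffinity(0, sizeof(orig), &orig) != 0) {
		return errno;
	}

	/* Build a mask containing only the first allowed CPU. */
	CPU_ZERO(&one);
	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (CPU_ISSET(cpu, &orig)) {
			CPU_SET(cpu, &one);
			break;
		}
	}

	if (sched_setaffinity(0, sizeof(one), &one) != 0) {
		return errno;
	}
	/* Put the original mask back so the caller is left unchanged. */
	if (sched_setaffinity(0, sizeof(orig), &orig) != 0) {
		return errno;
	}
	return 0;
}
```

If a call like this returns 0 when run as the unprivileged knot user, that would support the idea that CAP_SYS_NICE is not needed for knotd's own affinity calls.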