Add VRF support
I am trying to run knot-resolver in a VRF.
I want the /metric endpoint to be accessible only internally, while DoH/DoT resolving happens over a VRF interface that is meant for external requests.
Activity
- Owner
I would probably do these splits by running the "internal" services on addresses that only get routed internally (could be localhost or anything). DoT always uses a separately specified address, and DoH provided by the http module is also separable from the rest. More details are in the docs: https://knot-resolver.readthedocs.io/en/stable/daemon.html#network-configuration
Edited by Vladimír Čunát
I was able to implement incoming VRF support by configuring a systemd socket like this:
    # cat /etc/systemd/system/kresd.socket.d/override.conf
    [Socket]
    BindToDevice=vrf_external
    ListenDatagram=
    ListenStream=
    ListenDatagram=1.2.3.4:53
    ListenStream=1.2.3.4:53
But how do I send outgoing queries via the VRF? When I configure "net.outgoing_v4", the traffic goes out via a "normal" interface, not via the VRF.
Edited by krombel
One hint: PowerDNS is currently implementing VRF support as well. You can probably get some inspiration from their implementation: https://github.com/PowerDNS/pdns/pull/8372
- Owner
For listening I can imagine adding an option to set SO_BINDTODEVICE, but that's apparently easy via systemd anyway, so there seems little motivation left. For the direction towards upstream servers this will probably be difficult to do in our implementation.
I agree that listening support is possible via systemd. I pasted an example of how to do that.
Our issue now is that the requests are going out on the wrong interface. We separated the external net from the internal one, and the resolver should send its requests only on the correct interface.
The DDoS rules set up in the datacenter started blocking things because the requests are in the wrong net, and it seems to be hard to fix on their side. So it would be really helpful to get that implemented.
We would offer to test it if you want us to. We are doing the same for dnsdist (we run it to load-balance between several hosts that run knot-resolver), so we have the architecture set up for that.
- Contributor
Can you please clarify your use-case? We do not have experience with VRF in our team so we are a bit blind.
- Petr Špaček added needinfo label
You might want to have a look at another source for "Why VRF support?" on the raspberrypi issue-tracker: https://github.com/raspberrypi/linux/issues/3253#issuecomment-534235017
Our special use-case is the following: We have two networks: Internal and external. The external network is implemented by using VRFs so we are able to have dedicated routing tables for the networks.
As the instance of knot-resolver should handle external requests, it is required to have it listen on vrf_external directly (implementable through BindToDevice=vrf_external in systemd). Then we realized that, because kresd itself was running in the "normal" (non-VRF) network, all outgoing requests were going through our internal net.
We tried to use net.outgoing to make it use the VRF, but that did not work. It could not bind to that IP, as it is only available via vrf_external.
So in short, we are asking for VRF support on outgoing requests.
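For readers unfamiliar with VRFs, here is a minimal Python sketch of what VRF support on outgoing requests would amount to at the socket level. This is not kresd code. It assumes Linux, where binding a socket to a VRF device with SO_BINDTODEVICE makes the kernel use that VRF's routing table for the socket. The helper names bind_to_device and open_outgoing_udp are hypothetical, the device name vrf_external is taken from the example above, and setting the option typically requires root or CAP_NET_RAW.

```python
import socket

# Linux value of SO_BINDTODEVICE; the socket module only exposes the
# constant on Linux builds, so fall back to the raw number elsewhere.
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def bind_to_device(sock, device):
    """Restrict `sock` to the given interface or VRF device (Linux only)."""
    # The interface name is conventionally passed as a NUL-terminated byte string.
    sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE, device.encode() + b"\0")
    return sock


def open_outgoing_udp(device, source_ip="0.0.0.0"):
    """Create a UDP socket that is scoped to `device` before it is bound."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Assumption: the device binding comes first, so that an address that
        # only exists inside the VRF can be bound afterwards.
        bind_to_device(sock, device)
        sock.bind((source_ip, 0))
    except OSError:
        sock.close()
        raise
    return sock
```

Under these assumptions, open_outgoing_udp("vrf_external", "1.2.3.4") would return a socket whose queries leave through the external VRF. The same call without the bind_to_device step would be expected to fail with the "cannot bind" error described above.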
- Owner
For your particular case I'd expect you can switch the default (via ip vrf?) to external for kresd and use systemd's socket to switch webmgmt to internal, and possibly add more internal listening addresses... am I missing something? (I'm not trying to claim it would be a nice solution.)
Yes, that is true, apart from one issue we ran into. We configured the prefill module like this:
    modules.load('prefill')
    prefill.config({
        ['.'] = {
            url = 'https://www.internic.net/domain/root.zone',
            ca_file = '/etc/ssl/certs/ca-certificates.crt',
            interval = 86400  -- seconds
        }
    })
and have
    # cat /etc/resolv.conf
    search in.ffmuc.net
    nameserver 10.80.255.12
You might guess it: That IP is only reachable via internal network.
=>
[prefill] cannot download new zone (/usr/lib/knot-resolver/kres_modules/prefill.lua:88: [prefill] fetch of `https://www.internic.net/domain/root.zone` failed: temporary failure in name resolution), will retry root zone download in 09 minutes 53 seconds
This in itself is probably not an issue, but the default systemd unit file for kresd@.service has WatchdogSec=10.
=> systemd restarts kresd every 10 seconds, because the blocking download attempt definitely takes longer than 10 s to time out
=> We get a boot loop and no running knot-resolver
- Owner
Yes, the blocking nature of prefill is a known bug: #512 (closed).
- Owner
I think this feature would have to be supported by our net.listen() function. I imagine the implementation could be similar to freebind. I'd welcome and review merge requests that add this feature. I don't have any plans to implement this myself and the priorities are probably similar for the rest of the team.
- Tomas Krizek added feature label and removed needinfo label
- Owner
Yes. In terms of workarounds, the http module was tested to work on AF_UNIX sockets as well, so you might be able to utilize that. I suspect it's even possible to use ip vrf exec with socat to "bridge" that local socket to another VRF.
https://gitlab.labs.nic.cz/knot/knot-resolver/merge_requests/811