Knot Resolver issues
https://gitlab.nic.cz/knot/knot-resolver/-/issues
2024-02-28T12:12:23+01:00

support prefilling for arbitrary zone
https://gitlab.nic.cz/knot/knot-resolver/-/issues/417
2024-02-28T12:12:23+01:00
Petr Špaček

Ulrich from IIS requested a feature that would allow them to prefill the resolver's cache with an arbitrary zone, i.e. not only the root zone.
Technical note:
Simple removal of the zone-name checks does not work because `DS` records are missing from the cache, which leads to failing validation. Maybe we can just wrap the import in a function which requests `DS` and calls the import from the query callback?

improve protection from NTP attacks
https://gitlab.nic.cz/knot/knot-resolver/-/issues/392
2018-08-06T11:38:00+02:00
Petr Špaček

Maybe we can tune some parameters introduced in !392 to be more resilient. This needs more thought.
Sources:
* https://nlnetlabs.nl/downloads/presentations/The-impact-of-NTP-security-weaknesses-on-DNSSEC.pdf
* https://tools.ietf.org/html/draft-aanchal-time-implementation-guidance-00

forwarding policy should be able to specify EDNS0 Client Subnet
https://gitlab.nic.cz/knot/knot-resolver/-/issues/314
2019-12-18T19:15:02+01:00
Daniel Kahn Gillmor

The [EDNS0 Client Subnet extension](https://tools.ietf.org/html/rfc7871#section-7.1.2) describes a way that a "stub resolver" can specify its preferred limit of how much the resolver should reveal to the authoritative server about the client's IP address.
A user may have a configured resolver that they trust enough to forward to, but may not want that resolver to leak their IP address to the authoritative servers it looks up. If such a user is running `kresd` as a local caching stub with a forwarding policy, they might want to configure something like:
    policy.FORWARD({'192.0.2.15', ecs_prefix_len=0})

validator might better ignore out-of-bailiwick crap
https://gitlab.nic.cz/knot/knot-resolver/-/issues/295
2018-01-22T15:27:22+01:00
Vladimír Čunát (vladimir.cunat@nic.cz)

Real-life example: `www.vikhockey.se. AAAA` fails in the validator, due to the server returning:
```
kdig @195.74.39.30 www.vikhockey.se. AAAA +dnssec
;; ->>HEADER<<- opcode: QUERY; status: NXDOMAIN; id: 50218
;; Flags: qr aa rd; QUERY: 1; ANSWER: 2; AUTHORITY: 8; ADDITIONAL: 1
;; EDNS PSEUDOSECTION:
;; Version: 0; flags: do; UDP size: 1680 B; ext-rcode: NOERROR
;; QUESTION SECTION:
;; www.vikhockey.se. IN AAAA
;; ANSWER SECTION:
www.vikhockey.se. 600 IN CNAME vvik1-vvik.ramses.nu.
www.vikhockey.se. 600 IN RRSIG CNAME 8 3 600 20180201000000 20180111000000 34296 vikhockey.se. mnn7gL0v3BupFGZi4N/CV6vINkNOFy2y4H0Vx0ukrYDScxCubeLA0YCYCIE3thu13DCkOFuijUbWtaA9KSMivfJUb1q5yX0jdT0b5nvwK1/YSk2YnXMEbrjWqTu4rig+KsrZ0XSb76E0d/9wN5VtFxNkhfZypu5HSj85Isy46Bw=
;; AUTHORITY SECTION:
ramses.nu. 3600 IN SOA ns3.binero.se. registry.binero.se. 1516233600 86400 5400 604800 3600
ramses.nu. 3600 IN RRSIG SOA 8 2 3600 20180201000000 20180111000000 34296 ramses.nu. g4KxoD6HuieeEBgG6Z6oUTlhwdGelcUWRUq3Jd9osVaFzvn8XscQDdmcGh4maK0yofoz8t/ShRVjC4XQGnj5//eejMXY1jgra39VMbJ9P+7JOvGUuETw0WJL8oT7YehfFkCv1CRL5IoM6d9SYdYkmcDt/aoDMeoG+WgEZ6QHW5Y=
v8ssphenr3p30k9a4dpae5pr9ib7m3l1.ramses.nu. 3600 IN NSEC3 1 1 1 AB 18AJT6FFNC06017DT70ELSCVH3763P1C NS SOA MX RRSIG DNSKEY NSEC3PARAM
v8ssphenr3p30k9a4dpae5pr9ib7m3l1.ramses.nu. 3600 IN RRSIG NSEC3 8 3 3600 20180201000000 20180111000000 34296 ramses.nu. wSFv8izGquRzjaZJSnXn+7hgpaqfKGEr3l5OwtEI0KlBRPFmXGv8RD1d9dhJqp1QeaDK67rZqzFHioA/p13RP7kYDUCiOHX8VoA9hbQr3nFHeerkt+zSiYNaAH43sWT7oHpnrN9ODUIIB0s4Tbm1+U2G7tJ90JyjCjmMEXu+UQQ=
3dnbf1prkcm9234cr9atsv8a2gfs2oua.ramses.nu. 3600 IN NSEC3 1 1 1 AB 71O8H4PM96IP6HK4FDMQ2G34KD9KKGV4 A RRSIG
3dnbf1prkcm9234cr9atsv8a2gfs2oua.ramses.nu. 3600 IN RRSIG NSEC3 8 3 3600 20180201000000 20180111000000 34296 ramses.nu. dFKDMKzdwDmNEFfItTlEIIhAqqbk13WEO/etgywJLzEt3PRW1s70jfFCWqTeOjAUdeF6JEfLWklPYkhpBe0UwmYEVqlQcYJ37AKX7gUyN/iBKTtMfQWTXfdHMyjj1fyfEoeFh2SMk1Vl5bys1HKajB0SkOnKmzDKnZjBftDuimE=
j8qedtq6ned9n5sl7e99incs8s1m29sb.ramses.nu. 3600 IN NSEC3 1 1 1 AB MUE5EI8JM7A860A6HCDO7LQ42OSF6V55 A RRSIG
j8qedtq6ned9n5sl7e99incs8s1m29sb.ramses.nu. 3600 IN RRSIG NSEC3 8 3 3600 20180201000000 20180111000000 34296 ramses.nu. HTN4XXRy53RX8p2wksZ5HwW8gYisHHCWwbD/yjiUc4CC+q2tc9jiX9NTriGuKd32BCKqceHlPrAeU62Bn1fujCCKvmctVavr0oUXw4XSl0sJblyH5FitapCBwSW2rmiFY53Jup8oUQLpuNeNP8euADbai//gUiBl9UwHR0qR65c=
;; Received 1224 B
;; Time 2018-01-19 13:42:17 CET
;; From 195.74.39.30@53(UDP) in 130.5 ms
```
The part about CNAME is OK, but the NXDOMAIN on the target is BOGUS. (Seems like an outdated `ramses.nu.` zone remaining on the server.)

track network changes and reconfigure as validating stub / resolver automatically
https://gitlab.nic.cz/knot/knot-resolver/-/issues/244
2021-06-24T13:38:33+02:00
Petr Špaček

Taken from https://github.com/CZ-NIC/knot-resolver/issues/7
There should be a module to track changes in the network and environment to detect when the resolver is in an:
- Environment that blocks DNS queries altogether (and revert to stub mode)
- Environment with DNSSEC-unaware resolver (do validation)
- Open environment (full recursive resolver)
This would make it as painless as possible for the end users with frequent network transitions (hotel wifi, workplace, home, ...)
Fallback to https://github.com/fcambus/rrda if the DNS is filtered/unreachable.

workarounds: log using generic workarounds
https://gitlab.nic.cz/knot/knot-resolver/-/issues/207
2020-02-03T16:20:36+01:00
Vladimír Čunát (vladimir.cunat@nic.cz)

When using generic workarounds, it would be nice to be able to log them, so that the protocol violations can be collected and reported, e.g. at https://github.com/dns-violations/, to the server operator, etc. ([Suggested by Anand.](https://ripe74.ripe.net/archives/video/159/))
It's probably of no use for the specific cases in the workarounds module, as those are known.

support RPZ from zone transfer
https://gitlab.nic.cz/knot/knot-resolver/-/issues/195
2023-07-12T16:52:22+02:00
Petr Špaček

It is painful to retrieve RPZ data as files. IXFR would operationally help a lot.

dnstap: resiliency against socket failures
https://gitlab.nic.cz/knot/knot-resolver/-/issues/891
2024-02-06T12:27:38+01:00
Oto Šťáva

Knot Resolver is not very consistent when it comes to handling dnstap failures. In my testing, when trying to connect to a non-existent or "inactive" (i.e. there is nobody listening on the other side of the socket) dnstap socket, Knot Resolver only logs a connection failure, and then keeps on working as normal. However, when there is something listening on the other side, and it does not actually understand dnstap, Knot Resolver fails to start.
When testing with [go-dnscollector](https://github.com/dmachard/go-dnscollector), I have also found that there is a slight problem when the consumer is restarted: dnstap starts working again eventually, but each worker *only reconnects* to the socket when "nudged" with a DNS query, and does not seem to push the events of said query to the socket. Only the second query (and other subsequent queries until the consumer is stopped) sent to the worker gets pushed to dnstap.

cache: sharing across containers requires special options
https://gitlab.nic.cz/knot/knot-resolver/-/issues/637
2022-11-18T16:56:08+01:00
Petr Špaček

Version: 5.1.3 originally but any version really
Error
=====
```
[cache] LMDB error: Resource temporarily unavailable
[cache] LMDB error: Resource temporarily unavailable
[cache] incompatible cache database detected, purging
[cache] reading version returned: -11
[system] interactive mode
[00000.00][plan] plan '.' type 'NS' uid [65536.00]
[65536.00][iter] '.' type 'NS' new uid was assigned .01, parent uid .00
[cache] LMDB error: Resource temporarily unavailable
[65536.01][cach] => exact hit error: -11 Resource temporarily unavailable
```
Reproducer
==========
Attempt to share cache across two or more Docker containers:
```
docker run -P -w /tmp/kresd -v /tmp/shared:/tmp/kresd -ti cznic/knot-resolver:v5.1.3
```
Minimal reproducer without Docker: Run two processes using command
```
unshare -Up --fork kresd
```
Root cause
==========
This is caused by LMDB dependency on unique PID numbers (for reader slots?). This assumption does not hold for Docker containers (because of its use of PID namespaces). LMDB upstream [does not seem to care](https://lists.openldap.org/hyperkitty/list/openldap-technical@openldap.org/thread/TL4XPCHRRGBV6SWBQIARC6E5XZNJ4SDX/).
Workaround
==========
Disable PID namespace, i.e. run Docker containers using `docker run --pid=host`, which prevents non-unique PIDs among containers.
An alternative is to run additional containers in the same PID namespace as the first container using `docker run --pid=container:name_of_the_first_container`, but the disadvantage is that exiting the first container will terminate all the others as well, i.e. this prevents dynamic instance restarts.

declarative config - Lua API extension
https://gitlab.nic.cz/knot/knot-resolver/-/issues/623
2020-11-25T13:22:36+01:00
Vaclav Sraier

I would like to open a discussion as a follow-up to #536. The problem remains and this proposal attempts to fix it differently.
# Problem (re)statement
Current configuration is practically a Lua program, which is a nightmare for multiple reasons:
* non-programmers have a hard time understanding what is going on
* the Lua language makes it hard to detect mistakes in the config
* run-time reconfiguration requires doing each change N times for N processes
* currently it exposes low-level stuff and is prone to crashes on invalid use (#182)
# Proposal
## kresd
We could extend kresd API with the following function:
```lua
--- Sets the resolver to supplied state regardless of what was configured
--- before. Options that aren't specified in the argument are set to their
--- default value
---
--- @param cfg Table corresponding to the existing YANG model
function configure(cfg)
```
And optionally with this:
```lua
--- Returns a table corresponding to the existing YANG model with the current
--- configuration.
function dump_configuration()
```
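The semantics described for `configure()` (anything not mentioned falls back to its default) can be illustrated with a minimal pure-Lua sketch. The `DEFAULTS` table and its option names are hypothetical placeholders rather than the real YANG model, and nothing here touches the actual resolver; it only shows the "replace the whole state" behavior under those assumptions.

```lua
-- Hypothetical defaults; the real set would come from the YANG model.
local DEFAULTS = {
  cache_size_mb = 100,
  listen = { '127.0.0.1' },
}

-- Deep-copy helper so callers cannot mutate DEFAULTS or the stored state.
local function deep_copy(value)
  if type(value) ~= 'table' then return value end
  local copy = {}
  for k, v in pairs(value) do copy[k] = deep_copy(v) end
  return copy
end

-- Current state, private to this sketch.
local current = deep_copy(DEFAULTS)

--- Replace the whole state: start from defaults, then apply cfg on top.
function configure(cfg)
  assert(type(cfg) == 'table', 'configure() expects a table')
  local new_state = deep_copy(DEFAULTS)
  for key, value in pairs(cfg) do
    if DEFAULTS[key] == nil then
      error('unknown option: ' .. tostring(key))
    end
    new_state[key] = deep_copy(value)
  end
  -- Only swap the state once the whole table has been accepted.
  current = new_state
end

--- Return a copy of the current configuration.
function dump_configuration()
  return deep_copy(current)
end
```

With this shape, calling `configure({ cache_size_mb = 500 })` and then `configure({ listen = { '::1' } })` leaves `cache_size_mb` at its default again, because the second call did not mention it.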
### Motivation
* extends existing API, this change will not break any existing setup
* works with simple data formats so it is quite feasible to implement the whole functionality in pure Lua
* Updates of policies or other large data might be performed by the existing API, side-stepping the new configuration functions, alleviating performance issues with the declarative API.
* relatively simple to implement
* new file configuration format might be easily added later on allowing direct declarative configuration
* implements foundation for dynamically reloadable configuration - adding it on top of the declarative configuration (in the previous bullet point) would be quite straightforward
### Known issues
* At least some validation of the data format must be present in every kresd instance. By exposing these functions publicly, there is no way to go around that. An option might be to make something very similar but private. Then a centralized configuration tool (see below) could do the validation, eliminating the need for validation by every instance.
### To be considered
* Is it really a good idea to use Lua tables as the configuration format? Lua is not backward compatible between releases which might lead to potential problems. Using JSON instead might be more future proof and it might integrate better with existing tools.
* Do we really want to stick to the existing Lua API? Wouldn't it be better to implement something completely new allowing us to ditch the existing API at some point in the future?
## Centralized management of multiple instances
To enable centralized management of multiple instances, a separate tool can be developed utilizing both new functions described above. It could provide any type of external API (NETCONF, REST API, sysrepo, different centralized configuration file...) and bridge it to our two new functions, calling them for all resolver instances as necessary. We could even implement this in a form of a library for commanding all kresd instances on the system at once, leaving the external API implementation up to interested parties in their specific technologies.
Basics of this were already written by @amrazek in the form of the `kres-watcher` tool.

disallow mixing protocols in net.listen()
https://gitlab.nic.cz/knot/knot-resolver/-/issues/615
2022-02-16T07:24:37+01:00
Tomas Krizek

Due to our reuseport facility, it is possible to use `net.listen()` to bind multiple protocols to a single (ip, port) combination. I can't think of any valid use-case and the most likely cause - typo - will cause misbehavior instead of a crash.
```
-- this isn't valid or supported
net.listen('::1', 443, { kind = 'tls' })
net.listen('::1', 443, { kind = 'doh2' })
```
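One way such a conflict could be caught, sketched in plain Lua: remember the `kind` first bound to each (ip, port) pair and raise an error when a later call asks for a different one. `checked_listen` and `bound_kinds` are hypothetical names, the wrapper does not call the real `net.listen()`, and treating a missing `kind` as `'dns'` is an assumption made for this sketch.

```lua
-- Maps "ip#port" to the kind that was first bound there.
local bound_kinds = {}

-- Hypothetical wrapper: records the binding and rejects a conflicting kind.
local function checked_listen(ip, port, opts)
  local kind = (opts and opts.kind) or 'dns'  -- assumed default
  local key = ip .. '#' .. tostring(port)
  local existing = bound_kinds[key]
  if existing ~= nil and existing ~= kind then
    error(string.format('%s port %d is already bound as %s, refusing %s',
      ip, port, existing, kind))
  end
  bound_kinds[key] = kind
  return true
end
```

Repeating a call with the same `kind` is accepted in this sketch; only a different `kind` on an already used (ip, port) pair raises an error.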
I think the resolver should crash in these cases.

document bug reporting procedure
https://gitlab.nic.cz/knot/knot-resolver/-/issues/590
2020-07-10T14:10:23+02:00
Petr Špaček

- test on latest version
- mention relevant system information
- how to capture GDB traceback
- how to limit logging to problematic names
- how to capture network traffic + keys (TLS, DoH)
...

document threat model
https://gitlab.nic.cz/knot/knot-resolver/-/issues/589
2020-07-11T22:10:59+02:00
Petr Špaček

- inputs
- trusted (config, control socket, cache, files on disk)
- untrusted (network traffic)
- decide: prefill? hints? ...
- DoS is always possible (network overload, hijack etc.)
- integrity - DNSSEC
- confidentiality - do not count on it, encrypting only DNS traffic does not hide it

new statistics for encrypted transports
https://gitlab.nic.cz/knot/knot-resolver/-/issues/583
2020-06-19T14:17:50+02:00
Petr Špaček

It would be interesting to see statistics for:
- [ ] number of TLS handshakes
- [ ] TLS versions
- [ ] HTTP versions
- [ ] HTTP request methods
- [ ] HTTP status codes
Question: Are these stats sufficient to gather details about connection reuse?

negative trust anchor does not prevent NXDOMAIN from aggressive cache
https://gitlab.nic.cz/knot/knot-resolver/-/issues/429
2020-04-06T09:52:56+02:00
Petr Špaček

Right now the aggressive cache masks "grafted" domains, e.g. fake TLDs, even if these are listed as negative trust anchors.
This is unexpected behavior and forces users to use `NO_CACHE`, which is not optimal. In the future we should exempt NTAs from the aggressive cache.

policy and statistics: improvements?
https://gitlab.nic.cz/knot/knot-resolver/-/issues/364
2019-12-18T19:15:02+01:00
Vladimír Čunát (vladimir.cunat@nic.cz)

- [ ] UX. Each rule in the list of policies has a `.count`, but it's not very useful as it is. It's not exported in the usual statistics, and introspecting by hand makes it hard to read the list.
```
[rules] => {
[1] => {
[count] => 40698
[id] => 0
[cb] => function: 0xb69374b0
}
}
```
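To make such a dump easier to scan by hand, the counters could be flattened into one line per rule. A minimal sketch in plain Lua, assuming only the `id` and `count` fields visible in the output above; `summarize_rules` is a hypothetical helper, not part of the policy module.

```lua
-- Hypothetical helper: one "id=<id> count=<count>" line per rule,
-- sorted by descending count so the busiest rules come first.
local function summarize_rules(rules)
  -- Copy the list so the caller's rule order is left untouched.
  local sorted = {}
  for _, rule in ipairs(rules) do
    sorted[#sorted + 1] = rule
  end
  table.sort(sorted, function(a, b) return a.count > b.count end)

  local lines = {}
  for _, rule in ipairs(sorted) do
    lines[#lines + 1] = string.format('id=%d count=%d', rule.id, rule.count)
  end
  return table.concat(lines, '\n')
end
```

The `cb` field is deliberately ignored here, since a function address carries little meaning for someone reading the statistics.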
##### Consider collecting more statistics:
- [ ] RPZ rules might additionally collect a counter of matches for each RPZ file line. That seems relatively cheap on performance side, but it's difficult in the way the abstractions are done now, as the `[cb]` (above) knows nothing about the "parent table".
- [ ] A count of "secure" answers would be interesting, i.e. those that would set the AD flag if requested. (ATM the state isn't well visible unless the request had DO or AD.)
- [ ] e.g. inspiration https://pi-hole.net

errors from Lua module interface are not developer friendly
https://gitlab.nic.cz/knot/knot-resolver/-/issues/264
2019-12-18T19:56:41+01:00
Petr Špaček

I'm creating a "Hello world" Lua plugin and the process is not as straightforward as I would wish.
Interestingly, if a Lua module does not return a table (which is easy to forget when you start), it spits out a quite confusing error message:
```
> modules.load('test')
attempt to index a boolean value
```
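The message is consistent with a general Lua behavior: when a module file returns nothing, `require` yields `true` instead of a table, and indexing that value fails. A minimal sketch of a friendlier check in plain Lua follows; `load_module_table` is a hypothetical helper, and the injectable `loader` parameter exists only to keep the sketch self-contained. It is not how kresd actually loads modules.

```lua
-- Hypothetical helper: load a module and insist that it returns a table.
-- `loader` defaults to Lua's own require(); a stub can be passed instead.
local function load_module_table(name, loader)
  loader = loader or require
  local ok, result = pcall(loader, name)
  if not ok then
    return nil, string.format("module '%s' failed to load: %s",
      name, tostring(result))
  end
  if type(result) ~= 'table' then
    return nil, string.format(
      "module '%s' must return a table, got %s (missing `return M` at the end of the file?)",
      name, type(result))
  end
  return result
end
```

Returning `nil` plus a message, rather than raising, leaves it to the caller to decide whether a bad module is fatal.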
I was looking into the C code which loads the Lua modules and there is no super-easy fix because of the Lua-C integration. This led me to the idea that we might rewrite Lua-module loading in Lua, so it is not such long spaghetti. (Or not, if it does not simplify the code. I'm just thinking aloud.)

tools to migrate configuration from other resolver
https://gitlab.nic.cz/knot/knot-resolver/-/issues/47
2018-12-17T13:32:08+01:00
Marek Vavrusa

I wrote Ansible scripts for Knot authoritative which might be a good starting point.
https://github.com/vavrusa/ansible-role-knotauth
... or something else, possibly YANG model.

local-data: allow even with +nord
https://gitlab.nic.cz/knot/knot-resolver/-/issues/906
2024-03-04T10:24:29+01:00
Vladimír Čunát (vladimir.cunat@nic.cz)

While it makes sense to disallow *cached* records in +nord mode by default (for privacy reasons), those arguments do not hold for other kinds of local data, and there might be some use cases, e.g. [resolver.arpa. RESINFO](https://www.ietf.org/archive/id/draft-ietf-add-resolver-info-11.html#section-3)

multiple manager instances not runnable in parallel
https://gitlab.nic.cz/knot/knot-resolver/-/issues/801
2023-09-28T04:48:33+02:00
Vladimír Čunát (vladimir.cunat@nic.cz)

Multiple manager instances are not runnable in parallel, even if no socket or path from configuration clashes, e.g. when testing without containers.
It's prevented by the `sd_notify` plugin hardcoding the name of abstract unix socket to `knot-resolver-control-socket`.