Knot Resolver issues
https://gitlab.nic.cz/knot/knot-resolver/-/issues

Issue #761: logging: consider adding startup and shutdown messages
Matt Taggart, updated 2022-10-10, https://gitlab.nic.cz/knot/knot-resolver/-/issues/761

I thought I was having a problem with my kresd.log as it wasn't getting updated. Then I realized that mostly only errors get logged there, and nothing is printed there on service start or stop. If I did a known bad query, I could cause an update there.
Please consider adding log entries on start/stop. I note that kres-cache-gc already does this on startup:
```
kres-cache-gc[18916]: Knot Resolver Cache Garbage Collector, version 5.5.2
```
so maybe something similar for kresd, and shutdown messages for both.
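Until such messages exist, one possible stopgap (not an official feature, just a sketch assuming kresd >= 5.4 where `log_info()` is available) is to emit a line from the configuration file itself when it is loaded; this naturally cannot cover shutdown:

```lua
-- Stopgap sketch: announce startup from kresd.conf itself.
-- Assumption: LOG_GRP_SYSTEM is available as an ffi constant, following the
-- same pattern as the LOG_GRP_NETWORK constant used elsewhere in this tracker.
local ffi = require('ffi')
log_info(ffi.C.LOG_GRP_SYSTEM, 'kresd started, configuration loaded')
```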
Thanks

Issue #754: manager: datamodel: location for default values and constants
Aleš Mrázek, updated 2022-07-04, https://gitlab.nic.cz/knot/knot-resolver/-/issues/754

We should agree on the location and definition of default values and constants. Some are currently defined in the configuration schema and some outside of it.
This issue follows the [comment](https://gitlab.nic.cz/knot/knot-resolver/-/merge_requests/1280#note_256358) in !1280.

Issue #752: Protocol layers
Oto Šťáva, updated 2022-07-18, https://gitlab.nic.cz/knot/knot-resolver/-/issues/752

See snippet $1448.

Issue #744: tests/packaging: failing tests
Oto Šťáva, updated 2022-06-01, https://gitlab.nic.cz/knot/knot-resolver/-/issues/744

I'm opening this issue so that we can track these test failures somewhere, but I'm not sure what we can do about them.
* `centos_7`
  * outdated `luarocks` (is `2.x`, required `3.x`): cannot install `process`. I've tried to resolve this by explicitly installing an older `process` that does not require the new `luarocks`, but it attempts to install the new version anyway
* `centos_8`
  * appstream fails to prepare the internal mirrorlist
  * no such command `config-manager`
* `fedora_31`
  * outdated `knot` (is `3.0.1`, required `3.0.2`)
* `leap_15.2`
  * package conflicts
Related MR: !1304 (adds better logging for failing commands)

Issue #737: knot-resolver crashes regularly on macOS 12.3.1 (intel and arm version), since updating to 5.5.0
owah, updated 2022-06-20, https://gitlab.nic.cz/knot/knot-resolver/-/issues/737

Hello team,
with the latest update to knot-resolver on macOS, I've been experiencing many crashes and I don't know how to debug them, as the log stays empty.
I would be working or browsing and suddenly pages do not load anymore. If I check on the service, as you can see below, it is in an error state. Running `brew services restart knot-resolver` fixes the issue until the next crash.
```
$ sudo brew services 14:10:29
Password:
Name Status User File
dbus none
emacs none
knot none
knot-resolver started root /Library/LaunchDaemons/homebrew.mxcl.knot-resolver.plist
stubby none
tor none
unbound none
$ sudo brew services 14:24:54
Password:
Name Status User File
dbus none
emacs none
knot none
knot-resolver error 6 root /Library/LaunchDaemons/homebrew.mxcl.knot-resolver.plist
stubby none
tor none
unbound none
```
This is the config file I am using. DNSSEC is disabled because nextdns already validates it for me, and it used to create random SERVFAILs.
```
-- Network interface configuration
net.listen('127.0.0.1', 53, { kind = 'dns' })
--net.listen('127.0.0.1', 853, { kind = 'tls' })
--net.listen('127.0.0.1', 443, { kind = 'doh2' })
--net.listen('::1', 53, { kind = 'dns', freebind = true })
--net.listen('::1', 853, { kind = 'tls', freebind = true })
--net.listen('::1', 443, { kind = 'doh2' })
-- Load useful modules
modules = {
'hints > iterate', -- Allow loading /etc/hosts or custom root hints
'stats', -- Track internal statistics
'predict', -- Prefetch expiring/frequent records
}
log_level('err')
policy.add(policy.all(policy.TLS_FORWARD({
{'45.90.28.0', hostname='<removed>.dns1.nextdns.io'},
{'2a07:a8c0::', hostname='<removed>.dns1.nextdns.io'},
{'45.90.30.0', hostname='<removed>.dns2.nextdns.io'},
{'2a07:a8c1::', hostname='<removed>.dns2.nextdns.io'}
})))
trust_anchors.remove('.')
-- Cache size
cache.size = 100 * MB
```
My log file only shows:
```
...
[system] Knot Resolver is tested on Linux, other platforms might exhibit bugs.
Please report issues to https://gitlab.nic.cz/knot/knot-resolver/issues/
Thank you for your time and interest!
[system] Knot Resolver is tested on Linux, other platforms might exhibit bugs.
Please report issues to https://gitlab.nic.cz/knot/knot-resolver/issues/
Thank you for your time and interest!
[system] Knot Resolver is tested on Linux, other platforms might exhibit bugs.
Please report issues to https://gitlab.nic.cz/knot/knot-resolver/issues/
Thank you for your time and interest!
```
I had already changed the log level to debug once too, but it seems that it crashes so hard that it doesn't get a chance to write anything to the log.
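For reference, a minimal config sketch for capturing more detail, assuming kresd >= 5.4 where `log_level()` and `log_target()` exist; whether anything actually gets flushed before a hard crash is uncertain:

```lua
-- Debugging sketch only; much noisier than log_level('err') above.
log_level('debug')
-- Send logs to stderr so the launchd plist (or kresd run by hand in a
-- terminal) can capture them; 'syslog' is the other common target.
log_target('stderr')
```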
Any advice on how to get some better reporting going for this issue? It also surprises me that no one else has reported this issue after upgrading to 5.5.0. At first I assumed that my machine was at fault, but only a few days ago I set up a brand new machine (a MacBook with an ARM processor) and the crashing behaviour is the same.

Issue #731: knot resolver in docker in production
elandorr, updated 2022-03-24, https://gitlab.nic.cz/knot/knot-resolver/-/issues/731

Hello all,
I'm looking to run the resolver in prod docker. The authoritative knot works well in docker - unfortunately I saw you deprecated forking for the resolver.
Your docker image is just for testing purposes as you say and does not include garbage collection or a watchdog replacement.
Would you please let us know the best practices for accomplishing this?
I suppose we need:
- multiple processes for prod (as far as I can tell, in dockerland they're meant to be spawned and handled by the parent, but kresd is separate)
- therefore supervisord (Which is not all that nice, as the auth. knotd is relatively lightweight and handles itself without any additional bloat. I strive to keep things as light as possible, and wouldn't want to start creating frankensteins for the resolver either, if at all avoidable.)
- a way to run kres-cache-gc automatically from inside the container (not externally pushed as that'd be a bit of a 'hackjob')
A standard, 'official' solution would benefit many. We'd appreciate your input!
I tried to research other people's solutions, but it looks as though nobody has published about it yet. Everyone just keeps using unbound, especially in docker. I'd really like to give kresd a try as knotd is great! It even seems to be smaller than unbound.
Have a great evening!

Issue #720: Control sockets on relative paths fail
Vaclav Sraier, updated 2022-02-06, https://gitlab.nic.cz/knot/knot-resolver/-/issues/720

With this config:
```
local path = '/tmp/control/1'
local ok, err = pcall(net.listen, path, nil, { kind = 'control' })
if not ok then
  log_warn(ffi.C.LOG_GRP_NETWORK, 'bind to '..path..' failed '..err)
end
```
everything works perfectly.
This config though:
```
local path = './control/1'
local ok, err = pcall(net.listen, path, nil, { kind = 'control' })
if not ok then
  log_warn(ffi.C.LOG_GRP_NETWORK, 'bind to '..path..' failed '..err)
end
```
Fails with this error message:
```
Feb 05 23:03:41 dingo kresd[169462]: [net ] bind to './control/1@53' (TCP): Invalid argument
Feb 05 23:03:41 dingo kresd[169462]: [net ] bind to ./control/1 failed error occurred here (config filename:lineno is at the bottom, if config is involved):
Feb 05 23:03:41 dingo kresd[169462]: stack traceback:
Feb 05 23:03:41 dingo kresd[169462]: [C]: at 0x556c94d0eae0
Feb 05 23:03:41 dingo kresd[169462]: [C]: in function 'pcall'
Feb 05 23:03:41 dingo kresd[169462]: kresd_1.conf:144: in main chunk
Feb 05 23:03:41 dingo kresd[169462]: ERROR: net.listen() failed to bind
```
It looks like the `kind` argument is completely ignored and defaults are assumed (UDP + TCP on port 53).
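Until the underlying bug is fixed, a possible workaround sketch is to turn the relative path into an absolute one before calling `net.listen()`; this assumes `$PWD` reflects the process's working directory and is only illustrative:

```lua
-- Workaround sketch: control sockets on absolute paths reportedly work, so
-- resolve the relative path first. Relying on $PWD is an assumption about
-- the environment, not a guaranteed fix.
local path = './control/1'
if not path:match('^/') then
  path = (os.getenv('PWD') or '.') .. '/' .. path:gsub('^%./', '')
end
local ok, err = pcall(net.listen, path, nil, { kind = 'control' })
if not ok then
  log_warn(ffi.C.LOG_GRP_NETWORK, 'bind to ' .. path .. ' failed: ' .. tostring(err))
end
```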
EDIT: Tested on `a2c339a57b8a6fb1c6bbaa83ed4bfdbe742a5fd0` (HEAD of the `manager` branch).

Issue #719: Resolver returns SERVFAIL until restarted
Jan Baier, updated 2023-09-28, https://gitlab.nic.cz/knot/knot-resolver/-/issues/719

I am using knot-resolver 5.4.4-cznic.1 on Debian 10. After some (rather long) time, the resolver starts to return SERVFAIL for some records (those secured by DNSSEC).
From what I was able to find, I believe I stumbled upon a bug which might be related to the following issues:
* https://gitlab.nic.cz/knot/knot-resolver/-/issues/423
* https://gitlab.nic.cz/knot/knot-resolver/-/issues/493
It can be remediated quickly just by restarting the kresd service, which makes me wonder whether this is an issue in the resolver or rather in the Debian packaging (missing some restart hooks?).
From the log (full log attached) I can see:
1. There are several failed attempts to refresh trust anchors
`[taupd ] active refresh failed for . with rcode: 2`
2. After a few days (when the cache expires?) the problem starts to manifest itself and the resolver starts to respond with SERVFAIL
```
[plan ][00000.00] plan 'haproxy.luffy.cx.' type 'A' uid [17896.00]
[iterat][17896.00] 'haproxy.luffy.cx.' type 'A' new uid was assigned .01, parent uid .00
[cache ][17896.01] => skipping exact RR: rank 060 (min. 030), new TTL -155800
[cache ][17896.01] => skipping unfit NS RR: rank 002, new TTL -76600
[cache ][17896.01] => skipping unfit NS RR: rank 002, new TTL -81800
[cache ][17896.01] => trying zone: ., NSEC, hash 0
[cache ][17896.01] => NSEC sname: range search miss (!covers)
[cache ][17896.01] => skipping zone: ., NSEC, hash 0;new TTL -123456789, ret -2
[zoncut][17896.01] found cut: . (rank 060 return codes: DS -2, DNSKEY -116)
[resolv][17896.01] >< TA: '.'
[plan ][17896.01] plan '.' type 'DNSKEY' uid [17896.02]
[iterat][17896.02] '.' type 'DNSKEY' new uid was assigned .03, parent uid .01
[cache ][17896.03] => skipping exact RR: rank 060 (min. 030), new TTL -5783
[cache ][17896.03] => trying zone: ., NSEC, hash 0
[cache ][17896.03] => NSEC sname: match but failed type check
[cache ][17896.03] => skipping zone: ., NSEC, hash 0;new TTL -123456789, ret -2
[select][00000.00] NO6: is KO [exploit]
[select][17896.03] => id: '28780' choosing: 'i.root-servers.net.'@'2001:7fe::53#00053' with timeout 10000 ms zone cut: '.'
[resolv][17896.03] => id: '28780' querying: 'i.root-servers.net.'@'2001:7fe::53#00053' zone cut: '.' qname: '.' qtype: 'DNSKEY' proto: 'tcp'
[worker][17896.03] => connecting to: '2001:7fe::53#00053'
[select][17896.03] NO6: timed out, but bad already
[select][17896.03] => id: '28780' noting selection error: 'i.root-servers.net.'@'2001:7fe::53#00053' zone cut: '.' error: 3 TCP_CONNECT_FAILED
[iterat][17896.03] '.' type 'DNSKEY' new uid was assigned .04, parent uid .01
[select][00000.00] NO6: is KO [exploit]
[select][17896.04] => id: '17180' choosing: 'm.root-servers.net.'@'2001:dc3::35#00053' with timeout 10000 ms zone cut: '.'
[resolv][17896.04] => id: '17180' querying: 'm.root-servers.net.'@'2001:dc3::35#00053' zone cut: '.' qname: '.' qtype: 'DNSKEY' proto: 'udp'
```
3. After restarting the service via `systemctl restart kresd@1` the problem instantly disappears
It seems to me like the resolver lost all root servers and needs a restart to reload them. Also, it might be good to mention that there is no IPv6 connectivity on the machine running the resolver.
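Not a fix for the underlying bug, but since the log shows IPv6 ('NO6') selection errors on a host without IPv6 connectivity, disabling IPv6 for outgoing queries is a documented option that may at least take the failing path out of the picture; a one-line sketch:

```lua
-- Sketch: stop kresd from contacting upstream servers over IPv6, since this
-- host has no IPv6 connectivity (see the 'NO6' lines in the log above).
net.ipv6 = false
```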
I am not really sure how to reproduce this without waiting for a couple of days or weeks. This time, the issue appeared after 23 days.
Full log: [kresd.log](/uploads/0578a16d083ff60b5280a77ce4b99cfe/kresd.log)

Issue #715: integration of manager into kresd
Tomas Krizek, updated 2023-09-28, https://gitlab.nic.cz/knot/knot-resolver/-/issues/715

Let this issue be a checklist of requirements/ideas that need to be done before we're ready to merge manager into master. Feel free to edit the description and add your TODOs as well.
### Requirements
- [ ] config: verify that all values in the datamodel jinja2 templates are either (a) escaped or (b) validated before use (to prevent code injection from declarative values to lua) [goal: security - API should not be abusable] (related !1291)
- [ ] config: ensure all recently added lua configuration options have been added to declarative config as well (e.g. go through NEWS file and check) and make sure it won't be a problem in future.
- [x] new config for kresd < 5.5.0 !1289
- [ ] new config for kresd >= 5.5.0
- [x] new declarative policy module !1313
- [ ] config: update our default/example [configs](https://gitlab.nic.cz/knot/knot-resolver/-/tree/master/etc/config)
- [x] packaging: ensure all manager's dependencies have been properly added in `distro/pkg` (related !1248)
- [x] packaging: cover the most basic use-cases by packaging tests executed on all target distros (related #713)
- [ ] tests: manually test migration path on all target distros
- [x] usability: prepare [systemd files](https://gitlab.nic.cz/knot/knot-resolver/-/tree/master/systemd) for manager
- [x] usability: figure out how to support declarative config on unsupported platforms (CentOS7) and in our [docker image](https://gitlab.nic.cz/knot/knot-resolver/-/blob/master/Dockerfile) (related #734)
- [x] usability: ensure that the manager is applicable to ODVR usecase (separate workers/instances for each DNS protocol)
- [x] docs: document new way of using kresd with manager, including systemd interaction, quick start guide, declarative config docs, how to get logs etc.
### Suggestions
- [ ] tests: comprehensive unit tests of configuration: prepare a collection of example declarative configs and their lua counterparts; use CLI conversion tool to verify these
- [ ] logging: ensure logs from manager look consistent with kresd logs
- [x] logging: try to find a way to display aggregated log output
- [x] usability: support supervisord for containers
- [ ] usability: keep manager component optional (minimal use-case: only run config conversion, but use current kresd@1 approach)
- [ ] blog: blogpost(s) about the manager, comparison with `kresd@`, benefits, examples

Milestone: 6.0.0

Issue #714: meson_version needs increasing
daurnimator, updated 2023-09-28, https://gitlab.nic.cz/knot/knot-resolver/-/issues/714

The following warning appears when building:
```
Build targets in project: 31
WARNING: Project specifies a minimum meson_version '>=0.49' but uses features which were added in newer versions:
* 0.52.0: {'priority arg in test'}
NOTICE: Future-deprecated features used:
* 0.56.0: {'Dependency.get_pkgconfig_variable'}
```

Issue #709: datamodel: network: more readable 'kind' in listen interfaces
Aleš Mrázek, updated 2023-09-28, https://gitlab.nic.cz/knot/knot-resolver/-/issues/709

- `dns-over-https` -> `doh`
- `dns-over-tls` -> `dot`
Issue #707: Add integration test with some complex configuration
Vaclav Sraier, updated 2023-09-28, https://gitlab.nic.cz/knot/knot-resolver/-/issues/707

For example, try to translate the configuration from ODVR and see if it works. The ODVR configuration can be found in the discussion of issue knot-resolver-manager#38.

Issue #704: Add tests for all quick start configuration snippets in kresd documentation
Vaclav Sraier, updated 2023-09-28, assignee: Aleš Mrázek, https://gitlab.nic.cz/knot/knot-resolver/-/issues/704

https://knot-resolver.readthedocs.io/en/stable/modules-policy.html

Issue #691: How to use static hints for local PTR records?
Jon Polom, updated 2021-12-25, https://gitlab.nic.cz/knot/knot-resolver/-/issues/691

Is it possible to use the static hints module to provide local PTR records? This is [hinted at](https://knot-resolver.readthedocs.io/en/stable/modules-hints.html#static-hints) in the documentation, however no example is provided. Perhaps I am misinterpreting what is possible with kresd, so if that is the case, please clarify.
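For reference, a possible sketch based on the linked hints documentation; whether the reverse (PTR) record is generated automatically may depend on the kresd version, so treat the names and addresses below as illustrative rather than a confirmed answer:

```lua
-- Sketch: load the hints module and add a local name/address pair; per the
-- linked docs the module also serves reverse (PTR) lookups for hinted addresses.
modules = { 'hints > iterate' }
hints.set('printer.lan 192.168.1.40')
-- Alternatively, load a whole hosts-format file of local entries:
-- hints.add_hosts('/etc/knot-resolver/local-hints.hosts')
```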
Issue #700: kresd process manager: decouple restarts from config change requests
Vaclav Sraier, updated 2022-11-19, https://gitlab.nic.cz/knot/knot-resolver/-/issues/700

- goals:
  - increase throughput for config changes
- limitations:
  - we can't make a config change faster as we have to restart everything
- proposed solution:
  - keep track of config versions and restart `kresd`s continuously, decoupled from requests. Mark a request as finished when the config version of all `kresd`s is higher.

Issue #687: serve_stale module doesn't provide stale answers when auths are unresponsive
Tomas Krizek, updated 2022-03-09, https://gitlab.nic.cz/knot/knot-resolver/-/issues/687

As of version 5.4.2, the `serve_stale` module doesn't work when auth servers are unresponsive (which is the typical case with network issues). The server selection algorithm tries very hard to resolve the request by re-trying different auth servers and increasing their allowed timeouts, until the request ultimately times out and returns SERVFAIL instead of a stale answer.
If the auth servers are reachable but REFUSE to respond, the serve_stale module works as expected (that was our former test case with deckard).
Some notes about possible resolution:
- to be useful for clients, the stale answer should be provided quickly enough ([RFC 8767.5](https://datatracker.ietf.org/doc/html/rfc8767#section-5) suggests sending stale answer after 1.8s). The timeout used for serve_stale should ideally be configurable.
- the request resolution should keep going even after the stale answer is sent to the client, to refresh data from slower auth servers (possible option: spawn a new duplicate internal request after providing the stale answer?)
- server selection should have a configurable time limit that is respected and allows serve_stale to activate in time
- the server selection time limit shouldn't be used unless the serve_stale module is loaded _and_ there is a possible stale answer in the cache
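For context, enabling the module today is just the documented one-liner below; none of the timeout behaviour discussed in the notes above is configurable from there, which is part of the problem:

```lua
-- Current documented way to enable stale answers; the module hooks in before
-- the cache layer. There is no knob here for the selection time limit.
modules = { 'serve_stale < cache' }
```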
Issue #686: Please document SOA included in authority section for queries within local (and how to avoid it)
Sergio Callegari, updated 2021-11-13, https://gitlab.nic.cz/knot/knot-resolver/-/issues/686

As mentioned in https://forum.turris.cz/t/avahi-local-domain-warning-on-ubuntu/13437, Knot Resolver answers any queries within local with NXDOMAIN, but it adds this SOA in the authority section:
```
$ dig local
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 56352
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; QUESTION SECTION:
;local. IN A
;; AUTHORITY SECTION:
local. 10800 IN SOA local. nobody.invalid. 1 3600 1200 604800 10800
;; ADDITIONAL SECTION:
explanation.invalid. 10800 IN TXT “Blocking is mandated by standards, see references on https://www.iana.org/assignments/special-use-domain-names/special-use-domain-names.xhtml”
```
Unfortunately, this confuses `systemd-resolved` (maybe just older versions of it) and completely breaks mDNS name resolution on ubuntu focal (and possibly other distros).
What happens is as follows:
1. You do something like `ping foo.local`
2. Ubuntu focal has by default the host field in nsswitch.conf set to:
`hosts: files mdns4_minimal [NOTFOUND=return] dns`
so it tries the `/etc/hosts` file and then mDNS via the nss `mdns4_minimal` client
3. The `mdns4_minimal` client before doing anything else tries unicast DNS looking for a SOA for `local.` This mechanism is
present in the `mdns4_minimal` client to avoid issues when `local` is under DNS control and is documented at
https://github.com/lathiat/nss-mdns/blob/master/README.md
4. Ubuntu focal uses by default `systemd-resolved` as a caching DNS, so the query from `mdns4_minimal` gets to it
5. `systemd-resolved` passes the query to the DNS it is configured to use. If this is Knot resolver it gets that special SOA
in the authority section and turns it into a regular SOA reply (no NXDOMAIN)
6. `mdns4_minimal` receives a SOA reply for local and gives up
7. At this point DNS is queried. Back to `systemd-resolved` now trying to get the A field for `foo.local`.
8. By default `systemd-resolved` on ubuntu is configured not to do mDNS itself (even if it has this capability). Hence the
query at the previous point fails.
9. Rather than pinging foo.local you get an error.
I believe that:
- This is not a bug in Knot Resolver, but rather a bug in `systemd-resolved` that gets confused by a legitimate answer from Knot Resolver
- The issue in `systemd-resolved` may have been fixed in versions of systemd more recent than the one shipped in Ubuntu focal
(at least some quick testing on a rolling distro seems not to give the problem)
However, because:
1. Ubuntu Focal is extremely widespread
2. Ubuntu Focal is unlikely to backport fixes to its `systemd-resolved` (because this is shipped in the `systemd` package, which is quite delicate to touch)
3. The returning of the special SOA for things within `local` is something that older versions of knot resolver did not do
I believe that it could be worth adding an explicit note in the knot resolver documentation about the special SOA returned for queries within `local` and on how to avoid it in case it causes issues with mDNS name resolution.
I have observed that something like
```
policy.add(policy.suffix(policy.DROP, policy.todnames({'local.'})))
```
added to `kresd.conf` seems to be enough to work around the problem, but I am not knowledgeable enough to know if this is the right solution.

Issue #684: ANSWER section not empty on SERVFAIL
Tomas Krizek, updated 2021-11-04, https://gitlab.nic.cz/knot/knot-resolver/-/issues/684

In some cases, the ANSWER section contains (unvalidated) data while the request ends with SERVFAIL.
In my specific conditions, the issue seems reproducible when:
- cache is clear
- IPv6 isn't available, but isn't turned off with net.ipv6
- server selection chooses specific servers (and typically chooses the non-functioning IPv6 ones)
```
$ kdig @::1 -p 5553 +timeout=16 +edns signotincepted.bad-dnssec.wb.sidnlabs.nl
;; ->>HEADER<<- opcode: QUERY; status: SERVFAIL; id: 6998
;; Flags: qr rd ra; QUERY: 1; ANSWER: 1; AUTHORITY: 0; ADDITIONAL: 1
;; EDNS PSEUDOSECTION:
;; Version: 0; flags: ; UDP size: 1232 B; ext-rcode: NOERROR
;; QUESTION SECTION:
;; signotincepted.bad-dnssec.wb.sidnlabs.nl. IN A
;; ANSWER SECTION:
signotincepted.bad-dnssec.wb.sidnlabs.nl. 3600 IN A 94.198.159.39
;; Received 85 B
;; Time 2021-11-04 10:45:32 CET
;; From ::1@5553(UDP) in 10027.7 ms
```
See attached [log.txt](/uploads/8d1aa54458e26860a5d0f4e36d105cad/log.txt)

Issue #683: performance problem because of shared cache
Hamza Kılıç, updated 2021-10-26, https://gitlab.nic.cz/knot/knot-resolver/-/issues/683

I am making benchmarks for a project and am sending 10M queries to the resolvers under test.
- Every test starts with a cold start.
- Opening 8 processes.
- Measuring %core, pps, elapsed milliseconds, and download Mbps.
I found an interesting result.
Opening 8 processes with a shared cache in the same folder (/var/cache/knot-resolver) vs 8 processes with different cache folders,
the results look like these values (approximately):
- each core (all 8 cores)
  - %60 - %99
- pps
  - 20000 - 30000
- elapsed milliseconds
  - 500 - 300
- download Mbps
  - 100 - 160
Conclusion: using a shared cache slows down performance dramatically.
Is there a way to fix this problem?
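For anyone reproducing the comparison, a minimal sketch of the per-instance cache layout described above, with every kresd process pointed at its own LMDB directory; the paths, size, and instance-id source are illustrative, and each directory must already exist and be writable:

```lua
-- Sketch: give each kresd instance its own cache directory instead of the
-- shared default. SYSTEMD_INSTANCE is assumed to come from the kresd@N
-- systemd template; adjust the id source to however the processes are started.
local id = os.getenv('SYSTEMD_INSTANCE') or '1'
cache.open(100 * MB, 'lmdb:///var/cache/knot-resolver/' .. id)
```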
Issue #679: DNSSEC failure on insecure subzone
Tomas Krizek, updated 2021-10-23, https://gitlab.nic.cz/knot/knot-resolver/-/issues/679

Reported on [knot-resolver-users](https://lists.nic.cz/pipermail/knot-resolver-users/2021/000396.html) by Matthew Richardson.
Attempting to resolve `213-133-203-34.newtel.in-addr.itconsult.net. PTR` ends up with a DNSSEC failure, even though the record itself is in an insecure subzone.
> The zone cut is between itconsult.net & newtel.in-addr.itconsult.net.
> Also whilst itconsult.net is DNSSEC signed, newtel.in-addr.itconsult.net is
> not. Thus, in-addr.itconsult.net is an empty non-terminal.
>
> If one asks for NS for newtel.in-addr.itconsult.net, thereafter resolution
> of the PTR then succeeds
```
[plan ][00000.00] plan '213-133-203-34.newtel.in-addr.itconsult.net.' type 'PTR' uid [51359.00]
[iterat][51359.00] '213-133-203-34.newtel.in-addr.itconsult.net.' type 'PTR' new uid was assigned .01, parent uid .00
[cache ][51359.01] => skipping exact RR: rank 027 (min. 030), new TTL 43131
[cache ][51359.01] => trying zone: itconsult.net., NSEC3, hash c75d4f37
[cache ][51359.01] => NSEC3 depth 3: hash uabfrhboj2pe1qnmfscd0adr77hqoirb
[cache ][51359.01] => NSEC3 encloser error for 213-133-203-34.newtel.in-addr.itconsult.net.: range search miss (!covers)
[cache ][51359.01] => NSEC3 depth 2: hash 7kdfmdhll7ee02vprj1oivl33lg5r7vu
[cache ][51359.01] => NSEC3 encloser error for newtel.in-addr.itconsult.net.: range search miss (!covers)
[cache ][51359.01] => NSEC3 depth 1: hash 4je672clu0jh2pbkm6mdj2n4ps7e9t2h
[cache ][51359.01] => NSEC3 encloser: only found existence of an ancestor
[cache ][51359.01] => skipping zone: itconsult.net., NSEC, hash 0;new TTL -123456789, ret -2
[zoncut][51359.01] found cut: itconsult.net. (rank 002 return codes: DS 0, DNSKEY 0)
[select][51359.01] => id: '47786' choosing: 'd.itconsult-dns.co.uk.'@'2001:67c:10b8::100#00053' with timeout 400 ms zone cut: 'itconsult.net.'
[resolv][51359.01] => id: '47786' querying: 'd.itconsult-dns.co.uk.'@'2001:67c:10b8::100#00053' zone cut: 'itconsult.net.' qname: 'iN-ADDR.iTConSult.neT.' qtype: 'NS' proto: 'udp'
[select][51359.01] NO6: timeouted, appended, timeouts 5/6
[select][51359.01] => id: '47786' noting selection error: 'd.itconsult-dns.co.uk.'@'2001:67c:10b8::100#00053' zone cut: 'itconsult.net.' error: 1 QUERY_TIMEOUT
[iterat][51359.01] '213-133-203-34.newtel.in-addr.itconsult.net.' type 'PTR' new uid was assigned .02, parent uid .00
[select][51359.02] => id: '56910' choosing: 'd.itconsult-dns.co.uk.'@'176.97.158.100#00053' with timeout 38 ms zone cut: 'itconsult.net.'
[resolv][51359.02] => id: '56910' querying: 'd.itconsult-dns.co.uk.'@'176.97.158.100#00053' zone cut: 'itconsult.net.' qname: 'in-aDdR.itCONsuLt.neT.' qtype: 'NS' proto: 'udp'
[select][51359.02] => id: '56910' updating: 'd.itconsult-dns.co.uk.'@'176.97.158.100#00053' zone cut: 'itconsult.net.' with rtt 18 to srtt: 18 and variance: 4
[iterat][51359.02] <= rcode: NOERROR
[iterat][51359.02] <= retrying with non-minimized name
[iterat][51359.02] '213-133-203-34.newtel.in-addr.itconsult.net.' type 'PTR' new uid was assigned .03, parent uid .00
[select][51359.03] => id: '18773' choosing: 'd.itconsult-dns.co.uk.'@'176.97.158.100#00053' with timeout 38 ms zone cut: 'itconsult.net.'
[resolv][51359.03] => id: '18773' querying: 'd.itconsult-dns.co.uk.'@'176.97.158.100#00053' zone cut: 'itconsult.net.' qname: '213-133-203-34.nEWtEL.IN-AdDr.ITcONsuLt.NEt.' qtype: 'PTR' proto: 'udp'
[select][51359.03] => id: '18773' updating: 'd.itconsult-dns.co.uk.'@'176.97.158.100#00053' zone cut: 'itconsult.net.' with rtt 16 to srtt: 18 and variance: 4
[iterat][51359.03] <= rcode: NOERROR
[valdtr][51359.03] >< cut changed, needs revalidation
[resolv][51359.03] => resuming yielded answer
[valdtr][51359.03] >< no valid RRSIGs found: 213-133-203-34.newtel.in-addr.itconsult.net. PTR (0 matching RRSIGs, 0 expired, 0 not yet valid, 0 invalid signer, 0 invalid label count, 0 invalid key, 0 invalid crypto, 0 invalid NSEC)
[plan ][51359.03] plan 'in-addr.itconsult.net.' type 'DS' uid [51359.04]
[iterat][51359.04] 'in-addr.itconsult.net.' type 'DS' new uid was assigned .05, parent uid .03
[cache ][51359.05] => trying zone: itconsult.net., NSEC3, hash c75d4f37
[cache ][51359.05] => NSEC3 depth 1: hash 4je672clu0jh2pbkm6mdj2n4ps7e9t2h
[cache ][51359.05] => NSEC3 sname: match proved NODATA, new TTL 43131
[iterat][51359.05] <= rcode: NOERROR
[valdtr][51359.05] <= parent: updating DS
[valdtr][51359.05] <= answer valid, OK
[resolv][51359.03] => resuming yielded answer
[valdtr][51359.03] >< no valid RRSIGs found: 213-133-203-34.newtel.in-addr.itconsult.net. PTR (0 matching RRSIGs, 0 expired, 0 not yet valid, 0 invalid signer, 0 invalid label count, 0 invalid key, 0 invalid crypto, 0 invalid NSEC)
[plan ][51359.03] plan 'in-addr.itconsult.net.' type 'DS' uid [51359.06]
[iterat][51359.06] 'in-addr.itconsult.net.' type 'DS' new uid was assigned .07, parent uid .03
[cache ][51359.07] => trying zone: itconsult.net., NSEC3, hash c75d4f37
[cache ][51359.07] => NSEC3 depth 1: hash 4je672clu0jh2pbkm6mdj2n4ps7e9t2h
[cache ][51359.07] => NSEC3 sname: match proved NODATA, new TTL 43131
[iterat][51359.07] <= rcode: NOERROR
[valdtr][51359.07] <= parent: updating DS
[valdtr][51359.07] <= answer valid, OK
[resolv][51359.03] => resuming yielded answer
[valdtr][51359.03] >< no valid RRSIGs found: 213-133-203-34.newtel.in-addr.itconsult.net. PTR (0 matching RRSIGs, 0 expired, 0 not yet valid, 0 invalid signer, 0 invalid label count, 0 invalid key, 0 invalid crypto, 0 invalid NSEC)
[valdtr][51359.03] <= continuous revalidation, fails
[cache ][51359.03] => not overwriting PTR 213-133-203-34.newtel.in-addr.itconsult.net.
[cache ][51359.03] => not overwriting PTR 213-133-203-34.newtel.in-addr.itconsult.net.
[dnssec] validation failure: 213-133-203-34.newtel.in-addr.itconsult.net. PTR
[resolv][51359.00] request failed, answering with empty SERVFAIL
[resolv][51359.03] finished in state: 8, queries: 2, mempool: 32800 B
```