Commits · 3a6c03b4108b28a1788253055497e3ceeeb30766 · Knot projects / Knot Resolver

Sep 09, 2020
- Merge !1026 : net: split the EDNS buffer size into upstream and downstream · 3a6c03b4
  Vladimír Čunát authored 4 years ago
  
  3a6c03b4
- net: split the EDNS buffer size into upstream and downstream · 1e32627c
  Vladimír Čunát authored 4 years ago
```
(Tiny nitpicks addressed by vcunat.)
```
  Verified
  
  1e32627c
- Merge !1055 : workarounds: remove *.in-addr.arpa.net NO_0X20 workarounds · 88a4575a
  Vladimír Čunát authored 4 years ago
  
  88a4575a
- workarounds: remove *.in-addr.arpa.net NO_0X20 workarounds · 967976a6
  Štěpán Balážik authored 4 years ago and Vladimír Čunát committed 4 years ago
```
The Internet has changed, turktel ones are fixed, edatel one does not
work at all.
```
  967976a6
Sep 08, 2020
- Merge branch 'release-5-1-3' into 'master' · 0c8d0bd4
  Tomas Krizek authored 4 years ago
```
release 5.1.3

See merge request !1059
```
  v5.1.3
  
  0c8d0bd4
- release 5.1.3 · 07a29f36
  Tomas Krizek authored 4 years ago
  
  Verified
  
  07a29f36
- Merge branch 'cache-forking' into 'master' · 76505e91
  Tomas Krizek authored 4 years ago
```
cache-forking fixes

See merge request !1042
```
  76505e91
- gc: NEWS, last fix for v5.1.3 · 68b822f4
  Petr Špaček authored 4 years ago
  
  Verified
  
  68b822f4
Sep 07, 2020

cache: fix race in assert_right_version · f986f462

Petr Špaček authored 4 years ago

This change fixes a race condition in assert_right_version(). The racy
situation:
- Two instances have the (empty) cache open: a new binary and an old binary.
- The new binary executes count() inside assert_right_version(), which
  internally starts an RO transaction. The returned count is 0.
- The old binary does some writes (an RW transaction parallel to the RO one
  in the first process).
- The new binary skips the cache clear because the cache was empty at the
  time of the check.
- Result: the old binary wrote data in the old format into a cache which
  was not cleared and whose version number was silently changed to the new one.

This is not a complete fix because we lack a mechanism to detect a cache
format change at run-time, but at least it removes one nasty corner case and
the cost of this change seems to be minimal.

Verified

f986f462

lib/cache: switch .cachelock to fcntl() · b65e8977

Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

This gives us correctness, especially on "staleness" detection.
For simplicity we now don't remove "stale" .cachelock on opening cache,
but it doesn't obstruct us in any way (and overflow will remove it).
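
To illustrate why fcntl()-based locking helps with staleness: the kernel releases such a lock when the holder closes the file or exits, so a leftover lock file alone does not mean the lock is held. Below is a minimal Python sketch of that idea, not the resolver's actual C code; the helper name `try_lock` and the temporary lock path are assumptions made for illustration.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive advisory lock on `path` without blocking.

    Returns the open file descriptor on success, or None if another
    process currently holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Python's lockf() is implemented on top of fcntl() record locks.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


# Hypothetical lock-file location, for illustration only.
lock_path = os.path.join(tempfile.mkdtemp(), ".cachelock")
fd = try_lock(lock_path)
```

Closing the descriptor drops the lock while the file itself stays on disk, and a later `try_lock()` on the same path still succeeds.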

Verified

b65e8977

lib/cache: tweaks around transactions · e8ca6a70

Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

- The switched order is documented not to make a difference,
  but it seems much clearer this way.
- MDB_TXN_FULL wasn't handled correctly (a reversed condition)
  and current LMDB code indicates that such a transaction is
  not recoverable anyway... so we give up on trying.

Verified

e8ca6a70

lib/cache: avoid printing relative paths to cache · b1b57540
Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

Verified

b1b57540
lib/cache: improve debugging prints · 45e90fb9
Petr Špaček authored 4 years ago
```
(This has shared authorship, basically, mostly from MR suggestions.)
```
Verified

45e90fb9

cache, GC: improve handling of LMDB maxsize · 7799595d

Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

This version seems to work OK.  Unfortunately we had to resort to
an extra write and cache reopening when attempting to set cache size.
And even so, decreasing the size can't really be done, so we only warn
about failing to do that.

Verified

7799595d

gc: print cache usage in every cycle if in verbose mode · 26382fde
Petr Špaček authored 4 years ago

Verified

26382fde
gc: verbose mode is now a runtime option · 6e927fc7
Petr Špaček authored 4 years ago

Verified

6e927fc7
utils/cache_gc nitpick: more precise error prints · 64b22c32
Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

Verified

64b22c32
utils/cache_gc nitpick: print time in milliseconds · 27cb7fe7
Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago
```
For the usual use cases, whole milliseconds seem to make more sense
than seconds with 10ms precision.
```
Verified

27cb7fe7
utils/cache_gc: comments and cleanup in kr_cache_gc() · 0a9be4e8
Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

Verified

0a9be4e8

tests: fine tune integration test for GC · b983c77d

Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

TL;DR: tune the test - now it works quite reliably for me,
though it's perhaps not nice.

With a 1 MiB cache it's not easy to avoid overflows, as the defaults are
meant for much larger sizes.  The normal GC target is to decrease usage
by 10% when above 80%, at 100 records per transaction.  That just won't
work reliably due to 10% being only 25 pages.

This commit makes the test run GC with more suitable tuning and
frequently pauses kresd to give GC a better chance to catch up.
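
The "only 25 pages" figure can be reproduced with a short calculation. This is a sketch assuming a 4 KiB page size (a typical value, not stated in the commit message); the variable names are illustrative only.

```python
CACHE_SIZE = 1 * 1024 * 1024  # 1 MiB cache, as used by the test
PAGE_SIZE = 4096              # assumption: typical 4 KiB page

total_pages = CACHE_SIZE // PAGE_SIZE      # pages in the whole cache
trigger_pages = total_pages * 80 // 100    # usage level at which GC starts
freed_pages = total_pages * 10 // 100      # pages a 10% reduction frees

print(total_pages, trigger_pages, freed_pages)  # prints "256 204 25"
```

Under that assumption, the GC has only a couple of dozen pages of headroom to work with per cycle, which is easy for a busy resolver to fill again before the next run.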

Verified

b983c77d

tests: integration test for GC · f5cbc5a0
Petr Špaček authored 4 years ago
```
GC should prevent cache from overflowing.
```
Verified

f5cbc5a0
tests: integration test for cache overflow situation · d09085d3
Petr Špaček authored 4 years ago
```
Resolvers must answer queries even if the shared cache overflowed during query processing.
```
Verified

d09085d3
lib/cache: run check_health() every five seconds · 7dc087e7
Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago
```
... in case of usage from kresd (GC does it a bit differently).
```
Verified

7dc087e7
lib/cache check_health(): also detect size changes · cd845d5f
Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago
```
This is important for GC - otherwise the usage computation would be
wrong after another process changed size (without replacing the file).
```
Verified

cd845d5f

lib/cache: abort() if emergency cache-clear fails · 532865cb

Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

As the code has been so far, there's no usable cache in that case
and some code just can't handle that.  Up to now we were getting
SIGSEGV from inside LMDB on the next attempted operation.

We might consider loosening preallocation in that case or even
retrying after a short sleep.  Systemd's restart after hold-off
timeout has an effect similar to the short sleep.

Verified

532865cb

utils/cache_gc: tolerate ESPACE unless twice in a row · a98f8abd

Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

In the unlikely case that GC happens "too late", it could fail when
deleting, in which case it seems best to reopen the cache and try again,
as it will probably be deleted by a kresd instance by the next interval.
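
The "tolerate unless twice in a row" policy can be sketched as a small loop. This is an illustrative Python model, not the actual cache_gc code; `run_gc_loop`, `gc_once` and `reopen_cache` are hypothetical names, and a failed cycle is modeled as a `False` return value.

```python
def run_gc_loop(gc_once, reopen_cache, cycles):
    """Run up to `cycles` GC cycles, tolerating isolated failures.

    `gc_once()` returns True on success and False on an out-of-space
    style failure.  A single failure triggers `reopen_cache()` and a
    retry on the next cycle; two failures in a row stop the loop.
    Returns True if all cycles ran, False if the loop gave up.
    """
    previous_failed = False
    for _ in range(cycles):
        if gc_once():
            previous_failed = False
            continue
        if previous_failed:
            return False  # second failure in a row: give up
        previous_failed = True
        reopen_cache()    # reopen and try again next interval
    return True


# Simulated cycle outcomes: the last two failures are consecutive.
outcomes = iter([True, False, True, False, False, True])
reopens = []
finished = run_gc_loop(lambda: next(outcomes), lambda: reopens.append(1), 6)
```

In this run the loop reopens the cache twice and then gives up on the fifth cycle, so the sixth outcome is never consumed.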

Verified

a98f8abd

utils/cache_gc: avoid too long RO transactions · c651fbf2

Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

Until now the analyzing pass over the full DB was taking place
in a single RO transaction.  For an unknown reason this caused kresd
processes to get MDB_MAP_FULL from mdb_put(), even though clearly there
were plenty of free pages at that point.

Basic experiments show that 1k steps are OK and 10k steps are not.
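
The general idea of splitting one long read pass into short steps can be sketched as follows. This is a simplified Python model with a stand-in `FakeEnv` class (hypothetical, not LMDB's API); it resumes by list index, whereas a real database scan would more likely resume from the last key seen.

```python
class FakeEnv:
    """Stand-in for a database environment; counts read transactions."""

    def __init__(self, records):
        self.records = list(records)
        self.txn_count = 0

    def read_chunk(self, start, limit):
        """Model one short read transaction returning up to `limit` records."""
        self.txn_count += 1
        return self.records[start:start + limit]


def scan_in_steps(env, step):
    """Yield every record, using a fresh short read per `step` records."""
    pos = 0
    while True:
        chunk = env.read_chunk(pos, step)
        if not chunk:
            break
        yield from chunk
        pos += len(chunk)


env = FakeEnv(range(2500))
seen = list(scan_in_steps(env, 1000))
```

With 2500 records and a step of 1000, the scan uses four short reads (the last one comes back empty) instead of one long one.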

Verified

c651fbf2

utils/cache_gc: handle one more error · ed9951c0
Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago
```
though I've never seen it happening.
```
Verified

ed9951c0

lib/cache: abort transactions on errors · 5f81153b

Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

This apparently gets rid of the MDB_BAD_TXN failures that we were getting
when the cache overflows. Unfortunately the LMDB docs don't mention that
after an operation fails one should abort the corresponding transaction.

Verified

5f81153b

lib/cache nitpick: more consistent naming · a239187e
Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

Verified

a239187e
utils/cache_gc: utilize kr_cdb_api::check_health() · fd6d544a
Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago
```
Now it should keep working if the file has been replaced.
```
Verified

fd6d544a

WIP: lib/cache: factor out kr_cdb_api::check_health() · 383d8524

Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

FIXME: review, testing, etc.

A couple of functions got folded into cdb_open_env(), as the split was
complicating the situation (mainly around error handling).

Verified

383d8524

Merge branch 'upgrading' into 'master' · 1ef400ae
Petr Špaček authored 4 years ago
```
doc: upcoming changes

See merge request !1057
```
1ef400ae
doc: DNS Flag Day 2020 warning · 524dbbcf
Petr Špaček authored 4 years ago

Verified

524dbbcf
doc: DoH without TLS or over HTTP 1 is deprecated · 414d77d6
Petr Špaček authored 4 years ago

Verified

414d77d6
doc: new section in upgrading guide about upcoming changes · fc256e8c
Petr Špaček authored 4 years ago

Verified

fc256e8c
Merge branch 'luarocks-install-version' into 'master' · 10afcb0d
Petr Špaček authored 4 years ago
```
scripts, docs: specify lua version in `luarocks install`

Closes #601

See merge request !1052
```
10afcb0d

scripts, docs: specify lua version in `luarocks install` · 66b6352d

Vladimír Čunát authored 4 years ago and Petr Špaček committed 4 years ago

On some systems (e.g. Fedora) luarocks defaults to a different Lua version,
so the result would not be usable from kresd. I didn't touch scripts
for older distro versions (Debian < 10, Ubuntu < 20.04, CentOS 7).

Verified

66b6352d

Sep 01, 2020
- Merge branch 'libdnssec-3.0' into 'master' · 1c78497c
  Tomas Krizek authored 4 years ago
```
lib/dnssec: fix build against libdnssec 3.0

See merge request !1053
```
  1c78497c
- lib/dnssec: fix build against libdnssec 3.0 · 24c0f31b
  Vladimír Čunát authored 4 years ago
```
It hasn't been released yet, but this patch fixes build against
current Knot master already.
```
  Verified
  
  24c0f31b
