use monotonic time
Use monotonic time instead of real time. Real time is used only for DNSSEC validation, on-disk cache, and for module dnstap.
Function kr_now() use libuv function uv_now() which also save some system calls.
Closes: #247 (closed)
Merge request reports
Activity
The only thing I'm unsure of is TTL handling in cache. This MR changes it from "real time" to a hybrid of real and monotonic time.
As I find it impossible to have it always follow purely monotonic time, I would slightly lean to keep it purely on "real time", just as DNSSEC. One reason for that is that we currently do not re-check the DNSSEC stamp when loading data from cache, only relying on the cache stamp. But still, I can't see way to avoid all bad influence of (significantly) skewed real time on DNSSEC, so it only makes a difference in cases that are bad already...
Personally I like the hybrid approach because it prevents weird stuff from happening when admin moves time for some reason. Example: Image that "real time" on the machine is one hour ahead and causing issues with particular domain. Admin will fix this problem by moving time "back" so real time is now okay. Side-effect of this change would be that all cached records will stay in cache for +1 hour, and this might break some other stuff.
On daemon restart we could detect this and flush cache if (real time now < real time recorded in cache).
What do you think?
Flushing cache is certainly a good precaution whenever we do detect a larger time shift (a couple minutes, perhaps). What remains is how to detect it.
I wonder what the monotonic clock syscalls actually return, and whether it's system-wide consistent until reboot (e.g. number of seconds since boot).
If I remember correctly @vkriz wanted to store real time somewhere and use monotonic clock during lifetime of process. This would allow us to detect that time was rewound back before the time when process was started, which is likely insufficient resolution...
I feel that it is getting way too complicated and we might revert to real time for cache+DNSSEC and use monotonic time for everything else.
During runtime it's easy: store a pair of real and monotonic time taken at once, and periodically check if their difference is roughly the same.
Edited by Vladimír ČunátLet's do this:
- keep using real time for cache stamps, so that it simply works across reboots if the time doesn't skew;
- detect (significant) time skew during kresd runtime, and clear cache whenever it happens.
Detection: https://gitlab.labs.nic.cz/knot/knot-resolver/merge_requests/392#note_61756 – it's not perfect, as we might theoretically get preempted between syscalls for a longer time interval, but with some tolerance of a few seconds I expect that to be reliable enough. We might get more advanced and e.g. do a "sandwich" of three syscalls to detect the long-term preemption: monotonic, real, monotonic + check that the two monotonic don't differ much.
Doing this mix, we should explicitly annotate the meaning of
struct kr_query::timestamp
and maybe have both times cached in there, too...added 77 commits
- 69a3f307...9e6666a4 - 76 commits from branch
master
- ecfa670a - use monotonic time
- 69a3f307...9e6666a4 - 76 commits from branch
added 2 commits
- ecfa670a - use monotonic time
- 0c2f5f1b - module: detect discontinuous jumps in the system time
added 1 commit
- 7902829c - module: detect discontinuous jumps in the system time
- Resolved by Petr Špaček
- Resolved by Petr Špaček
- Resolved by Petr Špaček
- Resolved by Petr Špaček
- Resolved by Petr Špaček
- Resolved by Petr Špaček
assigned to @vkriz
added 16 commits
- 7902829c...b18846d9 - 14 commits from branch
master
- 18ed8e4c - use monotonic time
- 27d2270c - module: detect discontinuous jumps in the system time
- 7902829c...b18846d9 - 14 commits from branch
assigned to @pspacek
- Resolved by Petr Špaček
- Resolved by Petr Špaček
- Resolved by Petr Špaček
- Resolved by Petr Špaček
assigned to @vkriz
added bug label
added 9 commits
- 27d2270c...b4ba22ff - 7 commits from branch
master
- 934674cc - use monotonic time
- faff699e - module: detect discontinuous jumps in the system time
- 27d2270c...b4ba22ff - 7 commits from branch
assigned to @pspacek
enabled an automatic merge when the pipeline for 06bc10f3 succeeds
mentioned in commit 2939bd78
@vcunat Please open an issue about this do we do not forget, this is worth investigating.
This also sort of mitigates NTP weaknesses like https://nlnetlabs.nl/downloads/presentations/The-impact-of-NTP-security-weaknesses-on-DNSSEC.pdf , at least before resolver restart.
mentioned in issue #392