rewrite server selection system
The current server selection mechanism is not well defined and sometimes exhibits hard-to-debug quirks. This is a ticket for collecting ideas about what we need from a proper server selection system.
Caveats
- look for existing literature about server selection!
- forwarding and iteration probably need different algorithms!
- what should be the overall criteria? lowest RTT? reliability? lowest RTT when taking reliability into account? :-)
  - can we map this to a multi-armed bandit (or some other) statistical model? (one possible mapping is sketched after this list)
- verify that it is okay to operate with a server == IP address mapping
  - multiple NS names can map to a single IP address
  - NS names are probably not significant; properties could be associated with IP addresses
  - think about unresolved NS names/incomplete glue
  - consider lazy NS name -> IP address resolution if we have enough working servers
  - what about anycast nodes with different properties? is it worth considering, or just an unsupported configuration? read related RFCs about anycast DNS operation
- server selection probably needs to include transport protocol selection for each IP address - UDP, TCP, TLS, DTLS, QUIC, DoH, ...
- some errors (REFUSED, SERVFAIL, ...) are not a property of an IP address but in fact a property of an (IP address, zone) pair
  - e.g. one lame delegation to a name server of a big web hosting company should not penalize the NS IP address as a whole
- transport protocols are likely to have different properties/statistics - RTT, reliability, etc.
- think about TLS-to-auth auto discovery
- how can we incorporate the https://tools.ietf.org/html/draft-ietf-dnsop-extended-error draft?
- properties can change over time, so our stats need to expire
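The overall-criteria and bandit questions above point at the same structure, so here is a rough Python sketch of one possible mapping: each server (or, per the transport note, each (address, transport) pair) is a bandit arm, the reward mixes latency and reliability, and all statistics decay so that they expire over time. Every name, constant, and the UCB1-style rule is an illustrative assumption, not an existing design.

```python
import math

# Illustrative sketch only: names, constants, and the UCB1-style rule
# are assumptions, not existing resolver code.

DECAY = 0.95        # per-update exponential decay, so stale stats expire
RTT_CEILING = 2.0   # seconds; anything slower (or a timeout) scores 0

class ArmStats:
    """Decayed reward statistics for one server (one bandit arm)."""
    def __init__(self):
        self.pulls = 0.0       # decayed count of attempts
        self.reward_sum = 0.0  # decayed sum of rewards in [0, 1]

    def update(self, rtt, success):
        # Reward mixes reliability and latency: a failure scores 0,
        # an instant answer scores close to 1.
        reward = (1.0 - min(rtt, RTT_CEILING) / RTT_CEILING) if success else 0.0
        self.pulls = self.pulls * DECAY + 1.0
        self.reward_sum = self.reward_sum * DECAY + reward

    def mean(self):
        return self.reward_sum / self.pulls if self.pulls else 0.0

def pick_server(stats, servers):
    """UCB1-style pick: exploit good servers, still probe unknown ones."""
    total = sum(arm.pulls for arm in stats.values()) or 1.0

    def score(server):
        arm = stats.get(server)
        if arm is None or arm.pulls < 1.0:
            return float("inf")   # never-tried server: always worth a probe
        return arm.mean() + math.sqrt(2.0 * math.log(total) / arm.pulls)

    return max(servers, key=score)
```

A resolver would call `stats.setdefault(server, ArmStats()).update(rtt, success)` after each answer or timeout and `pick_server(stats, servers)` before the next query; forwarding and iteration could simply keep separate `stats` tables, per the note above.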
Ideas for attributes
IP address
- supported EDNS version (to avoid FORMERR loops, but maybe we need only per-query state ...)
- supported transport protocols (TLS configuration etc.)
- DNS cookies
(IP address, protocol)
- RTT
- transport layer "reliability" (maybe timeouts should not be mixed with RTT ...)
- transport protocol information (cached TLS certificate, session resumption, 0-RTT data support, ...)
(IP address, zone)
- usefulness - ok, SERVFAIL, REFUSED, BOGUS (lame delegations, expired zone data, etc.)
Obviously, storing (server, zone) attributes might lead to state explosion, so we need to think twice about this. Maybe there is a way to optimize, e.g. store only "broken" (server, zone) pairs so we can penalize those during server selection but do not bother with the vast majority of "working" pairs; a minimal sketch of such a layout follows.
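To make that concrete, here is a minimal Python sketch of the attribute layout above: per-address, per-(address, protocol), and per-(address, zone) records, where the last kind is stored only for "broken" pairs and expires. All names, fields, and constants are hypothetical illustrations of the idea, not a proposed API.

```python
import time

BROKEN_TTL = 600  # seconds; assumed lifetime of a negative (address, zone) mark

class AddressInfo:
    """Attributes tied to the IP address itself."""
    def __init__(self):
        self.edns_version = None   # highest EDNS version known to work
        self.transports = set()    # e.g. {"udp", "tcp", "tls"}
        self.cookie = None         # DNS cookie, if any

class TransportInfo:
    """Attributes tied to an (IP address, protocol) pair."""
    def __init__(self):
        self.srtt = None           # smoothed RTT estimate (seconds)
        self.timeouts = 0          # deliberately kept separate from RTT
        self.tls_session = None    # cached session for resumption / 0-RTT

class AttributeStore:
    def __init__(self):
        self.addresses = {}    # ip -> AddressInfo
        self.transports = {}   # (ip, proto) -> TransportInfo
        self.broken = {}       # (ip, zone) -> expiry; "broken" pairs only

    def mark_broken(self, ip, zone):
        self.broken[(ip, zone)] = time.monotonic() + BROKEN_TTL

    def is_broken(self, ip, zone):
        expiry = self.broken.get((ip, zone))
        if expiry is None:
            return False
        if expiry < time.monotonic():
            del self.broken[(ip, zone)]   # stats expire over time
            return False
        return True
```

Keeping only the negative (ip, zone) entries bounds the table by the number of recently observed broken delegations rather than by |servers| × |zones|.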
Assorted ideas
Serve stale
- timestamp of last attempt
- SERVFAIL and OK per server?
- counters for DoS mitigation (queries per zone per server, or ...) - see the sketch below
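One possible reading of the DoS-mitigation counters, sketched in Python with assumed names and rates: a token bucket per (zone, server) that caps how often serve-stale logic re-probes a dead server, with the bucket timestamp doubling as the "timestamp of last attempt" from the list above.

```python
import time

PROBE_RATE = 1.0 / 30.0  # assumed: refill one probe per 30 s per (zone, server)
PROBE_BURST = 3.0        # assumed burst allowance

class ProbeLimiter:
    def __init__(self):
        self.buckets = {}  # (zone, server) -> (tokens, last attempt timestamp)

    def allow(self, zone, server):
        """Return True if another upstream probe is allowed right now."""
        now = time.monotonic()
        tokens, last = self.buckets.get((zone, server), (PROBE_BURST, now))
        tokens = min(PROBE_BURST, tokens + (now - last) * PROBE_RATE)
        if tokens < 1.0:
            self.buckets[(zone, server)] = (tokens, now)
            return False
        self.buckets[(zone, server)] = (tokens - 1.0, now)
        return True
```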