cache: zero-downtime restart is not supported across versions which change cache format/version
Currently we do not handle the case where the cache format differs between two versions running in parallel.
- Such changes happen very rarely, so it is questionable whether we need to support this.
- At the very least, we should note in the release notes when it is necessary to stop all instances before starting new ones.
See the rest of the discussion.
The following discussion from !1042 (merged) should be addressed:
@pspacek started a discussion: I wonder how this magic would work in a situation where:
- kresd instances 1+2 are running version 5.y.z with cache in /var/cache/knot-resolver
- kresd binary gets updated to version 6.0.0
- admin restarts instance 1 first (according to https://knot-resolver.readthedocs.io/en/v5.1.2/systemd-multiinst.html#zero-downtime-restarts) and restarts instance 2 later
I guess instance 2 would not detect this unless the cache overflows, so most likely instance 2 will write data in the old format into a cache versioned by version 6.0.0.
Am I correct?
If so, I think we should open an issue and keep it in mind for a future cache rewrite/migration to a custom data structure.
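To make the suspected scenario concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not kresd's actual cache code: `SharedCache`, `Instance`, and the integer format versions 5 and 6 are all hypothetical, and the model assumes the version stamp is checked only when an instance opens the cache.

```python
from dataclasses import dataclass, field


@dataclass
class SharedCache:
    """Hypothetical stand-in for the cache shared by all instances."""
    version: int
    # key -> (format version the entry was written in, value)
    entries: dict = field(default_factory=dict)


class Instance:
    """Hypothetical resolver instance; not the real kresd logic."""

    def __init__(self, name, fmt_version, cache):
        self.name = name
        self.fmt_version = fmt_version
        self.cache = cache
        self.open_cache()

    def open_cache(self):
        # Assumption: the stamp is compared only at open time. On a
        # mismatch the cache is cleared and restamped.
        if self.cache.version != self.fmt_version:
            self.cache.entries.clear()
            self.cache.version = self.fmt_version

    def write(self, key, value):
        # Assumption: no re-check on write, so a long-running instance
        # keeps writing in the format it started with.
        self.cache.entries[key] = (self.fmt_version, value)


def mismatched_entries(cache):
    """Keys whose entry format differs from the cache's version stamp."""
    return sorted(k for k, (fmt, _) in cache.entries.items()
                  if fmt != cache.version)


def simulate_rolling_restart():
    cache = SharedCache(version=5)
    inst1 = Instance("instance 1", 5, cache)
    inst2 = Instance("instance 2", 5, cache)
    inst1.write("example.com.", "A")

    # Binary is updated; the admin restarts instance 1 only.
    inst1 = Instance("instance 1", 6, cache)
    inst1.write("example.org.", "A")

    # Instance 2 still runs the old binary and never reopens the cache.
    inst2.write("example.net.", "A")
    return cache.version, mismatched_entries(cache)


if __name__ == "__main__":
    print(simulate_rolling_restart())
```

In this model the cache ends up stamped with the new version while still containing an entry written by the old instance, which is the mixed state described above. Whether the real implementation behaves the same way depends on where it actually checks the version.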