Improve handling of ConnectionError
If a ConnectionError happens, try to capture which resolver it occurred for. While it may be just a one-off network blip, it may also indicate a failure mode in one of the target resolvers. That is especially likely if the same connection error happens repeatedly for a single resolver.
Prior to this change, the information about the resolver was missing. This made it hard to assess whether the connection errors were caused by a network issue or whether one of the target resolvers was likely at fault.
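As a rough illustration of the idea (not the code from this merge request), the bookkeeping could be sketched as below. The names `ConnectionErrorTracker`, `success`, `failure` and `MAX_CONSECUTIVE_ERRORS` are hypothetical, and the limit of 3 is an assumption taken from the "3 times in a row" message in the log further down.

```python
import logging
from typing import List

logger = logging.getLogger(__name__)

# Assumption: abort after 3 consecutive errors, as the log message suggests.
MAX_CONSECUTIVE_ERRORS = 3


class ConnectionErrorTracker:
    """Hypothetical helper that remembers which resolvers failed in a row."""

    def __init__(self, limit: int = MAX_CONSECUTIVE_ERRORS) -> None:
        self.limit = limit
        self.resolvers: List[str] = []

    def success(self) -> None:
        # Any successful exchange breaks the streak.
        self.resolvers.clear()

    def failure(self, resolver: str, exc: ConnectionError) -> None:
        # Record the resolver name next to the error so the log shows
        # which target the failure belongs to.
        logger.debug("[%s] %s", resolver, exc)
        self.resolvers.append(resolver)
        if len(self.resolvers) >= self.limit:
            msg = "ConnectionError received {} times in a row ({}), exiting!".format(
                len(self.resolvers), ", ".join(self.resolvers)
            )
            logger.error(msg)
            # Leave the actual exit to the caller.
            raise RuntimeError(msg)
```

A caller would invoke `failure(resolver_name, exc)` from its `except ConnectionError` handler and `success()` after each completed exchange, so that only uninterrupted runs of errors trigger the abort.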
Activity
assigned to @isc-nicki
added 2 commits
added 2 commits
So the main addition here is d0ed86ee, which implements the new functionality and also includes a bit of refactoring.
I've deployed this in our testing environment and respdiff works just fine, so I believe it can be safely merged.
One thing to note is that if a server actually crashes, many more than 3 messages may be generated, since sendrecv is typically used by many jobs in parallel. However, since that's a fairly rare occasion and not regular operation, I think the extra data is worth a bit of spam in the log. If someone has an issue with that, the log level could be raised from DEBUG to INFO to hide those messages.
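For reference, hiding the per-error messages with Python's standard `logging` module could look roughly like this. The logger name `respdiff_example` is a made-up placeholder, not the logger respdiff actually uses.

```python
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("respdiff_example")

# With the threshold at INFO, DEBUG records are dropped while
# INFO, WARNING and ERROR records still get through.
logger.setLevel(logging.INFO)

# Suppressed: DEBUG is below the INFO threshold.
logger.debug("[%s] %s", "testbind", "[Errno 111] Connection refused")

# Still emitted: ERROR is above the INFO threshold.
logger.error("ConnectionError received 3 times in a row (testbind, testbind, testbind), exiting!")
```

The final ERROR line would still name the resolvers involved, so the key information survives even with the DEBUG messages hidden.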
FTR, a crash may look like this. Previously, it was impossible to tell which server crashed; now the server's name is included in the log:
...
2025-02-05 15:26:38,254 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,255 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,255 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,256 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,242 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,257 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,257 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,257 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,258 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,259 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,259 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,259 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,261 DEBUG [testbind] [Errno 111] Connection refused
2025-02-05 15:26:38,481 ERROR ConnectionError received 3 times in a row (testbind, testbind, testbind), exiting!
I'll leave this open for a day or so and merge it then unless anyone objects.
mentioned in commit 0f8abd62