daemon: handle IO error when processing wire buffer
This fixes the following assert:
daemon/worker.c:1157: qr_task_finalize: Assertion '!session_flags(source_session)->closing' failed.
Scenario which leads to the above error:
- We're using a stateful protocol.
- Enough data arrive in a single
tcp_recv()
call to put at least two queries into the session's wire buffer. -
session_wirebuf_process()
callsworker_submit()
which callsqr_task_step()
. - In the
qr_task_step()
the query state changes toKR_STATE_DONE
, thenqr_task_finalize()
is called. -
qr_task_send()
is called, but it fails. This is whereqr_task_finalize()
closes the session, but used to return no error. - When the next query is processed in
session_wirebuf_process()
, steps 3 and 4 are followed andqr_task_finalize()
is called. - Since the session is already closed, the assert is triggered.
Debugging this was a lot of fun... All hail the rr debugger!
Merge request reports
Activity
added bug label
- Resolved by Petr Špaček
@tkrizek Do you see a way to add this scenation into pytests/connection tests?
mentioned in issue #606
added 114 commits
-
848b7f8e...1c78497c - 110 commits from branch
master
- 84efb79c - daemon: handle IO error when processing wire buffer
- 4cad535f - daemon/worker: fix connection teardown in tls_hs_cb()
- 52b4b272 - daemon/worker: handle errors more gracefully in qr_task_finalize()
- ef071871 - fixup! daemon: handle IO error when processing wire buffer
Toggle commit list-
848b7f8e...1c78497c - 110 commits from branch
added 1 commit
- a97bb8e1 - daemon/session: process entire packet buffer
added 1 commit
- 5371fbc2 - WIP: keep half-broken connection open for receving
added 1 commit
- 712b094c - fixup! WIP: keep half-broken connection open for receving
added 1 commit
- 2321fd1f - WIP: keep upstream conn open on unrelated write failures
I've seen some memory leaks in resperf, but these are in other branches as well, so I'm convinced they're unrelated to this change.
https://gitlab.nic.cz/knot/knot-resolver/-/jobs/428881
kresd_fwd_target_1 | ==1==ERROR: LeakSanitizer: detected memory leaks kresd_fwd_target_1 | kresd_fwd_target_1 | Indirect leak of 16400 byte(s) in 1 object(s) allocated from: kresd_fwd_target_1 | #0 0x7fda35d6ebc8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8) kresd_fwd_target_1 | #1 0x55576b674ab3 in mp_new_big_chunk ../contrib/ucw/mempool.c:92 kresd_fwd_target_1 | #2 0x55576b674c15 in mp_new_chunk ../contrib/ucw/mempool.c:120 kresd_fwd_target_1 | #3 0x55576b675b28 in mp_alloc_internal ../contrib/ucw/mempool.c:274 kresd_fwd_target_1 | #4 0x55576b673f2a in mp_alloc_fast ../contrib/ucw/mempool.h:198 kresd_fwd_target_1 | #5 0x55576b675d94 in mp_alloc ../contrib/ucw/mempool.c:304 kresd_fwd_target_1 | #6 0x7fda35c35ae6 in knot_dname_copy libknot/dname.c:149 kresd_fwd_target_1 | kresd_fwd_target_1 | Indirect leak of 16400 byte(s) in 1 object(s) allocated from: kresd_fwd_target_1 | #0 0x7fda35d6ebc8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8) kresd_fwd_target_1 | #1 0x55576b674ab3 in mp_new_big_chunk ../contrib/ucw/mempool.c:92 kresd_fwd_target_1 | #2 0x55576b674c15 in mp_new_chunk ../contrib/ucw/mempool.c:120 kresd_fwd_target_1 | #3 0x55576b674d13 in mp_new ../contrib/ucw/mempool.c:139 kresd_fwd_target_1 | #4 0x55576b659bce in pool_borrow ../daemon/worker.c:226 kresd_fwd_target_1 | #5 0x55576b65a043 in request_create ../daemon/worker.c:266 kresd_fwd_target_1 | #6 0x55576b6636c0 in worker_submit ../daemon/worker.c:1597 kresd_fwd_target_1 | #7 0x55576b64d87d in session_wirebuf_process ../daemon/session.c:721 kresd_fwd_target_1 | #8 0x55576b63b335 in udp_recv ../daemon/io.c:88 kresd_fwd_target_1 | #9 0x7fda35bd75ae (/lib/x86_64-linux-gnu/libuv.so.1+0x1e5ae) kresd_fwd_target_1 | kresd_fwd_target_1 | SUMMARY: AddressSanitizer: 32800 byte(s) leaked in 2 allocation(s).
changed milestone to %5.2.0
@sbalazik Please review this. The most important part is adding condition
session_flags(source_session)->closing
, the rest is more or less just cosmetics.assigned to @sbalazik
mentioned in merge request !1058 (merged)
added 36 commits
-
e3d6224e...0c8d0bd4 - 34 commits from branch
master
- a7977efc - daemon: handle IO error when processing wire buffer
- bb2b456c - daemon/worker: fix connection teardown in tls_hs_cb()
-
e3d6224e...0c8d0bd4 - 34 commits from branch