improve TLS error handling
Closes #340
Merge request reports
Activity
added 7 commits
- 96f9fad7...b8f8249a - 5 commits from branch
master
- 071d5d84 - daemon: improved error handling
- bcded5d2 - daemon: improved error handling
- 96f9fad7...b8f8249a - 5 commits from branch
assigned to @pspacek
enabled an automatic merge when the pipeline for d9350443 succeeds
mentioned in commit a1f3feef
It hangs in this loop (worker.c#1547):
while (source_session->tasks.len > 0) { struct qr_task *t = source_session->tasks.at[0]; (void) qr_task_on_send(t, NULL, kr_error(EIO)); }
Edited by Marek Vavrusa1478 return state == KR_STATE_DONE ? 0 : kr_error(EIO); 1492 1479 } 1480 1481 /* Send back answer */ 1482 struct session *source_session = ctx->source.session; 1483 uv_handle_t *handle = source_session->handle; 1484 assert(source_session->closing == false); 1485 assert(handle && handle->data == ctx->source.session); 1486 assert(ctx->source.addr.ip.sa_family != AF_UNSPEC); 1487 int res = qr_task_send(task, handle, 1488 (struct sockaddr *)&ctx->source.addr, 1489 ctx->req.answer); 1490 if (res != kr_ok()) { 1491 while (source_session->tasks.len > 0) { 1492 struct qr_task *t = source_session->tasks.at[0]; 1493 (void) qr_task_on_send(t, NULL, kr_error(EIO)); It passes NULL handle, so it will never reach the session_del_tasks inside qr_task_on_send. I guess this is a copypasta error, as the same statement sequence was copied in multiple places in a8f2f543
@gdemidov Maybe you wanted to pass the
handle
here, that would hopefully get to the code path that removes the task from session.You can reproduce it by starting 2+ queries over the same connection and closing it (or change the
res != kr_ok()
condition to true to always cancel queries. The first one will be removed from the sesssion task list because it's finished, the rest of the queries are not, and the handle is NULL, so it will loop indefinitely:$ cat queries.dat navy.mil A www.navy.mil A www.nrl.navy.mil A $ getdns_query @127.0.0.1#8053 -a -s -L -B -F queries.dat
I think the correct solution is just to call
qr_task_on_send
on current task when send fails, and let the tasks on the same connection complete and then fail on send, and so naturally draining the task list. The cancellation here gets difficult when other tasks on the same connection have pending subrequests and timers.
added tests label
added feature label