Fix deadlock in event scheduler

Vitezslav Kriz requested to merge evsched-fix-deadlock into master

Deadlock scenario:

The main problem is that the same mutexes are locked in a different order on different code paths. The scheduler thread (function evsched_run) takes scheduler->heap_lock first and then event->mx, whereas the reschedule function is called with event->mx already locked and then takes scheduler->heap_lock (in evsched_schedule).

Solution:

A new mutex, running, replaces the bool running in struct zone_events. The mutex mx now protects only the time array. Mutexes are acquired in the order mx, heap_lock, running. Where possible, running is locked with the trylock function.
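
The lock order described above can be illustrated with a minimal sketch. It is not the code from this merge request: the types sched_sketch_t and zone_events_sketch_t, the queued counter, and the sketch_* functions are hypothetical simplifications, and it is an assumption here that the scheduler path does not need mx at all. Only the ordering (mx, heap_lock, running) and the use of trylock for running follow the description.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the scheduler and struct zone_events. */
typedef struct {
    pthread_mutex_t heap_lock; /* protects the scheduler heap (here: a counter) */
    int queued;                /* number of events waiting in the "heap" */
} sched_sketch_t;

typedef struct {
    pthread_mutex_t mx;        /* protects only the time array */
    pthread_mutex_t running;   /* held while an event of this zone executes */
    long time[4];
} zone_events_sketch_t;

/* Reschedule path: mx first, then heap_lock. Never takes running. */
static void sketch_reschedule(zone_events_sketch_t *ev, sched_sketch_t *s,
                              int type, long when)
{
    pthread_mutex_lock(&ev->mx);
    ev->time[type] = when;
    pthread_mutex_lock(&s->heap_lock);
    s->queued++;
    pthread_mutex_unlock(&s->heap_lock);
    pthread_mutex_unlock(&ev->mx);
}

/* Scheduler path: takes heap_lock, then only tries running, so it never
 * blocks while holding heap_lock. Returns true if an event was started;
 * in that case running stays locked until sketch_finish() is called. */
static bool sketch_dispatch(zone_events_sketch_t *ev, sched_sketch_t *s)
{
    bool started = false;
    pthread_mutex_lock(&s->heap_lock);
    if (s->queued > 0 && pthread_mutex_trylock(&ev->running) == 0) {
        s->queued--;
        started = true;
    }
    pthread_mutex_unlock(&s->heap_lock);
    return started;
}

/* Called when the event has finished executing. */
static void sketch_finish(zone_events_sketch_t *ev)
{
    pthread_mutex_unlock(&ev->running);
}
```

In this sketch no path acquires a mutex that comes earlier in the order while holding a later one, and the only acquisition made while heap_lock is held is non-blocking, so the two paths cannot wait on each other in a cycle. A second sketch_dispatch call made before sketch_finish returns false, while sketch_reschedule still succeeds.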

New potential problem: under some race conditions, an event can be scheduled while an event is running. The constraint of running only one event per zone at a time is preserved, because the scheduler cannot assign an event to the worker pool while the running mutex is locked.
