Bug #10431
Updated by Loïc Dachary about 9 years ago
We were debugging a PG stuck in peering. The cause may be a peering event that was lost or never handled. We found that some threads call osd->peering_queue.push_back without holding the osd_lock. This can race with other threads (usually a dispatcher thread) that push_back to peering_queue at the same time. At the very least, the thread handling a FlushedEvt pushes to the osd peering_queue this way. Can we add a check asserting that the thread holds the lock when it calls osd->peering_wq.queue(PG*)?
* firefly equivalent change commit:852d7b5b3c019c02c042b767fc88916088e1a94d
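
To illustrate the kind of check requested above, here is a minimal, self-contained sketch. It is not the actual Ceph code or the referenced commit: OwnedMutex, PeeringQueue, and the stripped-down PG struct are hypothetical stand-ins, and the sketch assumes a lock that can report whether the calling thread currently holds it.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical mutex wrapper that records which thread holds it,
// so callers can assert ownership before touching shared state.
class OwnedMutex {
public:
  void lock() {
    m_.lock();
    owner_.store(std::this_thread::get_id(), std::memory_order_relaxed);
  }
  void unlock() {
    // Clear the owner before releasing the underlying mutex.
    owner_.store(std::thread::id(), std::memory_order_relaxed);
    m_.unlock();
  }
  bool is_locked_by_me() const {
    return owner_.load(std::memory_order_relaxed) ==
           std::this_thread::get_id();
  }

private:
  std::mutex m_;
  std::atomic<std::thread::id> owner_{std::thread::id()};
};

// Stand-in for a placement group; only an id is needed here.
struct PG {
  int id;
};

// Hypothetical queue wrapper: every access asserts that the caller
// holds the protecting lock (osd_lock in the report above).
class PeeringQueue {
public:
  explicit PeeringQueue(OwnedMutex& lock) : lock_(lock) {}

  void queue(PG* pg) {
    // Aborts in debug builds if a thread enqueues without the lock.
    assert(lock_.is_locked_by_me());
    q_.push_back(pg);
  }

  std::size_t size() const {
    assert(lock_.is_locked_by_me());
    return q_.size();
  }

private:
  OwnedMutex& lock_;
  std::deque<PG*> q_;
};
```

A caller would take the lock with std::lock_guard<OwnedMutex> before calling queue(). With such a check in place, a path that enqueues without the lock (for example, one reached while handling a FlushedEvt) fails an assertion immediately in debug builds instead of silently racing.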