Bug #52781
shard-threads cannot wakeup bug
Status: closed, 100% done
Description
osd: fix shard-threads cannot wakeup bug
Reproduce:
(1) The ceph cluster is not running any client IO.
(2) The only operation performed is "ceph osd in osd.14".
Reason:
(1) One shard-queue is served by three shard-threads.
(2) One or more PeeringOps carry an epoch greater than the epoch of the osdmap held by the current osd,
so these PeeringOps are parked via _add_slot_waiter().
(3) The shard-queue becomes empty and all three shard-threads block in cond.wait().
(4) A new osdmap is consumed, which calls _wake_pg_slot().
The problem is here:
1> OSDShard::consume() loops over the waiters of every pg slot
and requeues more than one PeeringOp to the shard-queue,
2> but it notifies only one shard-thread to wake up,
so the other two shard-threads stay in cond.wait().
3> OSD::ShardedOpWQ::_enqueue() then finds the shard-queue non-empty
and does not notify the remaining shard-threads to wake up.
As a result, for a period of time only one of the three shard-threads is running.
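
The wakeup pattern described above can be reproduced in isolation. The following is a minimal sketch in standard C++, not the Ceph code: ToyShard, worker() and requeue_batch() are hypothetical names, and each worker handles a single item and exits so that the number of woken threads is easy to count. It parks three workers on a condition variable, requeues a batch of three items, and then wakes either one waiter or all of them.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for one shard: a queue guarded by a mutex, plus the
// condition variable that idle worker threads sleep on.
struct ToyShard {
  std::mutex lock;
  std::condition_variable cond;      // workers sleep here while the queue is empty
  std::condition_variable progress;  // lets the driver observe state changes
  std::deque<int> queue;
  int waiting = 0;    // workers that have parked in cond.wait()
  int processed = 0;  // items popped by workers
};

// Simplified worker: sleep until the queue is non-empty, pop one item, exit.
void worker(ToyShard& s) {
  std::unique_lock<std::mutex> l(s.lock);
  ++s.waiting;
  s.progress.notify_all();
  s.cond.wait(l, [&] { return !s.queue.empty(); });
  s.queue.pop_front();
  ++s.processed;
  s.progress.notify_all();
}

// Requeue one item per worker while all workers are asleep, wake either one
// waiter or all of them, and return how many items were handled within a
// short grace period.
int requeue_batch(bool wake_all, int num_workers = 3) {
  ToyShard s;
  std::vector<std::thread> threads;
  for (int i = 0; i < num_workers; ++i)
    threads.emplace_back(worker, std::ref(s));

  int seen = 0;
  {
    std::unique_lock<std::mutex> l(s.lock);
    // Step (3): the queue is empty and every worker is parked in cond.wait().
    s.progress.wait(l, [&] { return s.waiting == num_workers; });
    // Step (4): several items are requeued in one go.
    for (int i = 0; i < num_workers; ++i) s.queue.push_back(i);
    if (wake_all)
      s.cond.notify_all();
    else
      s.cond.notify_one();  // the pattern described in 2>
    // Grace period (assumed value): stop early once every item is handled.
    s.progress.wait_for(l, std::chrono::milliseconds(300),
                        [&] { return s.processed == num_workers; });
    seen = s.processed;
  }
  // Cleanup so the sketch always terminates: wake any worker still asleep.
  s.cond.notify_all();
  for (auto& t : threads) t.join();
  return seen;
}
```

Calling requeue_batch(true) returns 3, since every worker is woken and takes one item. Calling requeue_batch(false) typically returns 1: the other two items sit in the queue until something else wakes the remaining workers, which is the stall described above. Waking all waiters when a batch is requeued, or issuing one notification per requeued item, are two possible ways to avoid it; which one the actual fix uses is not stated here.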