Global Recovery Event never completes
  services:
    mon: 3 daemons, quorum a,b,c (age 29m)
    mgr: x(active, since 29m)
    osd: 6 osds: 6 up (since 17m), 6 in (since 29m)

  data:
    pools:   3 pools, 160 pgs
    objects: 0 objects, 0 B
    usage:   6.1 GiB used, 600 GiB / 606 GiB avail
    pgs:     160 active+clean

  progress:
    Global Recovery Event (5m)
      [==================..........] (remaining: 2m)
This is an OSD=6 vstart cluster that just ran qa/workunits/rados/test_python.sh, but I've seen this in other cases too. All 160 PGs are active+clean, yet the Global Recovery Event stays in progress.
#2 Updated by Kamoltat Sirivadhna 2 months ago
- Status changed from New to Fix Under Review
- Pull request ID set to 40480
The problem was that I did not subtract the PGs I skip (those where reported_epoch_of_pg < start_epoch_of_event) from total_pg_num. As a result, active_clean_pg < total_pg_num always holds, so the event never reaches 100% and never completes.
The fix is therefore to subtract the skipped PGs from total_pg_num.
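To illustrate the accounting error, here is a minimal standalone sketch. It is not the code from the pull request: PGStat, recovery_progress, and the subtract_skipped flag are hypothetical names made up for this example, and the 16 stale PGs are invented numbers. Only the comparison against the event's start epoch and the subtraction from the total come from the description above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PGStat:
    """Hypothetical stand-in for one PG's reported state."""
    state: str           # e.g. "active+clean"
    reported_epoch: int  # epoch at which the PG last reported


def recovery_progress(pgs: Iterable[PGStat], start_epoch: int,
                      subtract_skipped: bool = True) -> float:
    """Return the fraction of counted PGs that are active+clean.

    PGs whose reported epoch predates the event are skipped. With
    subtract_skipped=False the skipped PGs stay in the denominator,
    which reproduces the bug: the ratio can never reach 1.0.
    """
    pgs = list(pgs)
    total_pg_num = len(pgs)
    active_clean_pg = 0
    skipped = 0
    for pg in pgs:
        if pg.reported_epoch < start_epoch:
            skipped += 1
            continue
        states = pg.state.split("+")
        if "active" in states and "clean" in states:
            active_clean_pg += 1
    if subtract_skipped:
        total_pg_num -= skipped  # the fix: do not count skipped PGs
    if total_pg_num == 0:
        return 1.0  # nothing left to wait for
    return active_clean_pg / total_pg_num


# 160 active+clean PGs, 16 of which last reported before the event started.
pgs = [PGStat("active+clean", 10)] * 16 + [PGStat("active+clean", 20)] * 144
buggy = recovery_progress(pgs, start_epoch=15, subtract_skipped=False)
fixed = recovery_progress(pgs, start_epoch=15)
print(buggy, fixed)
```

With the skipped PGs left in the denominator the ratio stays below 1.0 even though every PG is active+clean, which matches the stuck progress bar in the report. Once they are subtracted, the ratio reaches 1.0 and the event can complete.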