Bug #27988
Warn if queue of scrubs ready to run exceeds some threshold
Description
The sched_scrub_pg set could be scanned during a new insert and the number of scrubs that are ready to be run could be counted and compared to some threshold. It would be nice if this triggered a monitor health warning.
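The check proposed above could look roughly like the following. This is a minimal Python sketch, not Ceph code: `ScrubJob`, `count_ready`, and `insert_and_check` are hypothetical names, the plain set only stands in for sched_scrub_pg, and the threshold is an assumed tunable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrubJob:
    """Hypothetical stand-in for one entry in the scrub schedule."""
    pgid: str
    sched_time: float  # earliest time (seconds) at which the scrub may run


def count_ready(jobs, now):
    """Count the jobs whose scheduled time has already been reached."""
    return sum(1 for job in jobs if job.sched_time <= now)


def insert_and_check(jobs, new_job, now, threshold):
    """Insert a job, then scan the set and compare the ready count to the threshold.

    Returns (ready_count, should_warn). The caller would be responsible for
    turning should_warn into a health warning; that part is not modelled here.
    """
    jobs.add(new_job)
    ready = count_ready(jobs, now)
    return ready, ready > threshold


# Usage: two jobs are already due at now=100, one is scheduled for later.
jobs = {ScrubJob("1.a", 10.0), ScrubJob("1.b", 500.0)}
ready, warn = insert_and_check(jobs, ScrubJob("1.c", 90.0), now=100.0, threshold=1)
```

Scanning the whole set on every insert is linear in the number of queued scrubs, which is the simplest reading of the description; a running counter would avoid the scan at the cost of more bookkeeping.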
Updated by David Zafman over 5 years ago
- Related to Bug #23576: osd: active+clean+inconsistent pg will not scrub or repair added
Updated by David Zafman over 5 years ago
- Subject changed from Warn if queue of scrubs exceeds some threshold to Warn if queue of scrubs ready to run exceeds some threshold
Updated by David Turner over 5 years ago
I talked with Sage, and he believes there is already a warning status if you have scrubs that haven't run for more than 2x your interval. My experience in the related ticket was with a 30 day deep scrub interval, and the repair ran 3 weeks after it was issued. That indicates that I was within the existing 2x warning threshold but definitely beyond a healthy state.
Another idea that would help is to prioritize user-submitted operations higher than ones that were scheduled automatically because they exceeded their intervals.
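To make the arithmetic in the comment above concrete, here is a small Python sketch of a "2x the interval" check. The function name `warns_not_scrubbed` and the default factor of 2 are assumptions taken from the comment's description, not from the Ceph source.

```python
DAY = 24 * 60 * 60  # seconds in one day


def warns_not_scrubbed(age_seconds, interval_seconds, factor=2.0):
    """Return True when the time since the last scrub exceeds factor * interval.

    The factor of 2 mirrors the "2x your interval" rule reported in the
    comment; it is an assumption here, not a value read from Ceph.
    """
    return age_seconds > factor * interval_seconds


# With a 30 day deep scrub interval, the warning only starts after 60 days,
# so a wait of 3 weeks (21 days) stays well inside the threshold.
three_week_wait = warns_not_scrubbed(21 * DAY, 30 * DAY)
past_two_intervals = warns_not_scrubbed(61 * DAY, 30 * DAY)
```

Under these assumptions a request can sit in the queue for weeks without any health warning, which is the gap the comment describes.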
Updated by David Zafman over 5 years ago
I want to fix 3 things here. First, user-submitted scrubs are queued as due to occur immediately, but overdue scrubs are still prioritized before them. I want user-submitted scrubs to run before all others. Second, I'd like to get a warning when too many scrubs are overdue. This could occur because too many user-submitted scrubs are requested all at once, or because the system as configured cannot keep up with the scrub demands. This warning could be disabled by default. Finally, the code to warn about overdue scrubs in the monitor is broken. It confuses the monitor's own scrubbing interval with PG scrubbing. It shouldn't use mon_scrub_interval but rather osd_scrub_min_interval/osd_deep_scrub_interval when assessing how overdue scrubbing has gotten. Also, what about osd_scrub_max_interval? Also, should mon_warn_not_scrubbed and mon_warn_not_deep_scrubbed be renamed to mon_warn_pg_not_scrubbed and mon_warn_pg_not_deep_scrubbed respectively?
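The first two points above can be illustrated with a short Python sketch. Everything in it is hypothetical (`QueuedScrub`, `run_order`, `too_many_due`, and the convention that a threshold of 0 means "disabled"); it shows the intended ordering and warning behaviour, not how the OSD implements them.

```python
from typing import NamedTuple


class QueuedScrub(NamedTuple):
    """Hypothetical queue entry; not a Ceph type."""
    pgid: str
    sched_time: float          # time (seconds) at which the scrub becomes due
    user_requested: bool = False


def run_order(queue):
    """Order scrubs so that user-requested ones run before all others.

    Sorting on (not user_requested, sched_time) puts user requests first
    and falls back to the scheduled time within each group.
    """
    return sorted(queue, key=lambda s: (not s.user_requested, s.sched_time))


def too_many_due(queue, now, threshold=0):
    """Warn when more than `threshold` scrubs are due; 0 disables the check."""
    if threshold <= 0:
        return False
    due = sum(1 for s in queue if s.sched_time <= now)
    return due > threshold


# Usage: a user request queued "as due immediately" (sched_time == now)
# would sort after both overdue scrubs if only sched_time were compared.
now = 200.0
queue = [
    QueuedScrub("1.a", 100.0),
    QueuedScrub("1.b", now, user_requested=True),
    QueuedScrub("1.c", 50.0),
]
ordered_pgids = [s.pgid for s in run_order(queue)]
```

Keeping the threshold at 0 by default matches the suggestion that the warning could be disabled unless an operator opts in.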
Updated by David Zafman over 5 years ago
- Status changed from New to In Progress
Updated by David Zafman over 5 years ago
- Related to Bug #37269: Prioritize user specified scrubs added
Updated by David Zafman over 5 years ago
- Related to Bug #37264: scrub warning check incorrectly uses mon scrub interval added
Updated by David Zafman about 5 years ago
- Status changed from In Progress to Need More Info
This is on the back burner until we decide what to do next.
Updated by David Zafman over 3 years ago
- Status changed from Need More Info to Rejected
- Pull request ID deleted (23848)
This was already handled in a different but reasonable way by https://github.com/ceph/ceph/pull/15643 and refined by other changes.