Bug #44959

health warning: pgs not deep-scrubbed in time although it was in time

Added by Jonas Jelten almost 4 years ago. Updated almost 4 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi!

Some of my PGs are listed as "not deep-scrubbed in time" in my 14.2.8 cluster.

My scrub settings are:

[osd]
# every week:
osd scrub min interval = 604800
# every month:
osd scrub max interval = 2678400
osd deep scrub interval = 2678400

There are no special scrub intervals set per-pool.

ceph health detail says:

PG_NOT_DEEP_SCRUBBED 2376 pgs not deep-scrubbed in time
    pg 14.3e5 not deep-scrubbed since 2020-03-24 19:14:13.414636
    pg 14.3cc not deep-scrubbed since 2020-03-24 13:36:59.600045
    pg 14.3bf not deep-scrubbed since 2020-03-23 21:48:00.772905
(...)

This doesn't seem right, since these timestamps are less than one month before today (2020-04-06).
If I understand the code in PGMap.cc correctly, it should warn only when a PG's last deep scrub is older than the following cutoff:
if pool.last_deep_scrub_stamp < now - (deep_scrub_interval * mon_warn_pg_not_deep_scrubbed_ratio + deep_scrub_interval)
With my settings (an interval of 31 days and a ratio of 0.75), that would be:
2020-03-24 < 2020-04-06 - (31 * 0.75 + 31)
<==>
2020-03-24 < 2020-02-11
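
The arithmetic above can be checked with a short Python sketch. This is only an illustration of the formula as quoted in this report, not code taken from PGMap.cc. The function name `deep_scrub_cutoff` is hypothetical, and "now" is assumed to be midnight on 2020-04-06 because the report gives only the date.

```python
from datetime import datetime, timedelta

def deep_scrub_cutoff(now, deep_scrub_interval_s, warn_ratio):
    """Return the timestamp before which a last deep scrub would trigger the warning.

    Follows the formula quoted in the report:
    now - (interval * ratio + interval)
    """
    grace_s = deep_scrub_interval_s * warn_ratio + deep_scrub_interval_s
    return now - timedelta(seconds=grace_s)

# Assumption: the time of day is not given in the report, so midnight is used.
now = datetime(2020, 4, 6)

# 2678400 s is the configured osd deep scrub interval (31 days); 0.75 is the ratio used above.
cutoff = deep_scrub_cutoff(now, 2678400, 0.75)
print(cutoff)

# Date of the most recent of the three deep scrubs listed in the health output.
last_deep_scrub = datetime(2020, 3, 24)
print(last_deep_scrub < cutoff)
```

With these inputs the cutoff lands on 2020-02-11, so a PG deep-scrubbed on 2020-03-24 should not be flagged if the 31-day interval is the one actually in effect.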

The PGs can be manually deep-scrubbed, then they're no longer listed as "not in time":

ceph health detail | ag 'not deep-scrubbed since' | awk '{print $2}' | while read pg; do ceph pg deep-scrub $pg; done

What could be causing this?

History

#1 Updated by Greg Farnum almost 4 years ago

  • Project changed from Ceph to RADOS

#2 Updated by Katarzyna Myrek almost 4 years ago

Have you changed the values for the MGR as well? The mgr performs this check, and if it still has the defaults, it will issue warnings.

ceph config show-with-defaults mgr.INSTANCE | egrep "osd_deep_scrub_interval|mon_warn_pg_not_deep_scrubbed_ratio"

#3 Updated by Jonas Jelten almost 4 years ago

  • Status changed from New to Closed

Aaaha, that was it. Thank you very much!

I had set the osd deep scrub interval under [osd], so the mgr did not get this value.

After moving the setting under [global], the mgr gets it right and the warning is gone! Wohoo!
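
For readers who land here with the same warning, the following Python sketch shows why a mgr that still uses the default interval would flag these PGs while the configured interval would not. It rests on several assumptions that are not stated in this report: that the default `osd_deep_scrub_interval` is 604800 seconds (one week), that the ratio is 0.75, and that "now" is noon on 2020-04-06. The helper `is_overdue` is hypothetical and only mirrors the formula quoted in the description.

```python
from datetime import datetime, timedelta

DEFAULT_INTERVAL_S = 604800      # assumption: default osd_deep_scrub_interval (one week)
CONFIGURED_INTERVAL_S = 2678400  # value set under [osd] in the report (31 days)
WARN_RATIO = 0.75                # assumption: mon_warn_pg_not_deep_scrubbed_ratio

def is_overdue(last_deep_scrub, now, interval_s, ratio=WARN_RATIO):
    """Apply the formula from the description: stamp < now - (interval * ratio + interval)."""
    cutoff = now - timedelta(seconds=interval_s * ratio + interval_s)
    return last_deep_scrub < cutoff

# Assumption: the report gives only the date, so noon is used as the time of day.
now = datetime(2020, 4, 6, 12, 0)

# Timestamps copied from the `ceph health detail` output in the description.
stamps = {
    "14.3e5": datetime(2020, 3, 24, 19, 14, 13, 414636),
    "14.3cc": datetime(2020, 3, 24, 13, 36, 59, 600045),
    "14.3bf": datetime(2020, 3, 23, 21, 48, 0, 772905),
}

with_default = {pg: is_overdue(ts, now, DEFAULT_INTERVAL_S) for pg, ts in stamps.items()}
with_configured = {pg: is_overdue(ts, now, CONFIGURED_INTERVAL_S) for pg, ts in stamps.items()}

for pg in stamps:
    print(pg, "default:", with_default[pg], "configured:", with_configured[pg])
```

Under these assumptions all three PGs are overdue against the one-week default and none are overdue against the 31-day setting, which matches the behaviour described in this thread.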
