Feature #13923
Set health to ERR when one or more PGs are stuck inactive
Status: Resolved
Priority: Normal
Assignee: -
Category: Monitor
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:
Description
Based on this thread: http://article.gmane.org/gmane.comp.file-systems.ceph.user/25551
I would propose two additional settings:
mon_pg_inactive_max = 300
mon_pg_inactive_num = 1
With these values, if one or more PGs are stuck inactive for more than 300 seconds, the health state would go from WARN to ERR.
In RBD environments even a single inactive PG can cause almost all I/O to stall, since block devices touch so many different PGs.
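The proposed rule can be illustrated with a minimal sketch. This is not the monitor's actual code: the function name, the list of per-PG inactive durations, and the string return values are assumptions made for illustration, and only the two setting names and their values come from the proposal above. It also assumes "more than 300 seconds" means a strict comparison.

```python
from typing import Iterable

# Values proposed in this ticket; whether they match any merged option is not assumed.
MON_PG_INACTIVE_MAX = 300  # seconds a PG may stay inactive before it counts as stuck
MON_PG_INACTIVE_NUM = 1    # number of stuck PGs needed to escalate to ERR


def health_for_inactive_pgs(
    inactive_seconds: Iterable[float],
    inactive_max: float = MON_PG_INACTIVE_MAX,
    inactive_num: int = MON_PG_INACTIVE_NUM,
) -> str:
    """Hypothetical helper: map per-PG inactive durations to a health state.

    Only this one check is modelled; other health checks are ignored.
    """
    durations = list(inactive_seconds)
    if not durations:
        # No inactive PGs at all, so this check raises nothing.
        return "HEALTH_OK"

    # Count PGs that have been inactive for longer than the threshold.
    stuck = sum(1 for seconds in durations if seconds > inactive_max)

    if stuck >= inactive_num:
        return "HEALTH_ERR"
    # Inactive PGs exist but too few have been stuck long enough: stay at WARN.
    return "HEALTH_WARN"


if __name__ == "__main__":
    print(health_for_inactive_pgs([12.0, 45.0]))
    print(health_for_inactive_pgs([12.0, 301.0]))
```

With the proposed values, a single PG that has been inactive for 301 seconds is enough to escalate, while raising inactive_num would let an operator tolerate a few stuck PGs before the state changes.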
Updated by Abhishek Lekshmanan over 8 years ago
- Status changed from New to Fix Under Review
master PR: https://github.com/ceph/ceph/pull/7253
Updated by Wido den Hollander about 7 years ago
- Status changed from Fix Under Review to Resolved