Bug #38077
Marking all OSDs as "out" does not trigger a HEALTH_ERR state
Status: New
Priority: Normal
Assignee: -
Category: Administration/Usability
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Just tested this on my local 5-OSD dev environment, but this likely applies to any given cluster: after setting the cluster-wide "noup" and "noin" flags and marking all OSDs in my cluster as "out" (not "down"), the cluster's health state is still only HEALTH_WARN (because of the flags, not the OSD status) instead of HEALTH_ERR:
# ./bin/ceph -s
  cluster:
    id:     ed940f7b-187e-4ccf-b1ff-83e068acec95
    health: HEALTH_WARN
            noup,noin flag(s) set

  services:
    mon: 3 daemons, quorum a,b,c (age 2h)
    mgr: x(active, since 59m)
    mds: a:1 {0=b=up:active}, 1 up:standby
    osd: 5 osds: 5 up (since 66m), 0 in (since 2m); 48 remapped pgs
         flags noup,noin

  data:
    pools:   6 pools, 48 pgs
    objects: 51 objects, 6.0 KiB
    usage:   5.3 GiB used, 45 GiB / 50 GiB avail
    pgs:     255/153 objects misplaced (166.667%)
             48 active+clean+remapped
I wonder if OSDs being "out" should be handled similarly to OSDs being "down" when it comes to the health state?
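To make the question concrete, here is a minimal Python sketch of the behaviour being proposed. It is not Ceph's actual health logic: the function name `osd_health` is hypothetical, and the `up`/`in` fields are assumed to look like the per-OSD entries of a JSON OSD map dump. It only illustrates treating "all OSDs out" the same way as "all OSDs down".

```python
from typing import Iterable, Mapping, Sequence


def osd_health(osds: Iterable[Mapping[str, int]], flags: Sequence[str] = ()) -> str:
    """Hypothetical health classification (illustration only, not Ceph code).

    Each OSD entry is assumed to carry integer "up" and "in" fields
    (1 = true, 0 = false).
    """
    osds = list(osds)
    total = len(osds)
    num_up = sum(1 for osd in osds if osd["up"])
    num_in = sum(1 for osd in osds if osd["in"])

    # Proposed rule: no OSD "in" is as severe as no OSD "up".
    if total and (num_up == 0 or num_in == 0):
        return "HEALTH_ERR"
    # Any set flag, or any OSD that is down or out, is a warning.
    if flags or num_up < total or num_in < total:
        return "HEALTH_WARN"
    return "HEALTH_OK"


# The state from the report above: 5 OSDs, all up, none in, noup and noin set.
reported = [{"osd": i, "up": 1, "in": 0} for i in range(5)]
print(osd_health(reported, flags=["noup", "noin"]))
```

With the state from the report, this sketch returns HEALTH_ERR, whereas the cluster itself reported HEALTH_WARN.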
Updated by richael zhuang about 5 years ago
Hi, I don't know whether my opinion is right or not, but I think the status should be HEALTH_WARN when OSDs are marked "out", because "out" is set manually, and once you mark them "in" again, the cluster returns to OK.