Bug #38077

open

Marking all OSDs as "out" does not trigger a HEALTH_ERR state

Added by Lenz Grimmer about 5 years ago. Updated about 5 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Administration/Usability
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Just tested this on my local 5 OSD dev environment, but this likely applies to any cluster: after setting the cluster-wide "noup" and "noin" flags and marking all OSDs in my cluster as "out" (not "down"), the cluster's health state is still only HEALTH_WARN (and only because of the flags, not because of the OSD status), instead of HEALTH_ERR:

# ./bin/ceph -s
  cluster:
    id:     ed940f7b-187e-4ccf-b1ff-83e068acec95
    health: HEALTH_WARN
            noup,noin flag(s) set

  services:
    mon: 3 daemons, quorum a,b,c (age 2h)
    mgr: x(active, since 59m)
    mds: a:1 {0=b=up:active}, 1 up:standby
    osd: 5 osds: 5 up (since 66m), 0 in (since 2m); 48 remapped pgs
         flags noup,noin

  data:
    pools:   6 pools, 48 pgs
    objects: 51 objects, 6.0 KiB
    usage:   5.3 GiB used, 45 GiB / 50 GiB avail
    pgs:     255/153 objects misplaced (166.667%)
             48 active+clean+remapped
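
For reference, a state like the one above can be reproduced with something along these lines (a sketch; it assumes the five OSDs have the IDs 0 through 4, as in a default vstart dev cluster):

# ./bin/ceph osd set noup
# ./bin/ceph osd set noin
# for id in 0 1 2 3 4; do ./bin/ceph osd out $id; done
# ./bin/ceph -s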

I wonder if OSDs being "out" should be handled similarly to OSDs being "down" when it comes to the health state?
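
Until that is decided, the condition could at least be caught by an external check on the OSD map counters. A minimal sketch, assuming the JSON output of "ceph osd stat" exposes "num_in_osds" (the exact field names may differ between releases):

# ceph osd stat --format json | \
    python3 -c 'import json,sys; s=json.load(sys.stdin); sys.exit(0 if s.get("num_in_osds", 1) > 0 else 1)' \
    || echo "WARNING: no OSDs are marked in"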

Actions #1

Updated by richael zhuang about 5 years ago

Hi, I don't know whether my opinion is right or not, but I think the status should be HEALTH_WARN when OSDs are marked "out", because "out" is set manually, and once you reset them to "in", the cluster returns to OK.
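
For completeness, reverting the state described above (a sketch, again assuming OSD IDs 0 through 4) would look roughly like this, and should bring the cluster back to HEALTH_OK once the PGs settle:

# ./bin/ceph osd unset noup
# ./bin/ceph osd unset noin
# for id in 0 1 2 3 4; do ./bin/ceph osd in $id; done
# ./bin/ceph -s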
