Feature #7784

closed

mon osd down out interval = 0 should prevent ceph health from reporting ok

Added by Samuel Just about 10 years ago. Updated about 10 years ago.

Status: Resolved
Priority: High
Category: Monitor
Target version:
% Done: 0%
Source: other
Tags:
Backport: Dumpling, Emperor
Reviewed:
Affected Versions:
Pull request ID:

Actions #1

Updated by Samuel Just about 10 years ago

  • Target version deleted (0.79)
Actions #2

Updated by Greg Farnum about 10 years ago

Perhaps this config option should be converted into the noout flag; that's already plumbed up for such things.

Actions #3

Updated by Ian Colle about 10 years ago

  • Target version set to 0.80
Actions #4

Updated by Joao Eduardo Luis about 10 years ago

  • Category set to Monitor
  • Assignee set to Joao Eduardo Luis
Actions #5

Updated by Joao Eduardo Luis about 10 years ago

Mapping a config option to a map flag is not an intuitive thing to do or to expect. What if the user injects a different value for this option? What if the leader changes? Once the flag is set, the user would have to unset it explicitly. After injecting the option, the user would probably expect that injecting a different value would take care of the matter, which we definitely should not do: the leader may have changed and have no idea such a thing was done in the past, the flag may have been set manually and the user doesn't really want it disabled, etc.
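
To illustrate the concern above, here is a minimal Python sketch of a toy model. None of the class, method, or field names come from Ceph; they are hypothetical, and the 300-second default is only a placeholder. It assumes a monitor that turns a zero interval into a persistent `noout` map flag, and shows that the flag outlives both a changed option value and a leader change.

```python
from dataclasses import dataclass, field


@dataclass
class OSDMap:
    """Toy stand-in for the cluster map; its flags persist across leader changes."""
    flags: set = field(default_factory=set)


@dataclass
class Monitor:
    """Toy monitor holding its own local copy of the config option."""
    name: str
    down_out_interval: int = 300  # seconds; placeholder default for this sketch

    def sync_flag_from_config(self, osdmap: OSDMap) -> None:
        # Hypothetical behavior under discussion: a zero interval sets noout.
        # Note there is no safe "else: remove the flag" branch, because the
        # flag may have been set by hand.
        if self.down_out_interval == 0:
            osdmap.flags.add("noout")


osdmap = OSDMap()
mon_a = Monitor("a", down_out_interval=0)
mon_b = Monitor("b")  # never had the option injected

mon_a.sync_flag_from_config(osdmap)  # leader "a" sets the flag
mon_a.down_out_interval = 300        # user injects a new value on "a"
mon_b.sync_flag_from_config(osdmap)  # leadership moves to "b"

# The flag is still set, and nothing records whether it came from the
# config option or from an operator who set it manually.
stale_flag_present = "noout" in osdmap.flags
```

In this model the only way to clear the flag is an explicit unset, which is the surprise described above.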

Having 'health' spit out that the leader has this option set would by far be the most correct approach imo. On the other hand, this is a config option that can be set on a single monitor, a few or all of them, even though it only matters if the leader has this option set. The way 'health' works is by spitting out what the monitor handling the health request knows about, and all of that is obtained from maps or from in-memory info the leader tracks down and shares across all the monitors in the quorum.

While setting the flag on the map would be easier from a reporting standpoint, we do have other mechanisms in place for other things (clock skew detection, data availability stats, etc.) that are better suited to this sort of thing, so that's where I'm headed until someone objects :)

Actions #6

Updated by Joao Eduardo Luis about 10 years ago

  • Status changed from New to In Progress
Actions #7

Updated by Joao Eduardo Luis about 10 years ago

  • Status changed from In Progress to Fix Under Review

Went with the simplest approach: have the leader spit out the warning if it has the option set to zero. All other monitors will be oblivious to this, and the user will only get this report when they reach out to the leader for 'status' or 'health' reports.
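
As a rough illustration of that approach, and not the code from the pull request, the following Python sketch models a leader-only check. The `Monitor` class, its fields, and the warning text are hypothetical; only the `HEALTH_OK` and `HEALTH_WARN` status names are taken from Ceph.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Monitor:
    """Toy monitor; each one holds its own local value of the option."""
    name: str
    is_leader: bool
    down_out_interval: int  # local value of "mon osd down out interval"

    def health_warnings(self) -> List[str]:
        warnings = []
        # Only the leader's value matters, so only the leader reports it.
        if self.is_leader and self.down_out_interval == 0:
            # Hypothetical message text, for illustration only.
            warnings.append("mon osd down out interval is 0 on the leader")
        return warnings

    def health(self) -> str:
        return "HEALTH_WARN" if self.health_warnings() else "HEALTH_OK"


leader = Monitor("a", is_leader=True, down_out_interval=0)
peon = Monitor("b", is_leader=False, down_out_interval=0)
```

With these objects, `leader.health()` yields `HEALTH_WARN` while `peon.health()` yields `HEALTH_OK`, even though both have the option set to zero. This is the limitation noted above: the report depends on which monitor handles the request.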

I've opened a feature (see #8150) to address the larger issue of disseminating config options from the leader to the peons.

Branch is up for review on https://github.com/ceph/ceph/pull/1692

Actions #8

Updated by Sage Weil about 10 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Priority changed from Normal to High
Actions #9

Updated by Ian Colle about 10 years ago

  • Backport set to Dumpling, Emperor
Actions #10

Updated by Sage Weil about 10 years ago

  • Status changed from Pending Backport to Resolved