Bug #45393

closed

Containerized osd config must be updated when adding/removing mons

Added by Tim Serong about 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: High
Assignee:
Category: cephadm
Target version:
% Done: 0%
Source:
Tags:
Backport: octopus
Regression: Yes
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Try this:

- bootstrap a cluster (1 mon, 1 mgr)
- add a bunch of osds (ceph orch apply osd --all-available-devices)
- add some more mons and mgrs (ceph orch apply mon 3 ; ceph orch apply mgr 3)

At this point, ceph.conf as seen by the osds (/var/lib/ceph/$FSID/osd.$ID/config) still lists only the first mon. If you restart any osd while that first mon is down for some reason, the osd can't join the cluster, because it doesn't know the other mons exist. It will just sit there logging "monclient(hunting): authenticate timed out after 300" every five minutes until that first mon comes back.

I can think of two potential ways to address this:

1) Have cephadm mon apply go out and update every single osd config file with the new list of mons. This would of course not work completely if some osd hosts were down at the time. Also it might take a while...
2) Have the osds update their own config file automatically based on current monmaps.
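
To make option 1 concrete, here is a minimal sketch of the per-file rewrite it would involve. This is not how cephadm does (or will do) it; `set_mon_host` is a hypothetical helper, and it assumes the osd config is ceph.conf-style text with a single `mon_host` (or `mon host`) line.

```python
def set_mon_host(config_text, new_mon_host):
    """Return config_text with the value of its mon_host line replaced.

    Hypothetical helper for illustration only. Raises ValueError if the
    text has no mon_host line, so a caller never silently skips a file.
    """
    out = []
    replaced = False
    for raw in config_text.splitlines():
        key, sep, _ = raw.partition("=")
        # Treat "mon host" and "mon_host" as the same option name.
        if sep and key.strip().replace(" ", "_") == "mon_host":
            # Preserve the original indentation of the line.
            indent = raw[: len(raw) - len(raw.lstrip())]
            out.append(f"{indent}mon_host = {new_mon_host}")
            replaced = True
        else:
            out.append(raw)
    if not replaced:
        raise ValueError("no mon_host line found")
    trailing = "\n" if config_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

Something would still have to run this against every /var/lib/ceph/$FSID/osd.$ID/config on every host, which is where the "some osd hosts were down" problem comes in.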

This probably also needs to go into the troubleshooting docs (check mon_host in each containerized osd's individual config file).
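
As a rough illustration of that troubleshooting check, the sketch below scans the per-osd config files under one cluster directory and reports which expected mon addresses each file is missing. The function names are hypothetical, it only recognizes IPv4 addresses, and it assumes the files are ceph.conf-style text with a `mon_host` (or `mon host`) line; the list of expected mon addresses has to be supplied by the caller.

```python
import re
from pathlib import Path

IPV4 = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def read_mon_addrs(config_text):
    """Return the set of IPv4 addresses on the mon_host line (empty if none)."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, value = line.partition("=")
        # Treat "mon host" and "mon_host" as the same option name.
        if sep and key.strip().replace(" ", "_") == "mon_host":
            return set(IPV4.findall(value))
    return set()


def find_stale_osd_configs(cluster_dir, expected_mon_addrs):
    """Map each osd.*/config path to the sorted expected mon addresses it lacks."""
    stale = {}
    for config in sorted(Path(cluster_dir).glob("osd.*/config")):
        missing = set(expected_mon_addrs) - read_mon_addrs(config.read_text())
        if missing:
            stale[str(config)] = sorted(missing)
    return stale
```

Called as `find_stale_osd_configs("/var/lib/ceph/" + fsid, mon_addrs)` on an osd host, an empty result would mean every osd config there already mentions all of the given mons.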


Related issues 1 (0 open, 1 closed)

Related to Orchestrator - Feature #45378: cephadm: manage /etc/ceph/ceph.conf (Resolved, Sebastian Wagner)
