Bug #51049

closed

cephadm removed mon. key when adding new mon node

Added by Bryan Stillwell almost 3 years ago. Updated almost 3 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Regression: No
Severity: 3 - minor

Description

This morning I tried adding a mon node to my home Ceph cluster (mid-upgrade from 15.2.13 to 16.2.4) with the following command:

ceph orch daemon add mon ether

This seemed to work at first, but cephadm removed the new mon about 30 seconds later, which broke the cluster because the mon. keyring was also removed:

2021-06-01T14:16:11.523210+0000 mgr.paris.glbvov [INF] Deploying daemon mon.ether on ether
2021-06-01T14:16:43.621759+0000 mgr.paris.glbvov [INF] Safe to remove mon.ether: not in monmap (['paris', 'excalibur'])
2021-06-01T14:16:43.622135+0000 mgr.paris.glbvov [INF] Removing monitor ether from monmap...
2021-06-01T14:16:43.641365+0000 mgr.paris.glbvov [INF] Removing daemon mon.ether from ether
2021-06-01T14:16:46.610283+0000 mgr.paris.glbvov [INF] Removing key for mon.

Digging into this, it seems like this line might need to check for 'mon.' and not 'mon':

https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/services/cephadmservice.py#L486
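To illustrate the suspected mismatch, here is a minimal sketch. It is not the actual cephadm code: the names auth_entity, should_remove_key, and SHARED_MON_ENTITY are hypothetical, and the only assumption taken from the log above is that the key entity for a mon daemon is the string 'mon.' (with the trailing dot).

```python
# Hypothetical sketch of the suspected comparison bug; not cephadm's real code.
SHARED_MON_ENTITY = "mon."  # entity name as it appears in "Removing key for mon."


def auth_entity(daemon_type: str, daemon_id: str) -> str:
    """Return an illustrative auth entity name for a daemon."""
    if daemon_type == "mon":
        # All monitors share one key, so the daemon id is not part of the name.
        return SHARED_MON_ENTITY
    return f"{daemon_type}.{daemon_id}"


def should_remove_key(entity: str) -> bool:
    """Refuse to remove the shared mon. key when a single mon is removed."""
    return entity != SHARED_MON_ENTITY


entity = auth_entity("mon", "ether")

# Guard that compares against 'mon.' protects the shared key.
print(should_remove_key(entity))

# Guard that compares against 'mon' (no dot) never matches 'mon.',
# so the removal would go ahead.
print(entity != "mon")
```

If the guard compares against 'mon', the entity 'mon.' never equals it, so a check meant to protect the shared key would be bypassed. Whether the linked line really behaves this way is the open question of this report.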

#1

Updated by Bryan Stillwell almost 3 years ago

As I said on the ceph-users list, this didn't actually remove the mon. key:

https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7RQUBZ2IR6FRKTFQJGOQP4U2RM3FMKGE/

The new mon node came up but couldn't join properly because it was running an image that didn't support global id reclaim.

You can close this ticket.

#2

Updated by Neha Ojha almost 3 years ago

  • Status changed from New to Closed