Bug #51049
closed
cephadm removed mon. key when adding new mon node
Description
This morning I tried adding a mon node to my home Ceph cluster (mid-upgrade from 15.2.13 to 16.2.4) with the following command:
ceph orch daemon add mon ether
This seemed to work at first, but cephadm removed the new mon about 30 seconds later, which broke the cluster because the mon. keyring was removed along with it:
2021-06-01T14:16:11.523210+0000 mgr.paris.glbvov [INF] Deploying daemon mon.ether on ether
2021-06-01T14:16:43.621759+0000 mgr.paris.glbvov [INF] Safe to remove mon.ether: not in monmap (['paris', 'excalibur'])
2021-06-01T14:16:43.622135+0000 mgr.paris.glbvov [INF] Removing monitor ether from monmap...
2021-06-01T14:16:43.641365+0000 mgr.paris.glbvov [INF] Removing daemon mon.ether from ether
2021-06-01T14:16:46.610283+0000 mgr.paris.glbvov [INF] Removing key for mon.
Digging into this, it looks like this line might need to check for 'mon.' rather than 'mon':
https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/services/cephadmservice.py#L486
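To make the suspected distinction concrete, here is a minimal sketch of the kind of guard being described. It is not the cephadm code: `auth_entity_for`, `should_remove_key`, and `should_remove_key_no_dot` are hypothetical names, and the mapping of every monitor to a shared `mon.` entity is an assumption taken from the `Removing key for mon.` log line above. It only shows that a comparison against 'mon' (no trailing dot) can never match an entity named 'mon.'.

```python
# Hypothetical sketch; names and logic are illustrative, not cephadm's.

def auth_entity_for(daemon_type: str, daemon_id: str) -> str:
    """Assumed mapping: all monitors share the single 'mon.' key."""
    if daemon_type == "mon":
        return "mon."
    return f"{daemon_type}.{daemon_id}"


def should_remove_key(entity: str) -> bool:
    """Keep the shared monitor key when a single mon daemon is removed."""
    return entity != "mon."


def should_remove_key_no_dot(entity: str) -> bool:
    """Same guard written against 'mon'; it never matches 'mon.'."""
    return entity != "mon"


entity = auth_entity_for("mon", "ether")
print(entity, should_remove_key(entity), should_remove_key_no_dot(entity))
```

With the trailing dot the guard skips the shared key; without it the guard lets the removal through. Whether the linked line actually behaves like the second variant is not established here.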
Updated by Bryan Stillwell almost 3 years ago
As I said in ceph-users, this didn't actually remove the mon. key:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7RQUBZ2IR6FRKTFQJGOQP4U2RM3FMKGE/
The new mon node came up and couldn't join properly because it was using an image that didn't support global id reclaim.
You can close this ticket.