Bug #48939 (Closed): Orchestrator removes mon daemon from wrong host when removing host from cluster
Description
It is as shocking as the subject describes.
To summarize:
- Removing host mon1 nukes mon.mon3
- Removing host mon3 nukes mon.mon1
- Removing host mon2 nukes mon.mon3
Long description:
My cluster consists of 9 physical nodes: 3x mon/mgr and 6x storage. While debugging an issue with my mgrs, which hang every 7-14 days, I decided to completely remove a mon/mgr host from my cluster.
I am running Ceph 15.2.8 on Ubuntu 20.04 with Docker 19.03.13, using the orchestrator with cephadm. This cluster was freshly created with either 15.2.1 or 15.2.2; it is not an upgraded cluster.
When I remove mon3 from my cluster with `ceph orch host rm mon3.ceph2.example.net`, the mon daemon on mon1 gets removed and monitor quorum drops to 2/2 with mon2 and mon3.
Only the monitor daemon on mon1 gets removed; the other daemons running on mon1 are untouched. All daemons running on mon3 are also untouched.
```
# ceph orch ls
NAME                  RUNNING  REFRESHED  AGE  PLACEMENT  IMAGE NAME                   IMAGE ID
crash                     9/9  78s ago    5w   *          docker.io/ceph/ceph:v15.2.8  5553b0cb212c
mgr                       3/3  78s ago    5w   mon*       docker.io/ceph/ceph:v15.2.8  5553b0cb212c
mon                       3/3  78s ago    5w   mon*       docker.io/ceph/ceph:v15.2.8  5553b0cb212c
osd.nvme_drive_group    30/30  75s ago    11w  node*      docker.io/ceph/ceph:v15.2.8  5553b0cb212c

root @ mon1.ceph2 # ceph orch host ls
HOST                     ADDR                     LABELS  STATUS
mon1.ceph2.example.net   mon1.ceph2.example.net
mon2.ceph2.example.net   mon2.ceph2.example.net
mon3.ceph2.example.net   mon3.ceph2.example.net
node1.ceph2.example.net  node1.ceph2.example.net
node2.ceph2.example.net  node2.ceph2.example.net
node3.ceph2.example.net  node3.ceph2.example.net
node4.ceph2.example.net  node4.ceph2.example.net
node5.ceph2.example.net  node5.ceph2.example.net
node6.ceph2.example.net  node6.ceph2.example.net

root @ mon3.ceph2 # ceph orch host rm mon3.ceph2.example.net
Removed host 'mon3.ceph2.example.net'

root @ mon1.ceph2 # ceph -w
  cluster:
    id:     d77f7c4a-d656-11ea-95cb-531234b0f844
    health: HEALTH_WARN
            2 stray daemons(s) not managed by cephadm
            1 stray host(s) with 2 daemon(s) not managed by cephadm

  services:
    mon: 2 daemons, quorum mon2,mon3 (age 2m)
    mgr: mon2.iqrtrf(active, since 35m), standbys: mon1.rruwfr, mon3.dpuxam
    osd: 180 osds: 180 up (since 2w), 180 in (since 5w)

  data:
    pools:   2 pools, 2049 pgs
    objects: 2.14M objects, 8.2 TiB
    usage:   24 TiB used, 31 TiB / 55 TiB avail
    pgs:     2049 active+clean

  io:
    client: 292 KiB/s rd, 6.7 MiB/s wr, 18 op/s rd, 597 op/s wr
```
Adding mon3 back to the cluster with `ceph orch host add mon3.ceph2.example.net mon3.ceph2.example.net` immediately spawns a monitor daemon on mon1, but it is not added to the quorum. I have to do this manually with `ceph mon add mon1 2001:db8::xxx`.
I have repeated the process with mon3 three times; it happened all three times.
When I run `ceph orch host rm mon1.ceph2.example.net`, the same thing happens the other way around: the monitor daemon on mon3 gets nuked and quorum goes to mon1/mon2. After adding the host back to the orchestrator I have to manually add the monitor to get my 3/3 quorum back.
When I remove mon2 from the cluster, the monitor on mon3 gets nuked as well.
```
~ root @ mon3.ceph2 # ceph orch host rm mon2.ceph2.example.net
Removed host 'mon2.ceph2.example.net'

~ root @ mon3.ceph2 # ceph -w
  cluster:
    id:     d77f7c4a-d656-11ea-95cb-531234b0f844
    health: HEALTH_WARN
            2 stray daemons(s) not managed by cephadm
            1 stray host(s) with 2 daemon(s) not managed by cephadm

  services:
    mon: 2 daemons, quorum mon2,mon1 (age 0.120675s)
    mgr: mon2.iqrtrf(active, since 17m), standbys: mon3.dpuxam, mon1.rruwfr
    osd: 180 osds: 180 up (since 2w), 180 in (since 5w)

  data:
    pools:   2 pools, 2049 pgs
    objects: 2.14M objects, 8.2 TiB
    usage:   24 TiB used, 31 TiB / 55 TiB avail
    pgs:     2049 active+clean

  io:
    client: 18 MiB/s rd, 21 MiB/s wr, 340 op/s rd, 646 op/s wr

2021-01-20T16:54:07.487722+0100 mon.mon1 [INF] mon.mon1 calling monitor election
2021-01-20T16:54:09.473600+0100 mon.mon2 [INF] mon.mon2 calling monitor election
2021-01-20T16:54:09.497733+0100 mon.mon2 [INF] mon.mon2 is new leader, mons mon2,mon1 in quorum (ranks 0,1)
2021-01-20T16:54:09.518510+0100 mon.mon2 [INF] overall HEALTH_OK
2021-01-20T16:54:09.558924+0100 mon.mon2 [WRN] Health check failed: 2 stray daemons(s) not managed by cephadm (CEPHADM_STRAY_DAEMON)
2021-01-20T16:54:09.558962+0100 mon.mon2 [WRN] Health check failed: 1 stray host(s) with 2 daemon(s) not managed by cephadm (CEPHADM_STRAY_HOST)
```
## mon3
```
~ root @ mon3.ceph2 # docker ps
CONTAINER ID  IMAGE              COMMAND                 CREATED         STATUS         PORTS  NAMES
4c702fb9d8c8  ceph/ceph:v15.2.8  "/usr/bin/ceph-mgr -…"  39 minutes ago  Up 39 minutes         ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-mgr.mon3.dpuxam
ffb030ce25a7  ceph/ceph:v15.2.8  "/usr/bin/ceph-crash…"  2 weeks ago     Up 2 weeks            ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-crash.mon3
```
## mon2
```
~ root @ mon2.ceph2 # docker ps
CONTAINER ID  IMAGE              COMMAND                 CREATED      STATUS      PORTS  NAMES
874af2c2a8f9  ceph/ceph:v15.2.8  "/usr/bin/ceph-mon -…"  2 hours ago  Up 2 hours         ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-mon.mon2
1e09357dbfed  ceph/ceph:v15.2.8  "/usr/bin/ceph-crash…"  2 hours ago  Up 2 hours         ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-crash.mon2
5baf2d4860d0  ceph/ceph:v15.2.8  "/usr/bin/ceph-mgr -…"  2 hours ago  Up 2 hours         ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-mgr.mon2.iqrtrf
```
After adding mon2 back to the orchestrator, a monitor daemon is spawned on mon3, but it is not added to the quorum. I have to do this manually:
```
~ root @ mon3.ceph2 # ceph mon add mon3 2001:db8:0:11d::fd
adding mon.mon3 at [v2:[2001:db8:0:11d::fd]:3300/0,v1:[2001:db8:0:11d::fd]:6789/0]

~ root @ mon3.ceph2 # ceph -w
  cluster:
    id:     d77f7c4a-d656-11ea-95cb-531234b0f844
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum mon2,mon1,mon3 (age 0.276922s)
    mgr: mon3.dpuxam(active, since 5m), standbys: mon1.rruwfr, mon2.iqrtrf
    osd: 180 osds: 180 up (since 2w), 180 in (since 5w)

  data:
    pools:   2 pools, 2049 pgs
    objects: 2.14M objects, 8.2 TiB
    usage:   24 TiB used, 31 TiB / 55 TiB avail
    pgs:     2049 active+clean

  io:
    client: 41 KiB/s rd, 12 MiB/s wr, 2 op/s rd, 542 op/s wr

2021-01-20T17:01:39.284854+0100 mon.mon1 [INF] mon.mon1 calling monitor election
2021-01-20T17:01:39.309379+0100 mon.mon3 [INF] mon.mon3 calling monitor election
2021-01-20T17:01:44.315558+0100 mon.mon3 [INF] mon.mon3 calling monitor election
2021-01-20T17:01:44.319832+0100 mon.mon1 [INF] mon.mon1 calling monitor election
2021-01-20T17:01:44.327461+0100 mon.mon2 [WRN] Health check failed: 1/3 mons down, quorum mon2,mon1 (MON_DOWN)
2021-01-20T17:01:44.329326+0100 mon.mon2 [INF] overall HEALTH_OK
2021-01-20T17:01:44.329347+0100 mon.mon2 [INF] mon.mon2 calling monitor election
2021-01-20T17:01:44.355469+0100 mon.mon2 [INF] mon.mon2 is new leader, mons mon2,mon1,mon3 in quorum (ranks 0,1,2)
2021-01-20T17:01:44.375399+0100 mon.mon2 [INF] Health check cleared: MON_DOWN (was: 1/3 mons down, quorum mon2,mon1)
2021-01-20T17:01:44.375422+0100 mon.mon2 [INF] Cluster is now healthy
2021-01-20T17:01:44.383761+0100 mon.mon2 [INF] overall HEALTH_OK
```
This cluster runs critical production workloads, so I am not keen on poking at it. My debug level is set to 5 because of the mgr issue, but the mgr logs do not show anything special.
I have checked the hostnames, IP addresses, and DNS records ten times to make sure there is no error there. Since removing either host mon1 or host mon2 causes mon.mon3 to be deleted, I do not think DNS is the issue here.
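For illustration only (this is NOT cephadm's actual code, just a hypothetical sketch of this failure class): one way a declarative orchestrator can remove a daemon from the wrong host is by pairing running daemons to candidate hosts positionally, by list index, instead of by hostname. The sketch below reproduces the shape of one observation from this report ("removing host mon1 nukes mon.mon3"); it makes no claim about cephadm's real scheduler logic.

```python
# Hypothetical illustration -- not cephadm code. A reconciler that keeps or
# drops daemons by their position in a sorted list, rather than by checking
# whether each daemon's host is still a placement candidate, can remove a
# daemon from a host that was never touched.

def reconcile(candidate_hosts, running_daemons):
    """Return (to_keep, to_remove) lists of daemon names.

    running_daemons is a list of (daemon_name, hostname) pairs.
    BUG (deliberate, for illustration): the decision is positional,
    ignoring each daemon's actual host.
    """
    to_keep, to_remove = [], []
    for i, (name, _host) in enumerate(sorted(running_daemons)):
        if i < len(candidate_hosts):
            to_keep.append(name)
        else:
            to_remove.append(name)
    return to_keep, to_remove

daemons = [("mon.mon1", "mon1"), ("mon.mon2", "mon2"), ("mon.mon3", "mon3")]

# Host mon1 is removed from the inventory, so candidates shrink to mon2, mon3.
# Correct behavior would remove mon.mon1; the positional pairing instead
# removes the daemon on mon3.
keep, remove = reconcile(["mon2", "mon3"], daemons)
print(remove)  # → ['mon.mon3']
```

The point of the sketch is that this class of bug is independent of DNS or hostname resolution, consistent with the reporter's reasoning above.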
If any further information is required I'll be happy to supply you with it.
Updated by Daniël Vos over 3 years ago
Quick update to confirm this behavior: I have been able to reproduce it on my personal homelab Ceph cluster, also running 15.2.8. Removing `node3` from my homelab cluster nukes the mon.node2 daemon and leaves me with a 2/2 quorum consisting of node1 and node3.
```
~ root @ node1 # ceph orch host rm node3
Removed host 'node3'

~ root @ node1 # ceph -s
  cluster:
    id:     8c6c13ea-e866-11ea-ba42-c5e739a1a644
    health: HEALTH_WARN
            6 stray daemons(s) not managed by cephadm
            1 stray host(s) with 6 daemon(s) not managed by cephadm

  services:
    mon: 2 daemons, quorum node1,node3 (age 0.289729s)
    mgr: node2.uodwru(active, since 2d), standbys: node3.qpwept, node1.ffixan
    mds: 4 fs {apps:0=apps.node2.xdhmef=up:active,backups:0=backups.node2.mmcltv=up:active,k8s:0=k8s.node3.gtoqiu=up:active,media:0=media.node2.akknxy=up:active} 5 up:standby
    osd: 14 osds: 14 up (since 2d), 14 in (since 7d)

  data:
    pools:   10 pools, 769 pgs
    objects: 2.13M objects, 8.0 TiB
    usage:   16 TiB used, 14 TiB / 30 TiB avail
    pgs:     769 active+clean

  io:
    client: 0 B/s rd, 1.4 MiB/s wr, 72 op/s rd, 96 op/s wr
```
Updated by Sebastian Wagner over 3 years ago
- Related to Feature #44414: bubble up errors during 'apply' phase to 'cluster warnings' added
Updated by Sebastian Wagner over 3 years ago
At this point in the development, `ceph orch host rm ...` does not remove any daemons; that is simply not implemented. We plan to implement host-drain functionality, but it is not done yet.
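A minimal model of those semantics (my own sketch for illustration, not cephadm source): removing a host only drops it from the orchestrator's inventory, its daemons keep running, and anything running on an unmanaged host is then flagged as stray. This matches the CEPHADM_STRAY_DAEMON / CEPHADM_STRAY_HOST warnings in the report.

```python
# Sketch of the Octopus-era `ceph orch host rm` semantics described above
# (an assumption for illustration, not cephadm source code).

class Inventory:
    def __init__(self, hosts):
        self.hosts = set(hosts)

    def rm_host(self, host):
        # Only the inventory entry goes away; no daemons are stopped.
        self.hosts.discard(host)

def stray_daemons(inventory, running):
    """Daemons running on hosts the orchestrator no longer knows about."""
    return [name for name, host in running if host not in inventory.hosts]

inv = Inventory(["mon1", "mon2", "mon3"])
# After the mon daemon vanished, mon3 still runs its mgr and crash daemons.
running = [("mgr.mon3.dpuxam", "mon3"), ("crash.mon3", "mon3"), ("mon.mon2", "mon2")]

inv.rm_host("mon3")
print(stray_daemons(inv, running))  # → ['mgr.mon3.dpuxam', 'crash.mon3']
```

Under this model the stray warnings are expected; the open question remains why a mon daemon disappears at all.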
That said, there seems to be an issue with the MONs right now. Mind taking a look at why mon2 is getting removed from the quorum?
Updated by Juan Miguel Olmo Martínez about 3 years ago
- Related to Feature #47782: ceph orch host rm <host> is not stopping the services deployed in the respective removed hosts added
Updated by Juan Miguel Olmo Martínez about 3 years ago
- Related to Bug #43838: cephadm: Forcefully Remove Services (unresponsive hosts) added
Updated by Sebastian Wagner about 3 years ago
- Status changed from New to Can't reproduce