Bug #48939 (Closed): Orchestrator removes mon daemon from wrong host when removing host from cluster
Description
It is as shocking as the subject describes.
To summarize:
- Removing host mon1 nukes mon.mon3
- Removing host mon3 nukes mon.mon1
- Removing host mon2 nukes mon.mon3
Long description:
My cluster consists of 9 physical nodes: 3x mon/mgr and 6x storage. While debugging an issue with my mgrs, which hang every 7-14 days, I decided to completely remove a mon/mgr host from my cluster.
I am running Ceph 15.2.8 on Ubuntu 20.04 with Docker 19.03.13, using the orchestrator with cephadm. This cluster was freshly created with either 15.2.1 or 15.2.2; it is not an upgraded cluster.
When I remove mon3 from my cluster with `ceph orch host rm mon3.ceph2.example.net`, the mon daemon on mon1 gets removed and monitor quorum drops to 2/2 with mon2 and mon3.
Only the monitor daemon on mon1 gets removed; the other daemons running on mon1 are untouched. All daemons running on mon3 are also untouched.
```
# ceph orch ls
NAME                  RUNNING  REFRESHED  AGE  PLACEMENT  IMAGE NAME                   IMAGE ID
crash                     9/9  78s ago    5w   *          docker.io/ceph/ceph:v15.2.8  5553b0cb212c
mgr                       3/3  78s ago    5w   mon*       docker.io/ceph/ceph:v15.2.8  5553b0cb212c
mon                       3/3  78s ago    5w   mon*       docker.io/ceph/ceph:v15.2.8  5553b0cb212c
osd.nvme_drive_group    30/30  75s ago    11w  node*      docker.io/ceph/ceph:v15.2.8  5553b0cb212c

root @ mon1.ceph2 # ceph orch host ls
HOST                     ADDR                     LABELS  STATUS
mon1.ceph2.example.net   mon1.ceph2.example.net
mon2.ceph2.example.net   mon2.ceph2.example.net
mon3.ceph2.example.net   mon3.ceph2.example.net
node1.ceph2.example.net  node1.ceph2.example.net
node2.ceph2.example.net  node2.ceph2.example.net
node3.ceph2.example.net  node3.ceph2.example.net
node4.ceph2.example.net  node4.ceph2.example.net
node5.ceph2.example.net  node5.ceph2.example.net
node6.ceph2.example.net  node6.ceph2.example.net

root @ mon3.ceph2 # ceph orch host rm mon3.ceph2.example.net
Removed host 'mon3.ceph2.example.net'

root @ mon1.ceph2 # ceph -w
  cluster:
    id:     d77f7c4a-d656-11ea-95cb-531234b0f844
    health: HEALTH_WARN
            2 stray daemons(s) not managed by cephadm
            1 stray host(s) with 2 daemon(s) not managed by cephadm

  services:
    mon: 2 daemons, quorum mon2,mon3 (age 2m)
    mgr: mon2.iqrtrf(active, since 35m), standbys: mon1.rruwfr, mon3.dpuxam
    osd: 180 osds: 180 up (since 2w), 180 in (since 5w)

  data:
    pools:   2 pools, 2049 pgs
    objects: 2.14M objects, 8.2 TiB
    usage:   24 TiB used, 31 TiB / 55 TiB avail
    pgs:     2049 active+clean

  io:
    client: 292 KiB/s rd, 6.7 MiB/s wr, 18 op/s rd, 597 op/s wr
```
Adding mon3 back to the cluster with `ceph orch host add mon3.ceph2.example.net mon3.ceph2.example.net` immediately spawns a monitor daemon on mon1, but it is not added to the quorum. I have to do this manually with `ceph mon add mon1 2001:db8::xxx`.
I have repeated the process with mon3 three times; it happened all three times.
When I run `ceph orch host rm mon1.ceph2.example.net`, the same thing happens the other way around: the monitor daemon on mon3 gets nuked and quorum goes to mon1/mon2. After adding the host back to the orchestrator I have to manually add the monitor to get my 3/3 quorum back.
When I remove mon2 from the cluster, the monitor on mon3 gets nuked as well.
```
~ root @ mon3.ceph2 # ceph orch host rm mon2.ceph2.example.net
Removed host 'mon2.ceph2.example.net'

~ root @ mon3.ceph2 # ceph -w
  cluster:
    id:     d77f7c4a-d656-11ea-95cb-531234b0f844
    health: HEALTH_WARN
            2 stray daemons(s) not managed by cephadm
            1 stray host(s) with 2 daemon(s) not managed by cephadm

  services:
    mon: 2 daemons, quorum mon2,mon1 (age 0.120675s)
    mgr: mon2.iqrtrf(active, since 17m), standbys: mon3.dpuxam, mon1.rruwfr
    osd: 180 osds: 180 up (since 2w), 180 in (since 5w)

  data:
    pools:   2 pools, 2049 pgs
    objects: 2.14M objects, 8.2 TiB
    usage:   24 TiB used, 31 TiB / 55 TiB avail
    pgs:     2049 active+clean

  io:
    client: 18 MiB/s rd, 21 MiB/s wr, 340 op/s rd, 646 op/s wr

2021-01-20T16:54:07.487722+0100 mon.mon1 [INF] mon.mon1 calling monitor election
2021-01-20T16:54:09.473600+0100 mon.mon2 [INF] mon.mon2 calling monitor election
2021-01-20T16:54:09.497733+0100 mon.mon2 [INF] mon.mon2 is new leader, mons mon2,mon1 in quorum (ranks 0,1)
2021-01-20T16:54:09.518510+0100 mon.mon2 [INF] overall HEALTH_OK
2021-01-20T16:54:09.558924+0100 mon.mon2 [WRN] Health check failed: 2 stray daemons(s) not managed by cephadm (CEPHADM_STRAY_DAEMON)
2021-01-20T16:54:09.558962+0100 mon.mon2 [WRN] Health check failed: 1 stray host(s) with 2 daemon(s) not managed by cephadm (CEPHADM_STRAY_HOST)
```
## mon3
```
~ root @ mon3.ceph2 # docker ps
CONTAINER ID  IMAGE              COMMAND                 CREATED         STATUS         PORTS  NAMES
4c702fb9d8c8  ceph/ceph:v15.2.8  "/usr/bin/ceph-mgr -…"  39 minutes ago  Up 39 minutes         ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-mgr.mon3.dpuxam
ffb030ce25a7  ceph/ceph:v15.2.8  "/usr/bin/ceph-crash…"  2 weeks ago     Up 2 weeks            ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-crash.mon3
```
## mon2
```
~ root @ mon2.ceph2 # docker ps
CONTAINER ID  IMAGE              COMMAND                 CREATED      STATUS      PORTS  NAMES
874af2c2a8f9  ceph/ceph:v15.2.8  "/usr/bin/ceph-mon -…"  2 hours ago  Up 2 hours         ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-mon.mon2
1e09357dbfed  ceph/ceph:v15.2.8  "/usr/bin/ceph-crash…"  2 hours ago  Up 2 hours         ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-crash.mon2
5baf2d4860d0  ceph/ceph:v15.2.8  "/usr/bin/ceph-mgr -…"  2 hours ago  Up 2 hours         ceph-d77f7c4a-d656-11ea-95cb-531234b0f844-mgr.mon2.iqrtrf
```
After adding mon2 back to the orchestrator, a monitor daemon is spawned on mon3, but it is not added to the quorum. I have to do this manually:
```
~ root @ mon3.ceph2 # ceph mon add mon3 2001:db8:0:11d::fd
adding mon.mon3 at [v2:[2001:db8:0:11d::fd]:3300/0,v1:[2001:db8:0:11d::fd]:6789/0]

~ root @ mon3.ceph2 # ceph -w
  cluster:
    id:     d77f7c4a-d656-11ea-95cb-531234b0f844
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum mon2,mon1,mon3 (age 0.276922s)
    mgr: mon3.dpuxam(active, since 5m), standbys: mon1.rruwfr, mon2.iqrtrf
    osd: 180 osds: 180 up (since 2w), 180 in (since 5w)

  data:
    pools:   2 pools, 2049 pgs
    objects: 2.14M objects, 8.2 TiB
    usage:   24 TiB used, 31 TiB / 55 TiB avail
    pgs:     2049 active+clean

  io:
    client: 41 KiB/s rd, 12 MiB/s wr, 2 op/s rd, 542 op/s wr

2021-01-20T17:01:39.284854+0100 mon.mon1 [INF] mon.mon1 calling monitor election
2021-01-20T17:01:39.309379+0100 mon.mon3 [INF] mon.mon3 calling monitor election
2021-01-20T17:01:44.315558+0100 mon.mon3 [INF] mon.mon3 calling monitor election
2021-01-20T17:01:44.319832+0100 mon.mon1 [INF] mon.mon1 calling monitor election
2021-01-20T17:01:44.327461+0100 mon.mon2 [WRN] Health check failed: 1/3 mons down, quorum mon2,mon1 (MON_DOWN)
2021-01-20T17:01:44.329326+0100 mon.mon2 [INF] overall HEALTH_OK
2021-01-20T17:01:44.329347+0100 mon.mon2 [INF] mon.mon2 calling monitor election
2021-01-20T17:01:44.355469+0100 mon.mon2 [INF] mon.mon2 is new leader, mons mon2,mon1,mon3 in quorum (ranks 0,1,2)
2021-01-20T17:01:44.375399+0100 mon.mon2 [INF] Health check cleared: MON_DOWN (was: 1/3 mons down, quorum mon2,mon1)
2021-01-20T17:01:44.375422+0100 mon.mon2 [INF] Cluster is now healthy
2021-01-20T17:01:44.383761+0100 mon.mon2 [INF] overall HEALTH_OK
```
This cluster runs critical production workloads, so I am not keen on poking at it. My debug level is set to 5 because of the mgr issue, but the mgr logs do not show anything special.
I have checked the hostnames, IP addresses, and DNS records ten times to make sure there is no error there. Since removing either host mon1 or host mon2 causes mon.mon3 to be deleted, I do not think DNS is the issue here.
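For illustration only (this is NOT cephadm's actual code, just a hypothetical sketch of this failure class): one way a declarative orchestrator can remove a daemon from the wrong host is by pairing running daemons to candidate hosts positionally, by list index, instead of by hostname. The sketch below reproduces the shape of one observation from this report ("removing host mon1 nukes mon.mon3"); it makes no claim about cephadm's real scheduler logic.

```python
# Hypothetical illustration -- not cephadm code. A reconciler that keeps or
# drops daemons by their position in a sorted list, rather than by checking
# whether each daemon's host is still a placement candidate, can remove a
# daemon from a host that was never touched.

def reconcile(candidate_hosts, running_daemons):
    """Return (to_keep, to_remove) lists of daemon names.

    running_daemons is a list of (daemon_name, hostname) pairs.
    BUG (deliberate, for illustration): the decision is positional,
    ignoring each daemon's actual host.
    """
    to_keep, to_remove = [], []
    for i, (name, _host) in enumerate(sorted(running_daemons)):
        if i < len(candidate_hosts):
            to_keep.append(name)
        else:
            to_remove.append(name)
    return to_keep, to_remove

daemons = [("mon.mon1", "mon1"), ("mon.mon2", "mon2"), ("mon.mon3", "mon3")]

# Host mon1 is removed from the inventory, so candidates shrink to mon2, mon3.
# Correct behavior would remove mon.mon1; the positional pairing instead
# removes the daemon on mon3.
keep, remove = reconcile(["mon2", "mon3"], daemons)
print(remove)  # → ['mon.mon3']
```

The point of the sketch is that this class of bug is independent of DNS or hostname resolution, consistent with the reporter's reasoning above.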
If any further information is required I'll be happy to supply you with it.
Updated by Daniël Vos over 3 years ago
Quick update to confirm this behavior: I have been able to reproduce it on my personal homelab Ceph cluster, also running 15.2.8. Removing `node3` from my homelab cluster nukes the mon.node2 daemon and leaves me with a 2/2 quorum consisting of node1 and node3.
```
~ root @ node1 # ceph orch host rm node3
Removed host 'node3'

~ root @ node1 # ceph -s
  cluster:
    id:     8c6c13ea-e866-11ea-ba42-c5e739a1a644
    health: HEALTH_WARN
            6 stray daemons(s) not managed by cephadm
            1 stray host(s) with 6 daemon(s) not managed by cephadm

  services:
    mon: 2 daemons, quorum node1,node3 (age 0.289729s)
    mgr: node2.uodwru(active, since 2d), standbys: node3.qpwept, node1.ffixan
    mds: 4 fs {apps:0=apps.node2.xdhmef=up:active,backups:0=backups.node2.mmcltv=up:active,k8s:0=k8s.node3.gtoqiu=up:active,media:0=media.node2.akknxy=up:active} 5 up:standby
    osd: 14 osds: 14 up (since 2d), 14 in (since 7d)

  data:
    pools:   10 pools, 769 pgs
    objects: 2.13M objects, 8.0 TiB
    usage:   16 TiB used, 14 TiB / 30 TiB avail
    pgs:     769 active+clean

  io:
    client: 0 B/s rd, 1.4 MiB/s wr, 72 op/s rd, 96 op/s wr
```
Updated by Sebastian Wagner over 3 years ago
- Related to Feature #44414: bubble up errors during 'apply' phase to 'cluster warnings' added
Updated by Sebastian Wagner over 3 years ago
At this point in the development, `ceph orch host rm ...` does not remove any daemons; that is simply not implemented. We plan to implement host-drain functionality, but it is not done yet.
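A minimal model of those semantics (my own sketch for illustration, not cephadm source): removing a host only drops it from the orchestrator's inventory, its daemons keep running, and anything running on an unmanaged host is then flagged as stray. This matches the CEPHADM_STRAY_DAEMON / CEPHADM_STRAY_HOST warnings in the report.

```python
# Sketch of the Octopus-era `ceph orch host rm` semantics described above
# (an assumption for illustration, not cephadm source code).

class Inventory:
    def __init__(self, hosts):
        self.hosts = set(hosts)

    def rm_host(self, host):
        # Only the inventory entry goes away; no daemons are stopped.
        self.hosts.discard(host)

def stray_daemons(inventory, running):
    """Daemons running on hosts the orchestrator no longer knows about."""
    return [name for name, host in running if host not in inventory.hosts]

inv = Inventory(["mon1", "mon2", "mon3"])
# After the mon daemon vanished, mon3 still runs its mgr and crash daemons.
running = [("mgr.mon3.dpuxam", "mon3"), ("crash.mon3", "mon3"), ("mon.mon2", "mon2")]

inv.rm_host("mon3")
print(stray_daemons(inv, running))  # → ['mgr.mon3.dpuxam', 'crash.mon3']
```

Under this model the stray warnings are expected; the open question remains why a mon daemon disappears at all.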
That said, there seems to be an issue with the MONs right now. Mind taking a look at why mon2 is getting removed from the quorum?
Updated by Juan Miguel Olmo Martínez about 3 years ago
- Related to Feature #47782: ceph orch host rm <host> is not stopping the services deployed in the respective removed hosts added
Updated by Juan Miguel Olmo Martínez about 3 years ago
- Related to Bug #43838: cephadm: Forcefully Remove Services (unresponsive hosts) added
Updated by Sebastian Wagner about 3 years ago
- Status changed from New to Can't reproduce