Bug #48535

QA smoke test: cephadm is removing mgr.y

Added by Sebastian Wagner 3 months ago. Updated 2 months ago.

Status:
Resolved
Priority:
Urgent
Category:
cephadm
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

https://pulpito.ceph.com/yuriw-2020-12-08_16:18:10-rados-octopus-distro-basic-smithi/5693969

cephadm initially deploys mgr.y correctly:

Dec 08 18:58:47 smithi099 bash[10707]: audit 2020-12-08T18:58:47.142557+0000 mon.a (mon.0) 254 : audit [INF] from='mgr.14138 172.21.15.131:0/1846859103' entity='mgr.y' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/host.smithi131","val":"{\"daemons\": {\"mon.a\": {\"daemon_type\": \"mon\", \"daemon_id\": \"a\", \"hostname\": \"smithi131\", \"container_id\": \"5fdb0de44749\", \"container_image_id\": \"dae82b93a77958a9a6819f28e335d1a604c7e3cdecf8bba5e0c820df2b53a64a\", \"container_image_name\": \"quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe\", \"version\": \"15.2.7-661-gb1c1268b\", \"status\": 1, \"status_desc\": \"running\", \"is_active\": false, \"last_refresh\": \"2020-12-08T18:58:47.137207\", \"created\": \"2020-12-08T18:56:38.730085\", \"started\": \"2020-12-08T18:56:45.378062\"}, \"mgr.y\": {\"daemon_type\": \"mgr\", \"daemon_id\": \"y\", \"hostname\": \"smithi131\", \"container_id\": \"620f6dfea3b3\", \"container_image_id\": \"dae82b93a77958a9a6819f28e335d1a604c7e3cdecf8bba5e0c820df2b53a64a\", \"container_image_name\": \"quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe\", \"version\": \"15.2.7-661-gb1c1268b\", \"status\": 1, \"status_desc\": \"running\", \"is_active\": false, \"last_refresh\": \"2020-12-08T18:58:47.137300\", \"created\": \"2020-12-08T18:56:47.849275\", \"started\": \"2020-12-08T18:56:47.941925\"}, \"mon.c\": {\"daemon_type\": \"mon\", \"daemon_id\": \"c\", \"hostname\": \"smithi131\", \"container_id\": \"8566d05a01c5\", \"container_image_id\": \"dae82b93a77958a9a6819f28e335d1a604c7e3cdecf8bba5e0c820df2b53a64a\", \"container_image_name\": \"quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe\", \"version\": \"15.2.7-661-gb1c1268b\", \"status\": 1, \"status_desc\": \"running\", \"is_active\": false, \"last_refresh\": \"2020-12-08T18:58:47.137347\", \"created\": \"2020-12-08T18:58:16.069285\", \"started\": \"2020-12-08T18:58:16.157886\"}}, \"devices\": [...], \"osdspec_previews\": [], \"daemon_config_deps\": {\"mon.a\": {\"deps\": [], \"last_config\": \"2020-12-08T18:58:44.148173\"}, \"mgr.y\": {\"deps\": [], \"last_config\": \"2020-12-08T18:58:36.178335\"}, \"mon.c\": {\"deps\": [], \"last_config\": \"2020-12-08T18:58:38.524511\"}}, \"last_daemon_update\": \"2020-12-08T18:58:47.137450\", \"last_device_update\": \"2020-12-08T18:57:36.298984\", \"networks\": {\"172.17.0.0/16\": [\"172.17.0.1\"], \"172.21.0.0/20\": [\"172.21.15.131\"], \"172.21.15.254\": [\"172.21.15.131\"], \"fe80::/64\": [\"fe80::ec4:7aff:fe88:72f9\"]}, \"last_host_check\": \"2020-12-08T18:57:14.445904\"}"}]': finished

But at some point, cephadm.py decides to remove it again:

Dec 08 18:58:41.703 INFO:teuthology.orchestra.run.smithi099:> sudo /home/ubuntu/cephtest/cephadm --image quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 07c2237e-3987-11eb-9811-001a4aab830c -- ceph orch apply mgr '2;smithi099=x'
Dec 08 18:58:47 smithi131 bash[10704]: cephadm 2020-12-08T18:58:47.157062+0000 mgr.y (mgr.14138) 62 : cephadm [INF] Deploying daemon mgr.x on smithi099
Dec 08 18:58:47 smithi099 bash[10707]: cephadm 2020-12-08T18:58:47.157062+0000 mgr.y (mgr.14138) 62 : cephadm [INF] Deploying daemon mgr.x on smithi099
Dec 08 18:58:49 smithi099 bash[10707]: cephadm 2020-12-08T18:58:49.238457+0000 mgr.y (mgr.14138) 64 : cephadm [INF] It is presumed safe to stop ['mgr.y']
Dec 08 18:58:49 smithi099 bash[10707]: cephadm 2020-12-08T18:58:49.238723+0000 mgr.y (mgr.14138) 65 : cephadm [INF] Removing daemon mgr.y from smithi131
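One way to read the log lines above: the placement '2;smithi099=x' names only smithi099 explicitly, so a daemon running on any other host may be treated as surplus. The following is a toy model of that reading, not cephadm's actual scheduler. The names Placement, parse_placement and plan are hypothetical, and the rule "daemons on hosts outside the explicit list are removed" is an assumption inferred from the logs in this report.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Placement:
    """Toy stand-in for a parsed placement (hypothetical, not cephadm's class)."""
    count: Optional[int] = None
    hosts: Dict[str, str] = field(default_factory=dict)  # host -> daemon id


def parse_placement(spec: str) -> Placement:
    """Parse '2;smithi099=x' or 'smithi099=x;count:2' into a Placement."""
    placement = Placement()
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue
        if part.isdigit():
            placement.count = int(part)
        elif part.startswith("count:"):
            placement.count = int(part[len("count:"):])
        else:
            host, _, name = part.partition("=")
            placement.hosts[host] = name
    return placement


def plan(placement: Placement, running: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (to_deploy, to_remove) for mgr daemons.

    `running` maps daemon name -> host. ASSUMPTION: with an explicit host
    list, only daemons on listed hosts are kept, regardless of `count`.
    """
    wanted = {f"mgr.{name}": host for host, name in placement.hosts.items()}
    to_deploy = sorted(d for d in wanted if d not in running)
    to_remove = sorted(d for d, host in running.items()
                       if host not in placement.hosts)
    return to_deploy, to_remove


if __name__ == "__main__":
    p = parse_placement("2;smithi099=x")
    print(plan(p, {"mgr.y": "smithi131"}))
```

Under this model, the state at 18:58:47 (only mgr.y on smithi131) yields a plan that deploys mgr.x and removes mgr.y, matching the order of the log messages, even though the requested count of 2 is then no longer met.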

As a result, mgr.y is missing and the mgr service reports only 1/2 daemons running:

2020-12-08T19:05:16.283 INFO:teuthology.orchestra.run.smithi131.stdout:NAME            RUNNING  REFRESHED  AGE  PLACEMENT                        IMAGE NAME                                                          IMAGE ID
2020-12-08T19:05:16.284 INFO:teuthology.orchestra.run.smithi131.stdout:alertmanager        1/1  89s ago    2m   smithi131=a;count:1              docker.io/prom/alertmanager:v0.20.0                                 0881eb8f169f
2020-12-08T19:05:16.284 INFO:teuthology.orchestra.run.smithi131.stdout:grafana             1/1  89s ago    2m   smithi099=a;count:1              docker.io/ceph/ceph-grafana:6.6.2                                   a0dce381714a
2020-12-08T19:05:16.284 INFO:teuthology.orchestra.run.smithi131.stdout:iscsi.iscsi         1/1  89s ago    2m   smithi099=iscsi.a;count:1        quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779
2020-12-08T19:05:16.284 INFO:teuthology.orchestra.run.smithi131.stdout:mgr                 1/2  89s ago    6m   smithi099=x;count:2              quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779
2020-12-08T19:05:16.285 INFO:teuthology.orchestra.run.smithi131.stdout:mon                 3/0  89s ago    -    <unmanaged>                      quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779
2020-12-08T19:05:16.285 INFO:teuthology.orchestra.run.smithi131.stdout:node-exporter       2/2  89s ago    2m   smithi131=a;smithi099=b;count:2  docker.io/prom/node-exporter:v0.18.1                                e5a616e4b9cf
2020-12-08T19:05:16.285 INFO:teuthology.orchestra.run.smithi131.stdout:osd.None            8/0  89s ago    -    <unmanaged>                      quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779
2020-12-08T19:05:16.285 INFO:teuthology.orchestra.run.smithi131.stdout:prometheus          1/1  89s ago    2m   smithi099=a;count:1              docker.io/prom/prometheus:v2.18.1                                   de242295e225
2020-12-08T19:05:16.286 INFO:teuthology.orchestra.run.smithi131.stdout:rgw.realm.zone      1/1  89s ago    2m   smithi131=realm.zone.a;count:1   quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779

2020-12-08T19:05:12.375 INFO:teuthology.orchestra.run.smithi131.stdout:NAME              HOST       STATUS          REFRESHED  AGE   VERSION               IMAGE NAME                                                          IMAGE ID      CONTAINER ID
2020-12-08T19:05:12.375 INFO:teuthology.orchestra.run.smithi131.stdout:alertmanager.a    smithi131  running (89s)   85s ago    2m    0.20.0                docker.io/prom/alertmanager:v0.20.0                                 0881eb8f169f  5cfd91f2dec4
2020-12-08T19:05:12.376 INFO:teuthology.orchestra.run.smithi131.stdout:grafana.a         smithi099  running (104s)  85s ago    104s  6.6.2                 docker.io/ceph/ceph-grafana:6.6.2                                   a0dce381714a  d22d8f54a540
2020-12-08T19:05:12.376 INFO:teuthology.orchestra.run.smithi131.stdout:iscsi.iscsi.a     smithi099  running (2m)    85s ago    2m    3.4                   quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  ee4b2dbcfe42
2020-12-08T19:05:12.376 INFO:teuthology.orchestra.run.smithi131.stdout:mgr.x             smithi099  running (6m)    85s ago    6m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  5b50fc3e28b6
2020-12-08T19:05:12.376 INFO:teuthology.orchestra.run.smithi131.stdout:mon.a             smithi131  running (8m)    85s ago    8m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  5fdb0de44749
2020-12-08T19:05:12.376 INFO:teuthology.orchestra.run.smithi131.stdout:mon.b             smithi099  running (6m)    85s ago    6m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  6d898409329d
2020-12-08T19:05:12.377 INFO:teuthology.orchestra.run.smithi131.stdout:mon.c             smithi131  running (6m)    85s ago    6m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  8566d05a01c5
2020-12-08T19:05:12.377 INFO:teuthology.orchestra.run.smithi131.stdout:node-exporter.a   smithi131  running (2m)    85s ago    2m    0.18.1                docker.io/prom/node-exporter:v0.18.1                                e5a616e4b9cf  6c48edd25d55
2020-12-08T19:05:12.377 INFO:teuthology.orchestra.run.smithi131.stdout:node-exporter.b   smithi099  running (2m)    85s ago    2m    0.18.1                docker.io/prom/node-exporter:v0.18.1                                e5a616e4b9cf  e4321578ec02
2020-12-08T19:05:12.377 INFO:teuthology.orchestra.run.smithi131.stdout:osd.0             smithi131  running (5m)    85s ago    5m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  013ba62b1a67
2020-12-08T19:05:12.377 INFO:teuthology.orchestra.run.smithi131.stdout:osd.1             smithi131  running (5m)    85s ago    5m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  b3249b0b5044
2020-12-08T19:05:12.378 INFO:teuthology.orchestra.run.smithi131.stdout:osd.2             smithi131  running (4m)    85s ago    4m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  9b79623d7e60
2020-12-08T19:05:12.378 INFO:teuthology.orchestra.run.smithi131.stdout:osd.3             smithi131  running (4m)    85s ago    4m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  4626c1117138
2020-12-08T19:05:12.378 INFO:teuthology.orchestra.run.smithi131.stdout:osd.4             smithi099  running (4m)    85s ago    4m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  4a1c1bbd5040
2020-12-08T19:05:12.378 INFO:teuthology.orchestra.run.smithi131.stdout:osd.5             smithi099  running (3m)    85s ago    3m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  c3e1893f1cc6
2020-12-08T19:05:12.378 INFO:teuthology.orchestra.run.smithi131.stdout:osd.6             smithi099  running (3m)    85s ago    3m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  292f1b5ea013
2020-12-08T19:05:12.379 INFO:teuthology.orchestra.run.smithi131.stdout:osd.7             smithi099  running (3m)    85s ago    3m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  5a88aabbe925
2020-12-08T19:05:12.379 INFO:teuthology.orchestra.run.smithi131.stdout:prometheus.a      smithi099  running (95s)   85s ago    2m    2.18.1                docker.io/prom/prometheus:v2.18.1                                   de242295e225  45d6bf1407d5
2020-12-08T19:05:12.379 INFO:teuthology.orchestra.run.smithi131.stdout:rgw.realm.zone.a  smithi131  running (2m)    85s ago    2m    15.2.7-661-gb1c1268b  quay.ceph.io/ceph-ci/ceph:b1c1268b5c492c09ac25a8ffa21109a4387acffe  dae82b93a779  ae6e6aa0af5c
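A check for this symptom can be scripted against the service listing. The sketch below assumes the teuthology line format shown above (a prefix ending in ".stdout:" followed by the table row) and flags every service whose RUNNING column shows fewer daemons than expected. The function name underprovisioned is hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Matches a service row such as "mgr    1/2  89s ago ..." (name, running/size).
ROW = re.compile(r"^(?P<name>\S+)\s+(?P<running>\d+)/(?P<size>\d+)\s")


def underprovisioned(lines: Iterable[str]) -> List[Tuple[str, int, int]]:
    """Return (service, running, expected) for rows where running < expected."""
    result = []
    for line in lines:
        # Drop the teuthology prefix if present; otherwise use the line as-is.
        _, sep, row = line.partition(".stdout:")
        if not sep:
            row = line
        match = ROW.match(row)
        if match is None:
            continue  # header or unrelated line
        running, size = int(match["running"]), int(match["size"])
        if running < size:
            result.append((match["name"], running, size))
    return result


if __name__ == "__main__":
    sample = [
        "2020-12-08T19:05:16.284 INFO:teuthology.orchestra.run.smithi131.stdout:"
        "mgr                 1/2  89s ago    6m   smithi099=x;count:2",
        "2020-12-08T19:05:16.285 INFO:teuthology.orchestra.run.smithi131.stdout:"
        "mon                 3/0  89s ago    -    <unmanaged>",
    ]
    print(underprovisioned(sample))
```

Unmanaged services such as mon (3/0) and osd.None (8/0) are not flagged, because their running count is not below the expected size.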

Related issues

Related to Orchestrator - Bug #48142: rados:cephadm/upgrade/mon_election tests are failing: CapAdd and privileged are mutually exclusive options Pending Backport

History

#1 Updated by Sebastian Wagner 3 months ago

  • Description updated (diff)

#3 Updated by Sebastian Wagner 3 months ago

  • Status changed from New to In Progress

#4 Updated by Sebastian Wagner 3 months ago

  • Subject changed from QA smoke test: cephadm is removeing mgr.x to QA smoke test: cephadm is removeing mgr.y

#5 Updated by Sebastian Wagner 3 months ago

18:57:09.278786: cephadm [INF] Generating ssh key...
18:57:14.077823: cephadm [INF] Added host smithi131
18:57:36.311731: cephadm [INF] Reconfiguring mon.a (unknown last config time)...
18:57:36.314241: cephadm [INF] Reconfiguring daemon mon.a on smithi131
18:57:40.426664: cephadm [INF] Reconfiguring mgr.y (unknown last config time)...
18:57:40.429838: cephadm [INF] Reconfiguring daemon mgr.y on smithi131
18:58:00.473875: cephadm [INF] Added host smithi099
18:58:12.432854: cephadm [INF] Deploying daemon mon.c on smithi131
18:58:26.182884: cephadm [INF] Deploying daemon mon.b on smithi099
18:58:30.325896: cephadm [INF] Reconfiguring mon.a (monmap changed)...
18:58:30.328930: cephadm [INF] Reconfiguring daemon mon.a on smithi131
18:58:36.178231: cephadm [INF] Reconfiguring mgr.y (monmap changed)...
18:58:36.181122: cephadm [INF] Reconfiguring daemon mgr.y on smithi131
18:58:38.524456: cephadm [INF] Reconfiguring mon.c (monmap changed)...
18:58:38.526903: cephadm [INF] Reconfiguring daemon mon.c on smithi131
18:58:41.197770: cephadm [INF] Reconfiguring mon.b (monmap changed)...
18:58:41.200988: cephadm [INF] Reconfiguring daemon mon.b on smithi099
18:58:44.148121: cephadm [INF] Reconfiguring mon.a (monmap changed)...
18:58:44.150396: cephadm [INF] Reconfiguring daemon mon.a on smithi131
18:58:44.875338: cephadm [INF] Saving service mgr spec with placement smithi099=x;count:2
18:58:47.157062: cephadm [INF] Deploying daemon mgr.x on smithi099
18:58:49.238457: cephadm [INF] It is presumed safe to stop ['mgr.y']
18:58:49.238723: cephadm [INF] Removing daemon mgr.y from smithi131
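To measure how quickly the removal follows the deployment in a timeline like this one, the entries can be parsed into timestamped events. This is a minimal sketch that assumes the "HH:MM:SS.ffffff: cephadm [LEVEL] message" layout used in this comment. The helper names events and seconds_between are hypothetical, and repeated entries with the same timestamp and message are counted once.

```python
import re
from datetime import datetime
from typing import Iterable, List, Tuple

LINE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d{6}): cephadm \[(?P<level>\w+)\] (?P<msg>.*)$"
)

Event = Tuple[datetime, str]


def events(lines: Iterable[str]) -> List[Event]:
    """Parse timeline lines into (timestamp, message), skipping repeats."""
    seen = set()
    out: List[Event] = []
    for line in lines:
        match = LINE.match(line.strip())
        if match is None:
            continue
        key = (match["ts"], match["msg"])
        if key in seen:
            continue  # same entry seen before
        seen.add(key)
        out.append((datetime.strptime(match["ts"], "%H:%M:%S.%f"), match["msg"]))
    return out


def seconds_between(evts: List[Event], first_prefix: str, second_prefix: str) -> float:
    """Seconds from the first event starting with first_prefix to the first
    event starting with second_prefix."""
    start = next(ts for ts, msg in evts if msg.startswith(first_prefix))
    end = next(ts for ts, msg in evts if msg.startswith(second_prefix))
    return (end - start).total_seconds()


if __name__ == "__main__":
    log = [
        "18:58:47.157062: cephadm [INF] Deploying daemon mgr.x on smithi099",
        "18:58:49.238457: cephadm [INF] It is presumed safe to stop ['mgr.y']",
        "18:58:49.238723: cephadm [INF] Removing daemon mgr.y from smithi131",
    ]
    evts = events(log)
    print(seconds_between(evts, "Deploying daemon mgr.x", "Removing daemon mgr.y"))
```

With the timestamps from this report, mgr.y is removed about 2.08 seconds after the deployment of mgr.x is logged.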

#6 Updated by Sebastian Wagner 3 months ago

  • Description updated (diff)

#7 Updated by Sebastian Wagner 3 months ago

  • Priority changed from Immediate to Urgent

#8 Updated by Sebastian Wagner 3 months ago

  • Subject changed from QA smoke test: cephadm is removeing mgr.y to QA smoke test: cephadm is removing mgr.y

#9 Updated by Sebastian Wagner 3 months ago

  • Related to Bug #48142: rados:cephadm/upgrade/mon_election tests are failing: CapAdd and privileged are mutually exclusive options added

#10 Updated by Sebastian Wagner 3 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 38530

#11 Updated by Sebastian Wagner 2 months ago

  • Status changed from Fix Under Review to Resolved
  • Pull request ID changed from 38530 to 38707

#12 Updated by Sebastian Wagner 2 months ago

  • Assignee changed from Sebastian Wagner to Yuri Weinstein
