Bug #45327

cephadm: Orch daemon add is not idempotent

Added by Sebastian Wagner almost 4 years ago. Updated almost 2 years ago.

Status:
Resolved
Priority:
High
Category:
cephadm
Target version:
-
% Done:
0%

Source:
Q/A
Tags:
Backport:
pacific, octopus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

audit [DBG] from='client.14180 v1:172.21.15.139:0/1866757471' entity='client.admin' cmd=[{"prefix": "orch daemon add", "daemon_type": "mon", "placement": "smithi139:[v1:172.21.15.139:6789]=c", "target": ["mon-mgr", ""]}]: dispatch
audit [INF] from='mgr.14141 v1:172.21.15.131:0/1812629321' entity='mgr.y' cmd=[{"prefix": "auth get", "entity": "mon."}]: dispatch
audit [DBG] from='mgr.14141 v1:172.21.15.131:0/1812629321' entity='mgr.y' cmd=[{"prefix": "config generate-minimal-conf"}]: dispatch
cephadm [INF] Deploying daemon mon.c on smithi139
audit [DBG] from='mgr.14141 v1:172.21.15.131:0/1812629321' entity='mgr.y' cmd=[{"prefix": "config get", "who": "mon.c", "key": "container_image"}]: dispatch
audit [DBG] from='client.14180 v1:172.21.15.139:0/1866757471' entity='client.admin' cmd=[{"prefix": "orch daemon add", "daemon_type": "mon", "placement": "smithi139:[v1:172.21.15.139:6789]=c", "target": ["mon-mgr", ""]}]: dispatch

Remote method threw exception: Traceback (most recent call last):
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in handle_command
    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in <lambda>
    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 658, in _daemon_add_misc
    completion = self.add_mon(spec)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1542, in inner
    completion = self._oremote(method_name, args, kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1614, in _oremote
    return mgr.remote(o, meth, *args, **kwargs)
  File "/usr/share/ceph/mgr/mgr_module.py", line 1515, in remote
    args, kwargs)
RuntimeError: Remote method threw exception: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2740, in add_mon
    return self._add_daemon('mon', spec, self._create_mon)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2652, in _add_daemon
    create_func, config_func)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2667, in _create_daemons
    spec.service_id, name)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1228, in get_unique_name
    raise orchestrator.OrchestratorValidationError('name %s already in use', forcename)
orchestrator._interface.OrchestratorValidationError: ('name %s already in use', 'c')

http://pulpito.ceph.com/teuthology-2020-04-27_03:30:02-rados-octopus-distro-basic-smithi/4988571/
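
The failure above boils down to `get_unique_name()` rejecting a forced name that already exists: the retried `orch daemon add` dispatch (visible twice in the audit log) collides with the daemon the first dispatch created. A simplified sketch of that check (not the actual cephadm source; names reduced to the essentials) makes the non-idempotence visible:

```python
# Simplified sketch (not the actual cephadm source) of the check in
# get_unique_name() that makes a retried "orch daemon add" fail: when
# the caller forces a name, an existing daemon with that name is
# treated as an error rather than as an already-converged state.
class OrchestratorValidationError(Exception):
    pass

def get_unique_name(daemon_type, existing, forcename=None):
    if forcename:
        if forcename in existing:
            raise OrchestratorValidationError(
                f'name {daemon_type}.{forcename} already in use')
        return forcename
    raise NotImplementedError('random-suffix path omitted in this sketch')

existing = {'c'}  # mon.c was created by the first "daemon add"
try:
    get_unique_name('mon', existing, forcename='c')  # the retry
except OrchestratorValidationError as e:
    print(e)  # -> name mon.c already in use
```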


Related issues (5 total: 0 open, 5 closed)

Related to Orchestrator - Bug #44824: cephadm: adding osd device is not idempotent (Resolved)
Related to Orchestrator - Bug #52742: octopus: orchestrator._interface.OrchestratorValidationError: name mon.c already in use (Won't Fix)
Has duplicate Orchestrator - Bug #45296: cephadm: daemon add mon failure: orchestrator._interface.OrchestratorValidationError: ('name %s already in use', 'b') (Duplicate)
Has duplicate Orchestrator - Bug #46854: Error ENOENT: name mon.smithi074 already in use seen on octopus (Duplicate)
Has duplicate Orchestrator - Bug #47709: orchestrator._interface.OrchestratorValidationError: name mon.c already in use (Duplicate)
Actions #1

Updated by Sebastian Wagner almost 4 years ago

  • Subject changed from Orch daemon add is not idempotent to cephadm: Orch daemon add is not idempotent
Actions #2

Updated by Sebastian Wagner almost 4 years ago

  • Priority changed from Normal to High
Actions #3

Updated by Sebastian Wagner almost 4 years ago

  • Has duplicate Bug #45296: cephadm: daemon add mon failure: orchestrator._interface.OrchestratorValidationError: ('name %s already in use', 'b') added
Actions #4

Updated by Sebastian Wagner almost 4 years ago

Ok,

ceph orch apply ...

is already idempotent.

On the other hand,

ceph orch daemon add ...

is supposed to be raw and direct without any magic behind it. I for one would prefer to have Teuthology invoke `apply` instead of `daemon add`.
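
To illustrate the distinction, here is a toy model (assumed names, not the real cephadm scheduler): `apply` only records desired state that a reconcile loop converges on, so re-running it is a no-op, whereas `daemon add` is a one-shot create.

```python
# Toy model (assumed names, not the real cephadm scheduler) of why
# "orch apply" can be idempotent while "orch daemon add" cannot:
# apply() merely records the desired spec, and a reconcile loop
# creates only what is missing, so repeating apply() changes nothing.
desired = {}     # service -> set of daemon ids we want
running = set()  # daemon ids actually deployed

def apply(service, daemon_ids):
    desired[service] = set(daemon_ids)  # recording twice is harmless

def daemon_add(daemon_id):
    if daemon_id in running:
        raise ValueError(f'name {daemon_id} already in use')  # one-shot
    running.add(daemon_id)

def reconcile():
    for service, want in desired.items():
        for d in want - running:
            running.add(d)  # deploy only the missing daemons

apply('mon', {'mon.a', 'mon.b', 'mon.c'}); reconcile()
apply('mon', {'mon.a', 'mon.b', 'mon.c'}); reconcile()  # no-op
daemon_add('mon.d')
# daemon_add('mon.d')  # would raise: name mon.d already in use
```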

Actions #5

Updated by Joshua Schmid almost 4 years ago

I'd argue that we should only maintain one command for service creation, and I would recommend going with `apply`.

We should adapt the teuthology codepath to use/support `apply` as well.
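
Until that happens, a guard along these lines in the teuthology task would make a retried `daemon add` converge instead of erroring. This is a hypothetical sketch: the `shell()` helper and the JSON field names are illustrative assumptions, and `ceph orch ps --format json` is the only real command used.

```python
# Hypothetical retry guard for the teuthology cephadm task: check
# "ceph orch ps" before calling "daemon add" so a repeated call is a
# no-op instead of raising "name already in use". The shell() helper
# and the JSON field names are illustrative assumptions.
import json

def daemon_exists(shell, daemon_type, daemon_id):
    out = shell(['ceph', 'orch', 'ps', '--format', 'json'])
    return any(d.get('daemon_type') == daemon_type
               and d.get('daemon_id') == daemon_id
               for d in json.loads(out))

def add_mon_idempotent(shell, host, addr, mon_id):
    if daemon_exists(shell, 'mon', mon_id):
        return  # already deployed; nothing to do
    shell(['ceph', 'orch', 'daemon', 'add', 'mon',
           f'{host}:[{addr}]={mon_id}'])
```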

Actions #6

Updated by Neha Ojha almost 4 years ago

  • Backport set to octopus

/a/yuriw-2020-05-02_20:02:46-rados-wip-yuri6-testing-2020-04-30-2259-octopus-distro-basic-smithi/5016611/

Actions #7

Updated by Sebastian Wagner almost 4 years ago

  • Source set to Q/A
Actions #8

Updated by Sebastian Wagner almost 4 years ago

  • Related to Bug #44824: cephadm: adding osd device is not idempotent added
Actions #9

Updated by Sebastian Wagner almost 4 years ago

`daemon add` is too low level. If we want the commands to be idempotent, we have to remove the calls to them from cephadm.py.

Actions #10

Updated by Sebastian Wagner over 3 years ago

https://pulpito.ceph.com/swagner-2020-08-03_12:11:23-rados:cephadm-wip-swagner-testing-2020-08-03-1038-distro-basic-smithi/5284050/

> sudo /home/ubuntu/cephtest/cephadm --image quay.ceph.io/ceph-ci/ceph:4502cc1b194810b439c691dc8aeabee06dcc5c6b shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 2523a68e-d589-11ea-a070-001a4aab830c -- ceph orch daemon add mon smithi193:172.21.15.193=smithi193
cephadm 2020-08-03T13:02:24.253615+0000 mgr.smithi089.ppezhp (mgr.14169) 63 : cephadm [INF] Deploying daemon mgr.smithi193.xqzdep on smithi193
cluster 2020-08-03T13:02:24.597060+0000 mgr.smithi089.ppezhp (mgr.14169) 64 : cluster [DBG] pgmap v34: 1 pgs: 1 unknown; 0 B data, 0 B used, 0 B / 0 B avail
audit 2020-08-03T13:02:25.255138+0000 mon.smithi089 (mon.0) 278 : audit [INF] from='client.? 172.21.15.89:0/2356875247' entity='client.admin' cmd='[{"prefix": "osd crush tunables", "profile": "default"}]': finished
cluster 2020-08-03T13:02:25.255187+0000 mon.smithi089 (mon.0) 279 : cluster [DBG] osdmap e7: 0 total, 0 up, 0 in
audit 2020-08-03T13:02:25.817810+0000 mon.smithi089 (mon.0) 282 : audit [DBG] from='mgr.14169 172.21.15.89:0/385454869' entity='mgr.smithi089.ppezhp' cmd=[{"prefix": "config get", "who": "mon", "key": "public_network"}]: dispatch
audit 2020-08-03T13:02:25.818730+0000 mon.smithi089 (mon.0) 283 : audit [INF] from='mgr.14169 172.21.15.89:0/385454869' entity='mgr.smithi089.ppezhp' cmd=[{"prefix": "auth get", "entity": "mon."}]: dispatch
audit 2020-08-03T13:02:25.819219+0000 mon.smithi089 (mon.0) 284 : audit [DBG] from='mgr.14169 172.21.15.89:0/385454869' entity='mgr.smithi089.ppezhp' cmd=[{"prefix": "config get", "who": "mon", "key": "public_network"}]: dispatch
audit 2020-08-03T13:02:25.819717+0000 mon.smithi089 (mon.0) 285 : audit [DBG] from='mgr.14169 172.21.15.89:0/385454869' entity='mgr.smithi089.ppezhp' cmd=[{"prefix": "config generate-minimal-conf"}]: dispatch
audit 2020-08-03T13:02:25.820440+0000 mon.smithi089 (mon.0) 286 : audit [DBG] from='mgr.14169 172.21.15.89:0/385454869' entity='mgr.smithi089.ppezhp' cmd=[{"prefix": "config get", "who": "mon.smithi193", "key": "container_image"}]: dispatch
cephadm 2020-08-03T13:02:25.820087+0000 mgr.smithi089.ppezhp (mgr.14169) 65 : cephadm [INF] Deploying daemon mon.smithi193 on smithi193
cluster 2020-08-03T13:02:26.597340+0000 mgr.smithi089.ppezhp (mgr.14169) 66 : cluster [DBG] pgmap v36: 1 pgs: 1 unknown; 0 B data, 0 B used, 0 B / 0 B avail
Error EINVAL: name mon.smithi193 already in use
Actions #11

Updated by Sebastian Wagner over 3 years ago

  • Has duplicate Bug #46854: Error ENOENT: name mon.smithi074 already in use seen on octopus added
Actions #12

Updated by Sebastian Wagner over 3 years ago

  • Has duplicate Bug #47709: orchestrator._interface.OrchestratorValidationError: name mon.c already in use added
Actions #13

Updated by Sebastian Wagner over 3 years ago

OK:

We must not call `daemon add` in Teuthology. It's just too low level and not meant to be idempotent.

It would be great if someone wanted to take this.

Actions #14

Updated by Juan Miguel Olmo Martínez over 3 years ago

  • Assignee set to Juan Miguel Olmo Martínez
Actions #15

Updated by Deepika Upadhyay about 3 years ago

...in _create_daemons
2021-02-28T12:07:16.867 INFO:journalctl@ceph.mgr.y.smithi014.stdout:Feb 28 12:07:16 smithi014 bash[11619]:     forcename=name)
2021-02-28T12:07:16.867 INFO:journalctl@ceph.mgr.y.smithi014.stdout:Feb 28 12:07:16 smithi014 bash[11619]:   File "/usr/share/ceph/mgr/cephadm/module.py", line 535, in get_unique_name
2021-02-28T12:07:16.867 INFO:journalctl@ceph.mgr.y.smithi014.stdout:Feb 28 12:07:16 smithi014 bash[11619]:     f'name {daemon_type}.{forcename} already in use')
2021-02-28T12:07:16.867 INFO:journalctl@ceph.mgr.y.smithi014.stdout:Feb 28 12:07:16 smithi014 bash[11619]: orchestrator._interface.OrchestratorValidationError: name mon.c already in use
2021-02-28T12:07:16.868 INFO:journalctl@ceph.mgr.y.smithi014.stdout:Feb 28 12:07:16 smithi014 bash[11619]: debug 2021-02-28T12:07:16.470+0000 7f10ff1a3700 -1 mgr.server reply reply (22) Invalid argument name mon.c already in use
2021-02-28T12:07:17.178 INFO:journalctl@ceph.mon.b.smithi063.stdout:Feb 28 12:07:16 smithi063 bash[11262]: audit 2021-02-28T12:07:16.458970+0000 mon.a (mon.0) 172 : audit [DBG] from='mgr.14138 v1:172.21.15.14:0/505964003' entity='mgr.y' cmd=[{"prefix": "config get", "who": "mon", "key": "container_image"}]: dispatch
2021-02-28T12:07:17.178 INFO:journalctl@ceph.mon.b.smithi063.stdout:Feb 28 12:07:16 smithi063 bash[11262]: audit 2021-02-28T12:07:16.461861+0000 mon.a (mon.0) 173 : audit [INF] from='mgr.14138 v1:172.21.15.14:0/505964003' entity='mgr.y'
2021-02-28T12:07:17.363 INFO:journalctl@ceph.mon.a.smithi014.stdout:Feb 28 12:07:16 smithi014 bash[11359]: audit 2021-02-28T12:07:16.458970+0000 mon.a (mon.0) 172 : audit [DBG] from='mgr.14138 v1:172.21.15.14:0/505964003' entity='mgr.y' cmd=[{"prefix": "config get", "who": "mon", "key": "container_image"}]: dispatch
2021-02-28T12:07:17.364 INFO:journalctl@ceph.mon.a.smithi014.stdout:Feb 28 12:07:16 smithi014 bash[11359]: audit 2021-02-28T12:07:16.461861+0000 mon.a (mon.0) 173 : audit [INF] from='mgr.14138 v1:172.21.15.14:0/505964003' entity='mgr.y'
2021-02-28T12:07:17.492 DEBUG:teuthology.orchestra.run:got remote process result: 22
2021-02-28T12:07:17.493 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):

/ceph/teuthology-archive/teuthology-2021-02-24_03:30:04-rados-octopus-distro-basic-smithi/5910301/teuthology.log
Actions #16

Updated by Sage Weil about 3 years ago

  • Status changed from New to Fix Under Review
  • Backport changed from octopus to pacific
  • Pull request ID set to 40314

We can't rely on `orch apply mon ...` in octopus, at least not with the current scheduler, since it can't handle multiple daemons of the same type on the same host.

Actions #17

Updated by Deepika Upadhyay about 3 years ago

  description: rados/thrash-old-clients/{0-size-min-size-overrides/3-size-2-min-size
    1-install/luminous backoff/peering_and_degraded ceph clusters/{openstack three-plus-one}
    d-balancer/off distro$/{ubuntu_18.04} msgr-failures/osd-delay rados thrashers/default
    thrashosds-health workloads/test_rbd_api}

2021-04-01T16:44:18.342 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]: Traceback (most recent call last):
2021-04-01T16:44:18.343 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]:   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 294, in _finalize
2021-04-01T16:44:18.343 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]:     next_result = self._on_complete(self._value)
2021-04-01T16:44:18.343 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]:   File "/usr/share/ceph/mgr/cephadm/module.py", line 107, in <lambda>
2021-04-01T16:44:18.344 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]:     return CephadmCompletion(on_complete=lambda _: f(*args, **kwargs))
2021-04-01T16:44:18.344 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]:   File "/usr/share/ceph/mgr/cephadm/module.py", line 1917, in add_mon
2021-04-01T16:44:18.344 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]:     return self._add_daemon('mon', spec, self.mon_service.prepare_create)
2021-04-01T16:44:18.344 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]:   File "/usr/share/ceph/mgr/cephadm/module.py", line 1860, in _add_daemon
2021-04-01T16:44:18.344 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]:     create_func, config_func)
2021-04-01T16:44:18.345 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]:   File "/usr/share/ceph/mgr/cephadm/module.py", line 1880, in _create_daemons
2021-04-01T16:44:18.345 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]:     forcename=name)
2021-04-01T16:44:18.345 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]:   File "/usr/share/ceph/mgr/cephadm/module.py", line 535, in get_unique_name
2021-04-01T16:44:18.345 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]:     f'name {daemon_type}.{forcename} already in use')
2021-04-01T16:44:18.346 INFO:journalctl@ceph.mgr.y.smithi033.stdout:Apr 01 16:44:17 smithi033 bash[12352]: orchestrator._interface.OrchestratorValidationError: name mon.b already in use

/ceph/teuthology-archive/yuriw-2021-04-01_15:23:17-rados-wip-yuri-testing-2021-03-31-1516-octopus-distro-basic-smithi/6015105/teuthology.log
Actions #18

Updated by Sebastian Wagner about 3 years ago

  • Status changed from Fix Under Review to Closed
Actions #19

Updated by Sebastian Wagner over 2 years ago

  • Related to Bug #52742: octopus: orchestrator._interface.OrchestratorValidationError: name mon.c already in use added
Actions #20

Updated by Laura Flores almost 2 years ago

  • Status changed from Closed to Pending Backport
  • Backport changed from pacific to pacific, octopus
Actions #21

Updated by Laura Flores almost 2 years ago

/a/yuriw-2022-04-26_20:58:55-rados-wip-yuri2-testing-2022-04-26-1132-octopus-distro-default-smithi/6807469

Actions #22

Updated by Redouane Kachach Elhichou almost 2 years ago

  • Status changed from Pending Backport to Resolved