Bug #52454
mgr/cephadm: orch maintenance enter command failed
Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
cephadm
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Description
I have a cluster with 3 hosts and 3 mons. When I ran the orch host maintenance enter command against the host running only a mon, it failed with this error:
Aug 30 11:13:54 ceph-node-00.cephlab.com ceph-mgr[13294]: log_channel(audit) log [DBG] : from='client.14199 -' entity='client.admin' cmd=[{"prefix": "orch host maintenance enter", "hostname": "ceph-node-02.cephlab.com", "target": ["mon-mgr", ""]}]: dispatch
Aug 30 11:13:54 ceph-node-00.cephlab.com ceph-mgr[13294]: [cephadm INFO asyncio] poll took 35226.431 ms: 1 events
Aug 30 11:13:54 ceph-node-00.cephlab.com ceph-mgr[13294]: log_channel(cephadm) log [INF] : poll took 35226.431 ms: 1 events
Aug 30 11:13:54 ceph-node-00.cephlab.com ceph-mgr[13294]: log_channel(cluster) log [DBG] : pgmap v121: 0 pgs: ; 0 B data, 0 B used, 0 B / 0 B avail
Aug 30 11:13:54 ceph-node-00.cephlab.com conmon[13289]: 2021-08-30T11:13:54.561+0000 7eff25323700 -1 mgr.server reply reply (22) Invalid argument Failed to place ceph-node-02.cephlab.com into maintenance for cluster 261bbac6-0982-11ec-9a2b-5254007ec620
Aug 30 11:13:54 ceph-node-00.cephlab.com conmon[13289]:
Aug 30 11:13:54 ceph-node-00.cephlab.com ceph-mgr[13294]: mgr.server reply reply (22) Invalid argument Failed to place ceph-node-02.cephlab.com into maintenance for cluster 261bbac6-0982-11ec-9a2b-5254007ec620
A similar error occurred while taking a host out of maintenance (I had previously put that host into maintenance).
Output of the maintenance exit command:
Sep 06 15:39:22 ceph-node-00.cephlab.com ceph-mgr[13555]: log_channel(audit) log [DBG] : from='client.14178 -' entity='client.admin' cmd=[{"prefix": "orch host maintenance exit", "hostname": "ceph-node-01.cephlab.com", "target": ["mon-mgr", ""]}]: dispatch
Sep 06 15:39:22 ceph-node-00.cephlab.com ceph-mgr[13555]: [cephadm INFO asyncio] poll took 20869.058 ms: 1 events
Sep 06 15:39:22 ceph-node-00.cephlab.com ceph-mgr[13555]: log_channel(cephadm) log [INF] : poll took 20869.058 ms: 1 events
Sep 06 15:39:22 ceph-node-00.cephlab.com conmon[13550]: 2021-09-06T15:39:22.765+0000 7fc952d5f700 -1 mgr.server reply reply (22) Invalid argument Failed to exit maintenance state for host ceph-node-01.cephlab.com, cluster 3e76f40a-0f27-11ec-b612-525400d48f70
Sep 06 15:39:22 ceph-node-00.cephlab.com conmon[13550]:
Sep 06 15:39:22 ceph-node-00.cephlab.com ceph-mgr[13555]: mgr.server reply reply (22) Invalid argument Failed to exit maintenance state for host ceph-node-01.cephlab.com, cluster 3e76f40a-0f27-11ec-b612-525400d48f70
Sep 06 15:39:23 ceph-node-00.cephlab.com ceph-mgr[13555]: log_channel(cluster) log [DBG] : pgmap v204: 0 pgs: ; 0 B data, 0 B used, 0 B / 0 B avail
From the dashboard:
Sep 06 15:37:04 ceph-node-00.cephlab.com ceph-mgr[13555]: [dashboard ERROR exception] Dashboard Exception
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 89, in handle_orchestrator_error
    yield
  File "/lib64/python3.6/contextlib.py", line 52, in inner
    return func(*args, **kwds)
  File "/usr/share/ceph/mgr/dashboard/controllers/host.py", line 431, in set
    orch.hosts.exit_maintenance(hostname)
  File "/usr/share/ceph/mgr/dashboard/services/orchestrator.py", line 38, in inner
    raise_if_exception(completion)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 224, in raise_if_exception
    raise e
orchestrator._interface.OrchestratorError: Failed to exit maintenance state for host ceph-node-01.cephlab.com, cluster 3e76f40a-0f27-11ec-b612-525400d48f70

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 46, in dashboard_exception_handler
    return handler(*args, **kwargs)
  File "/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 54, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line 717, in inner
    ret = func(*args, **kwargs)
  File "/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line 949, in wrapper
    return func(*vpath, **params)
  File "/usr/share/ceph/mgr/dashboard/controllers/orchestrator.py", line 33, in _inner
    return method(self, *args, **kwargs)
  File "/lib64/python3.6/contextlib.py", line 52, in inner
    return func(*args, **kwds)
  File "/lib64/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 91, in handle_orchestrator_error
    raise DashboardException(e, component=component)
dashboard.exceptions.DashboardException: Failed to exit maintenance state for host ceph-node-01.cephlab.com, cluster 3e76f40a-0f27-11ec-b612-525400d48f70
The mon goes down, and I've also noticed other services on that host going into an error state (not sure if that's related).
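For context, cephadm runs an "ok-to-stop"-style safety check before placing a host into maintenance, and a host carrying one of the mons can trip it when stopping that mon would threaten quorum. The sketch below is a simplified illustration of that kind of guard under stated assumptions; all names (`safe_to_enter_maintenance`, `daemons_by_host`, `min_mons`) are hypothetical, and this is not the actual cephadm implementation or the fix from the linked pull request.

```python
def safe_to_enter_maintenance(host, daemons_by_host, min_mons=2):
    """Return (ok, reason).

    Refuse maintenance if stopping `host`'s daemons would leave fewer
    than `min_mons` mons running elsewhere (quorum risk). This mirrors
    the *shape* of cephadm's ok-to-stop checks, not its real code.
    """
    # Mons running on the host we want to take down.
    mons_here = [d for d in daemons_by_host.get(host, [])
                 if d.startswith("mon.")]
    # Mons that would remain running on every other host.
    mons_elsewhere = sum(
        1
        for h, daemons in daemons_by_host.items()
        if h != host
        for d in daemons
        if d.startswith("mon.")
    )
    if mons_here and mons_elsewhere < min_mons:
        return False, (f"stopping {host} would leave only "
                       f"{mons_elsewhere} running mon(s)")
    return True, ""
```

In a 3-host/3-mon layout like the one in this report, two mons remain after one host goes down, so a check of this shape would allow maintenance; a 2-host/2-mon layout would be refused. The real check in cephadm considers more daemon types than just mons.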
History
#1 Updated by Nizamudeen A over 2 years ago
- Description updated (diff)
#2 Updated by Nizamudeen A over 2 years ago
- Description updated (diff)
#3 Updated by Nizamudeen A over 2 years ago
- Description updated (diff)
#4 Updated by Nizamudeen A over 2 years ago
- Description updated (diff)
#5 Updated by Adam King over 2 years ago
- Status changed from New to Resolved
- Pull request ID set to 43275