Bug #58921
mgr crashing with dashboard module enabled in 16.2.9
Description
Hi,
We noticed some issues with the orchestrator. We added new hosts with new drives that weren't automatically detected by the orchestrator. Checking the mgr logs, I noticed it was crashing with the dashboard module enabled (maybe the path has an extra backslash in the code):
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]: Internal Server Error
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]: Traceback (most recent call last):
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]:   File "/lib/python3.6/site-packages/cherrypy/lib/static.py", line 58, in serve_file
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]:     st = os.stat(path)
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]: FileNotFoundError: [Errno 2] No such file or directory: '/usr/share/ceph/mgr/dashboard/frontend/dist/en-US/prometheus_receiver'
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]: During handling of the above exception, another exception occurred:
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]: Traceback (most recent call last):
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]:   File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 47, in dashboard_exception_handler
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]:     return handler(*args, **kwargs)
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]:   File "/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 54, in __call__
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]:     return self.callable(*self.args, **self.kwargs)
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]:   File "/usr/share/ceph/mgr/dashboard/controllers/home.py", line 135, in __call__
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]:     return serve_file(full_path)
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]:   File "/lib/python3.6/site-packages/cherrypy/lib/static.py", line 65, in serve_file
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]:     raise cherrypy.NotFound()
Feb 01 10:01:08 ds-ceph01-madrid bash[2829574]: cherrypy._cperror.NotFound: (404, "The path '/prometheus_receiver' was not found.")
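For reference, the failure mode in the traceback can be reproduced in isolation: the request path is joined onto the frontend dist directory and handed to serve_file, and os.stat() on a missing file raises FileNotFoundError before CherryPy converts it into a 404. A minimal sketch of a lookup that checks existence first (hypothetical helper for illustration, not the actual Ceph dashboard code):

```python
import os

def resolve_static(base_dir, request_path):
    """Hypothetical sketch of the static-file lookup the traceback goes
    through: join the request path onto the frontend dist directory and
    return a 404 instead of letting os.stat() raise FileNotFoundError."""
    base = os.path.abspath(base_dir)
    full_path = os.path.normpath(os.path.join(base, request_path.lstrip("/")))
    if not full_path.startswith(base + os.sep):
        return (404, None)   # path escapes the dist directory
    if not os.path.isfile(full_path):
        return (404, None)   # e.g. '/prometheus_receiver' from the log above
    return (200, full_path)
```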
After disabling the dashboard module, the new drives were detected and the new OSD containers (Docker) were deployed.
However, I've now noticed another orch issue, even with the dashboard disabled:
- I have a failed drive (osd.92).
- The drive was marked down and out, and the rebalancing completed fine.
- After the rebalancing completed, I tried to purge the OSD with "ceph orch osd rm osd.92 --force" in order to ask for a replacement.
- The purge does nothing:
ceph orch osd rm status
OSD  HOST    STATE    PGS  REPLACE  FORCE  ZAP    DRAIN STARTED AT
92   node10  started  0    False    True   False
- The OSD daemons are not refreshed:
ceph orch ps --daemon_type osd --daemon_id 92
NAME    HOST    PORTS  STATUS  REFRESHED  AGE  MEM USE  MEM LIM  VERSION    IMAGE ID
osd.92  node10         error   3d ago     4w   -        4096M    <unknown>  <unknown>
- I don't see any other errors in the mgr logs, even with debug level 20 enabled.
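For context, the removal flow I was following is roughly the standard orchestrator replacement sequence sketched below (commands as documented for cephadm clusters; whether the mgr failover actually unwedges the queue in this situation is an assumption on my part, offered as a workaround rather than a fix):

```shell
# Queue the failed OSD for removal, keeping its id for the replacement disk.
ceph orch osd rm 92 --replace --force

# Watch the removal queue (the status output shown above comes from here).
ceph orch osd rm status

# Ask the orchestrator to refresh its cached daemon inventory instead of
# waiting for the periodic refresh (the "REFRESHED 3d ago" above suggests
# the cache is stale).
ceph orch ps --refresh --daemon_type osd

# If the queue never progresses, failing over to a standby mgr restarts
# the cephadm serve loop.
ceph mgr fail
```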