Bug #43008

closed

mgr/dashboard: a failure in rbd-mirror makes other dashboard pages fail

Added by Ernesto Puerta over 4 years ago. Updated about 3 years ago.

Status: Duplicate
Priority: Normal
Category: Component - RBD Mirroring
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

During QE testing of a build upgrade, the previous rbd-mirror daemon hung and a new one started running. Although this situation is external to the dashboard, it caused failures not only in the RBD-related pages, but also in unrelated ones such as Pools and Hosts.

The cause is that the summary endpoint raises an exception:

traceback: "Traceback (most recent call last):
  File "/lib/python3.6/site-packages/cherrypy/_cprequest.py", line 670, in respond
    response.body = self.handler()
  File "/lib/python3.6/site-packages/cherrypy/lib/encoding.py", line 220, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/lib/python3.6/site-packages/cherrypy/_cptools.py", line 237, in wrap
    return self.newhandler(innerfunc, *args, **kwargs)
  File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 88, in dashboard_exception_handler
    return handler(*args, **kwargs)
  File "/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 60, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line 649, in inner
    ret = func(*args, **kwargs)
  File "/usr/share/ceph/mgr/dashboard/controllers/summary.py", line 86, in __call__
    result['rbd_mirroring'] = self._rbd_mirroring()
  File "/usr/share/ceph/mgr/dashboard/controllers/summary.py", line 22, in _rbd_mirroring
    _, data = get_daemons_and_pools()
  File "/usr/share/ceph/mgr/dashboard/tools.py", line 244, in wrapper
    return rvc.run(fn, args, kwargs)
  File "/usr/share/ceph/mgr/dashboard/tools.py", line 226, in run
    raise self.exception
  File "/usr/share/ceph/mgr/dashboard/tools.py", line 147, in run
    val = self.fn(*self.args, **self.kwargs)
  File "/usr/share/ceph/mgr/dashboard/controllers/rbd_mirroring.py", line 185, in get_daemons_and_pools
    daemons = get_daemons()
  File "/usr/share/ceph/mgr/dashboard/controllers/rbd_mirroring.py", line 56, in get_daemons
    status = json.loads(status['json'])
TypeError: 'NoneType' object is not subscriptable

While the dashboard cannot and (IMHO) shouldn't handle every possible failure in core Ceph components, it should at least:
  • be resilient to those failures, or
  • if that is not possible, keep failures from impacting other components (fault confinement).
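
As an illustration of fault confinement, here is a minimal sketch of a summary builder that isolates each section. It is not the dashboard's actual code: `build_summary` and the section names are hypothetical, and it assumes every section can be produced by a zero-argument callable.

```python
import logging

logger = logging.getLogger(__name__)


def build_summary(sections):
    """Collect each summary section independently (hypothetical helper).

    `sections` maps a section name to a zero-argument callable. A failing
    section is logged and reported as None instead of aborting the response.
    """
    result = {}
    errors = {}
    for name, provider in sections.items():
        try:
            result[name] = provider()
        except Exception as exc:  # confine the failure to this section
            logger.exception("summary section %r failed", name)
            result[name] = None
            errors[name] = str(exc)
    return result, errors


def broken_rbd_mirroring():
    # Mimics the failure in the traceback: the status is None.
    status = None
    return status['json']


summary, errors = build_summary({
    'health': lambda: 'HEALTH_OK',
    'rbd_mirroring': broken_rbd_mirroring,
})
```

With this shape, a hung rbd-mirror daemon would only blank out the `rbd_mirroring` entry, and pages that do not depend on it could still render.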

The error described in this specific issue is easy to fix (catch the TypeError exception). However, this approach is hard to maintain across the whole dashboard codebase: it would result in defensive programming, with try-excepts scattered around every line of code.
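
For reference, the local fix could look roughly like the sketch below. `parse_daemon_status` is a hypothetical helper, not the function in rbd_mirroring.py; it only assumes, as the traceback shows, that the status is either None or a dict carrying a 'json' string.

```python
import json


def parse_daemon_status(status):
    """Return the decoded daemon status, or None if nothing usable was reported.

    Hypothetical helper wrapping the expression that fails in the traceback.
    """
    try:
        return json.loads(status['json'])
    except (TypeError, KeyError, ValueError):
        # TypeError: status (or status['json']) is None, as in this bug
        # KeyError: the 'json' key is missing
        # ValueError: the payload is not valid JSON
        return None
```

The caller would then need to skip daemons for which this returns None, which is exactly the kind of per-call-site handling that does not scale.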

A possible solution would be to add a validation and data adaptation layer between the ceph-mgr API and the dashboard back-end. This layer would validate the incoming data against a schema and provide a single place to encode the fallback behaviour for validation failures (vs. scattered handling logic).
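
A rough sketch of such a layer, using only the standard library, is shown below. Everything here is an assumption made for illustration: the schema format (field name mapped to expected type), `DAEMON_STATUS_SCHEMA`, `validate` and `adapt_daemon_status` are hypothetical names, and a real implementation would probably use a proper schema library.

```python
import json

# Hypothetical schema: field name -> expected Python type.
DAEMON_STATUS_SCHEMA = {'json': str}


def validate(raw, schema):
    """Return True if `raw` is a dict holding every schema field with the right type."""
    if not isinstance(raw, dict):
        return False
    return all(isinstance(raw.get(key), expected)
               for key, expected in schema.items())


def adapt_daemon_status(raw, fallback=None):
    """Validate the raw ceph-mgr data and decode it, or return `fallback`.

    The fallback policy lives here, in one place, instead of at each call site.
    """
    if not validate(raw, DAEMON_STATUS_SCHEMA):
        return fallback
    try:
        return json.loads(raw['json'])
    except ValueError:  # malformed JSON payload
        return fallback
```

Controllers would then call `adapt_daemon_status(status, fallback={})` and never see a None status or a malformed payload.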


Related issues 1 (0 open, 1 closed)

Is duplicate of Dashboard - Bug #43029: mgr/dashboard: RBD mirroring page results in "500 - internal server error" (Resolved, Jason Dillaman)

Actions #1

Updated by Ernesto Puerta over 4 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 31881
Actions #2

Updated by Ernesto Puerta over 4 years ago

  • Assignee set to Ernesto Puerta
Actions #3

Updated by Ernesto Puerta over 4 years ago

  • Status changed from Fix Under Review to Duplicate
Actions #4

Updated by Ricardo Marques over 4 years ago

  • Is duplicate of Bug #43029: mgr/dashboard: RBD mirroring page results in "500 - internal server error" added
Actions #5

Updated by Ernesto Puerta about 3 years ago

  • Project changed from mgr to Dashboard
  • Category changed from 140 to Component - RBD Mirroring