Bug #41525: mgr/dashboard: Missing service metadata is not handled correctly - Dashboard - Ceph

Actions

Copy link

Bug #41525

closed

mgr/dashboard: Missing service metadata is not handled correctly

Added by Volker Theile over 4 years ago. Updated about 3 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

Volker Theile

Category:

General - Back-end

Target version:

Ceph - v15.0.0

% Done:

Source:

Development

Tags:

Backport:

nautilus

Regression:

Severity:

3 - minor

Reviewed:

Affected Versions:

Ceph - v14.2.3, Ceph - v15.0.0

ceph-qa-suite:

Pull request ID:

31231

Crash signature (v1):

Crash signature (v2):

Description

The Dashboard REST API throws a Python error when fetching metadata for service mds.X fails. Instead of a Python error an exception should be thrown.

2019-08-27T08:14:23.158+0000 7f5f19032700 -1 mgr get_metadata_python Requested missing service mds.a
2019-08-27T08:14:23.162+0000 7f5f19032700  0 mgr[dashboard] [27/Aug/2019:08:14:23] HTTP 
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/cherrypy/_cprequest.py", line 628, in respond
    self._do_respond(path_info)
  File "/usr/lib/python3.7/site-packages/cherrypy/_cprequest.py", line 687, in _do_respond
    response.body = self.handler()
  File "/usr/lib/python3.7/site-packages/cherrypy/lib/encoding.py", line 219, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/usr/lib/python3.7/site-packages/cherrypy/_cptools.py", line 230, in wrap
    return self.newhandler(innerfunc, *args, **kwargs)
  File "/ceph/src/pybind/mgr/dashboard/services/exception.py", line 88, in dashboard_exception_handler
    return handler(*args, **kwargs)
  File "/usr/lib/python3.7/site-packages/cherrypy/_cpdispatch.py", line 54, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/ceph/src/pybind/mgr/dashboard/controllers/__init__.py", line 645, in inner
    ret = func(*args, **kwargs)
  File "/ceph/src/pybind/mgr/dashboard/controllers/__init__.py", line 838, in wrapper
    return func(*vpath, **params)
  File "/ceph/src/pybind/mgr/dashboard/controllers/cephfs.py", line 32, in get
    return self.fs_status(fs_id)
  File "/ceph/src/pybind/mgr/dashboard/controllers/cephfs.py", line 179, in fs_status
    mds_versions[metadata.get('ceph_version', 'unknown')].append(
AttributeError: 'NoneType' object has no attribute 'get'

Related issues 7 (0 open — 7 closed)

Related to CephFS - Bug #41538: mds: wrong compat can cause MDS to be added daemon registry on mgr but not the fsmap

Resolved

Patrick Donnelly

Actions

Related to mgr - Bug #23967: ceph fs status and Dashboard fail with Python stack trace

Duplicate

05/02/2018

Actions

Related to CephFS - Bug #42494: ceph: config show can't locate mds

Resolved

Sage Weil

Actions

Related to CephFS - Bug #36370: add information about active scrubs to "ceph -s" (and elsewhere)

Resolved

Venky Shankar

Actions

Has duplicate mgr - Bug #41674: tasks.mgr.dashboard.test_cephfs.CephfsTest fails, metadata is None

Duplicate

09/05/2019

Actions

Has duplicate Dashboard - Bug #41798: mgr/dashboard: Error 500 when clicking the MDS host on the Filesystems page (Raising "AttributeError: 'NoneType' object has no attribute 'get'")

Duplicate

Actions

Copied to Dashboard - Backport #42569: nautilus: mgr/dashboard: Missing service metadata is not handled correctly

Resolved

Venky Shankar

Actions

Copy link

Updated by Volker Theile over 4 years ago

Status changed from New to In Progress

Actions

Copy link

Updated by Volker Theile over 4 years ago

Status changed from In Progress to Fix Under Review
Pull request ID set to 29924

Actions

Copy link

Updated by Kefu Chai over 4 years ago

when an mds boots, it sends a mdsbeacon to mon to get it added to the fsmap. but monitor rejected because it thought that the mds was not compatible with the fsmap.

2019-08-27T07:26:33.741+0000 7f7e93435700  5 mon.a@0(leader).mds e3 preprocess_beacon mdsbeacon(4302/a up:boot seq 1 v0) v7 from mds.? [v2:172.21.4.106:6810/3881629364,v1:172.21.4.106:6812/3881629364] com
pat={},rocompat={},incompat={}
2019-08-27T07:26:33.741+0000 7f7e93435700  1 mon.a@0(leader).mds e3  mds mds.? [v2:172.21.4.106:6810/3881629364,v1:172.21.4.106:6812/3881629364] can't write to fsmap compat={},rocompat={},incompat={1=base
 v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
2019-08-27T07:26:33.741+0000 7f7e93435700 10 mon.a@0(leader) e1 no_reply to mds.? v2:172.21.4.106:6810/3881629364 mdsbeacon(4302/a up:boot seq 1 v0) v7

so even though mds managed to be added to daemon registry on mgr, it cannot update its status. because when mgr gets an update from mds, it queries mon to see if it's in the fsmap. and mon just replied that the mds did not exist:

2019-08-27T07:26:33.744+0000 7f7e93435700  1 -- [v2:172.21.4.106:3300/0,v1:172.21.4.106:6789/0] <== mgr.4111 172.21.4.106:0/6595 77 ==== mon_command({"prefix": "mds metadata", "who": "a"} v 0) v1 ==== 80+
0+0 (crc 0 0 0) 0x55678618da00 con 0x556785de7400
2019-08-27T07:26:33.744+0000 7f7e93435700 20 mon.a@0(leader) e1 _ms_dispatch existing session 0x556785e47500 for mgr.4111
2019-08-27T07:26:33.744+0000 7f7e93435700 20 mon.a@0(leader) e1  caps allow profile mgr
2019-08-27T07:26:33.744+0000 7f7e93435700  0 mon.a@0(leader) e1 handle_command mon_command({"prefix": "mds metadata", "who": "a"} v 0) v1
2019-08-27T07:26:33.744+0000 7f7e93435700 20 is_capable service=mds command=mds metadata read addr 172.21.4.106:0/6595 on cap allow profile mgr
2019-08-27T07:26:33.744+0000 7f7e93435700 20  allow so far , doing grant allow profile mgr
2019-08-27T07:26:33.744+0000 7f7e93435700 20  match
2019-08-27T07:26:33.744+0000 7f7e93435700 10 mon.a@0(leader) e1 _allowed_command capable
2019-08-27T07:26:33.744+0000 7f7e93435700  0 log_channel(audit) log [DBG] : from='mgr.4111 172.21.4.106:0/6595' entity='mgr.x' cmd=[{"prefix": "mds metadata", "who": "a"}]: dispatch
2019-08-27T07:26:33.744+0000 7f7e93435700  1 -- [v2:172.21.4.106:3300/0,v1:172.21.4.106:6789/0] --> [v2:172.21.4.106:3300/0,v1:172.21.4.106:6789/0] -- log(1 entries from seq 120 at 2019-08-27T07:26:33.745
580+0000) v1 -- 0x556785f44240 con 0x5567850f1400
2019-08-27T07:26:33.745+0000 7f7e93435700 10 mon.a@0(leader).paxosservice(mdsmap 1..3) dispatch 0x55678618da00 mon_command({"prefix": "mds metadata", "who": "a"} v 0) v1 from mgr.4111 172.21.4.106:0/6595
con 0x556785de7400
2019-08-27T07:26:33.745+0000 7f7e93435700  5 mon.a@0(leader).paxos(paxos active c 1..71) is_readable = 1 - now=2019-08-27T07:26:33.745641+0000 lease_expire=2019-08-27T07:26:38.569920+0000 has v0 lc 71
2019-08-27T07:26:33.745+0000 7f7e93435700 10 mon.a@0(leader).mds e3 preprocess_query mon_command({"prefix": "mds metadata", "who": "a"} v 0) v1 from mgr.4111 172.21.4.106:0/6595
2019-08-27T07:26:33.745+0000 7f7e93435700  1 mon.a@0(leader).mds e3 all = 0
2019-08-27T07:26:33.745+0000 7f7e93435700  2 mon.a@0(leader) e1 send_reply 0x556785f5fc70 0x55678618d800 mon_command_ack([{"prefix": "mds metadata", "who": "a"}]=-22 MDS named 'a' does not exist, or is not up v3) v1
2019-08-27T07:26:33.745+0000 7f7e93435700  1 -- [v2:172.21.4.106:3300/0,v1:172.21.4.106:6789/0] --> 172.21.4.106:0/6595 -- mon_command_ack([{"prefix": "mds metadata", "who": "a"}]=-22 MDS named 'a' does not exist, or is not up v3) v1 -- 0x55678618d800 con 0x556785de7400

that's why mgr failed to return the mds' daemon status even though it's updated by corresponding mds' status report before:

2019-08-27T07:26:33.744+0000 7f1493af3700 10 mgr.server handle_open from 0x55a183210000  mds,a
2019-08-27T07:26:33.744+0000 7f1493af3700  1 -- [v2:172.21.4.106:6800/6595,v1:172.21.4.106:6801/6595] --> [v2:172.21.4.106:6810/3881629364,v1:172.21.4.106:6812/3881629364] -- mgrconfigure(period=5, thresh
old=5) v3 -- 0x55a1830e6e00 con 0x55a183210000
2019-08-27T07:26:33.744+0000 7f14b0e45700  1 --2- [v2:172.21.4.106:6800/6595,v1:172.21.4.106:6801/6595] >> [v2:172.21.4.106:6811/1098026670,v1:172.21.4.106:6813/1098026670] conn(0x55a182f3d800 0x55a18329e
000 crc :-1 s=READY pgs=4 cs=0 l=1 rx=0 tx=0).ready entity=mds.? client_cookie=0 server_cookie=0 in_seq=0 out_seq=0
2019-08-27T07:26:33.744+0000 7f1493af3700  1 -- [v2:172.21.4.106:6800/6595,v1:172.21.4.106:6801/6595] <== mds.? v2:172.21.4.106:6810/3881629364 2 ==== mgrreport(mds.a +0-0 packed 6) v8 ==== 41+0+0 (crc 0
0 0) 0x55a18058f180 con 0x55a183210000
2019-08-27T07:26:33.744+0000 7f1493af3700 10 mgr.server handle_report from 0x55a183210000 mds,a
2019-08-27T07:26:33.744+0000 7f1493af3700  5 mgr.server handle_report rejecting report from mds,a, since we do not have its metadata now.

probably it's a bug on mds side?

see /a/kchai-2019-08-27_06:58:14-rados-wip-kefu-testing-2019-08-27-1029-distro-basic-mira/4256553

Actions

Copy link

Updated by Volker Theile over 4 years ago

Thanks for the detailed explanation of the underlying behaviour.

probably it's a bug on mds side?

Yes it's surely a bug on the MDS side, but the Dashboard REST API should handle this error correctly what it doesn't at the moment. The attached PR will fix that and the Dashboard frontend will work correct, even there is a problem with fetching MDS metadata.

Actions

Copy link

Updated by Patrick Donnelly over 4 years ago

Related to Bug #41538: mds: wrong compat can cause MDS to be added daemon registry on mgr but not the fsmap added

Actions

Copy link

Updated by Volker Theile over 4 years ago

Related to Bug #23967: ceph fs status and Dashboard fail with Python stack trace added

Actions

Copy link

Updated by Sebastian Wagner over 4 years ago

Has duplicate Bug #41674: tasks.mgr.dashboard.test_cephfs.CephfsTest fails, metadata is None added

Actions

Copy link

Updated by Lenz Grimmer over 4 years ago

Has duplicate Bug #41798: mgr/dashboard: Error 500 when clicking the MDS host on the Filesystems page (Raising "AttributeError: 'NoneType' object has no attribute 'get'") added

Actions

Copy link

Updated by Kefu Chai over 4 years ago

Has duplicate Bug #42011: test_perf_counters_mds_get (tasks.mgr.dashboard.test_perf_counters.PerfCountersControllerTest) ... FAIL added

Actions

Copy link

#10

Updated by Volker Theile over 4 years ago

Status changed from Fix Under Review to 12

PR has been closed because the issue must be fixed in the manager.

Actions

Copy link

#11

Updated by Kefu Chai over 4 years ago

Has duplicate deleted (Bug #42011: test_perf_counters_mds_get (tasks.mgr.dashboard.test_perf_counters.PerfCountersControllerTest) ... FAIL)

Actions

Copy link

#12

Updated by Min Shi over 4 years ago

I think the issue is similiar with this issue (https://tracker.ceph.com/issues/40197). When mds restart and at the same time the leader monitor stoped, it may cause mds metadata updated wrongly.

Actions

Copy link

#13

Updated by Patrick Donnelly over 4 years ago

Status changed from 12 to Fix Under Review
Pull request ID changed from 29924 to 31231

Actions

Copy link

#14

Updated by Sebastian Wagner over 4 years ago

Related to Bug #42494: ceph: config show can't locate mds added

Actions

Copy link

#15

Updated by Patrick Donnelly over 4 years ago

Status changed from Fix Under Review to Pending Backport
Start date deleted (~~08/27/2019~~)
Backport changed from nautilus to nautilus,mimic

Actions

Copy link

#16

Updated by Patrick Donnelly over 4 years ago

Backport note: https://tracker.ceph.com/issues/42494#note-5

Actions

Copy link

#18

Updated by Patrick Donnelly over 4 years ago

Copied to Backport #42569: nautilus: mgr/dashboard: Missing service metadata is not handled correctly added

Actions

Copy link

#19

Updated by Patrick Donnelly over 4 years ago

Backport changed from nautilus,mimic to nautilus

#36370 will not be backported to Mimic.

Actions

Copy link

#20

Updated by Patrick Donnelly over 4 years ago

Related to Bug #36370: add information about active scrubs to "ceph -s" (and elsewhere) added

Actions

Copy link

#21

Updated by Nathan Cutler about 4 years ago

Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".

Actions

Copy link

#22

Updated by Ernesto Puerta about 3 years ago

Project changed from mgr to Dashboard
Category changed from 146 to General - Back-end

Actions

Copy link

Also available in: Atom PDF

Project

General

Profile

Ceph » mgr » Dashboard

Custom queries

Bug #41525

mgr/dashboard: Missing service metadata is not handled correctly

Updated by Volker Theile over 4 years ago

Updated by Volker Theile over 4 years ago

Updated by Kefu Chai over 4 years ago

Updated by Volker Theile over 4 years ago

Updated by Patrick Donnelly over 4 years ago

Updated by Volker Theile over 4 years ago

Updated by Sebastian Wagner over 4 years ago

Updated by Lenz Grimmer over 4 years ago

Updated by Kefu Chai over 4 years ago

Updated by Volker Theile over 4 years ago

Updated by Kefu Chai over 4 years ago

Updated by Min Shi over 4 years ago

Updated by Patrick Donnelly over 4 years ago

Updated by Sebastian Wagner over 4 years ago

Updated by Patrick Donnelly over 4 years ago

Updated by Patrick Donnelly over 4 years ago

Updated by Patrick Donnelly over 4 years ago

Updated by Patrick Donnelly over 4 years ago

Updated by Patrick Donnelly over 4 years ago

Updated by Nathan Cutler about 4 years ago

Updated by Ernesto Puerta about 3 years ago