Bug #22337

Prometheus exporter in the MGR daemon crashes when PGs are in recovery_wait state

Added by Subhachandra Chandra almost 2 years ago. Updated almost 2 years ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
Start date:
12/06/2017
Due date:
% Done:
0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

When the cluster has one or more PGs in the "recovery_wait" state, the prometheus exporter in the MGR daemon returns the following error and backtrace.

500 Internal Server Error

The server encountered an unexpected condition which prevented it from fulfilling the request.

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 670, in respond
    response.body = self.handler()
  File "/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 217, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cpdispatch.py", line 61, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/usr/lib/ceph/mgr/prometheus/module.py", line 386, in metrics
    metrics = global_instance().collect()
  File "/usr/lib/ceph/mgr/prometheus/module.py", line 324, in collect
    self.get_pg_status()
  File "/usr/lib/ceph/mgr/prometheus/module.py", line 266, in get_pg_status
    self.metrics[path].set(value)
KeyError: 'pg_recovery_wait'
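
For context, a minimal sketch of the failure mode (hypothetical code, not the actual mgr module; the class, dict, and state list here are assumptions): the exporter pre-registers one metric object per known PG state, so looking up a state that was never registered, such as 'recovery_wait', raises a KeyError that surfaces as a 500 from CherryPy.

# Hypothetical reduction of the bug; names and the state list are assumptions.
KNOWN_PG_STATES = ['active', 'clean', 'scrubbing', 'degraded']

class Metric(object):
    def __init__(self):
        self.value = 0

    def set(self, value):
        self.value = value

# Metric objects exist only for the states registered up front.
metrics = {'pg_' + s: Metric() for s in KNOWN_PG_STATES}

def get_pg_status(pg_summary):
    # pg_summary maps a '+'-joined state string to a PG count,
    # e.g. {'active+recovery_wait+degraded': 1}
    for states, count in pg_summary.items():
        for state in states.split('+'):
            path = 'pg_' + state
            metrics[path].set(count)  # KeyError: 'pg_recovery_wait'

get_pg_status({'active+recovery_wait+degraded': 1})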


Related issues

Duplicates: mgr - Bug #22116: prometheus module 500 if 'deep' in pg states (Resolved, 11/13/2017)

History

#1 Updated by Subhachandra Chandra almost 2 years ago

This is definitely not a "bluestore" bug; I'm not sure how it ended up classified under that project when I hit the "New Issue" link.

#2 Updated by Subhachandra Chandra almost 2 years ago

Found another backtrace with a PG in the 'deep scrubbing' state:

  data:
    pools:   1 pools, 4096 pgs
    objects: 464k objects, 59393 GB
    usage:   89488 GB used, 240 TB / 327 TB avail
    pgs:     4094 active+clean
             1    active+clean+scrubbing
             1    active+clean+scrubbing+deep

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 670, in respond
    response.body = self.handler()
  File "/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 217, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cpdispatch.py", line 61, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/usr/lib/ceph/mgr/prometheus/module.py", line 386, in metrics
    metrics = global_instance().collect()
  File "/usr/lib/ceph/mgr/prometheus/module.py", line 324, in collect
    self.get_pg_status()
  File "/usr/lib/ceph/mgr/prometheus/module.py", line 266, in get_pg_status
    self.metrics[path].set(value)
KeyError: 'pg_deep'

#3 Updated by Shinobu Kinjo almost 2 years ago

  • Project changed from bluestore to mgr

#4 Updated by John Spray almost 2 years ago

  • Duplicates Bug #22116: prometheus module 500 if 'deep' in pg states added

#5 Updated by John Spray almost 2 years ago

  • Status changed from New to Duplicate

Duplicate of http://tracker.ceph.com/issues/22116, which is fixed in master and pending a backport to luminous.
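
For illustration, a defensive guard of the kind such a fix might apply (a sketch using the same hypothetical names as the reduction above; the actual patch for #22116 may instead register the complete list of PG states up front):

def get_pg_status(pg_summary):
    for states, count in pg_summary.items():
        for state in states.split('+'):
            path = 'pg_' + state
            # Skip states that were never registered instead of letting
            # the KeyError turn into a 500 from the exporter.
            if path not in metrics:
                continue
            metrics[path].set(count)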
