Bug #23752
Exceptions on two datapoints with same timestamp (closed)
Status: Can't reproduce
Priority: Normal
Assignee: -
Category: ceph-mgr
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
From list thread: "[ceph-users] ZeroDivisionError: float division by zero in /usr/lib/ceph/mgr/dashboard/module.py (12.2.4)"
ceph-mgr[1324]: [15/Apr/2018:09:47:12] HTTP Traceback (most recent call last):
ceph-mgr[1324]:   File "/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 670, in respond
ceph-mgr[1324]:     response.body = self.handler()
ceph-mgr[1324]:   File "/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 217, in __call__
ceph-mgr[1324]:     self.body = self.oldhandler(*args, **kwargs)
ceph-mgr[1324]:   File "/usr/lib/python2.7/dist-packages/cherrypy/lib/jsontools.py", line 63, in json_handler
ceph-mgr[1324]:     value = cherrypy.serving.request._json_inner_handler(*args, **kwargs)
ceph-mgr[1324]:   File "/usr/lib/python2.7/dist-packages/cherrypy/_cpdispatch.py", line 61, in __call__
ceph-mgr[1324]:     return self.callable(*self.args, **self.kwargs)
ceph-mgr[1324]:   File "/usr/lib/ceph/mgr/dashboard/module.py", line 991, in list_data
ceph-mgr[1324]:     return self._osds_by_server()
ceph-mgr[1324]:   File "/usr/lib/ceph/mgr/dashboard/module.py", line 1040, in _osds_by_server
ceph-mgr[1324]:     osd_map.osds_by_id[osd_id])
ceph-mgr[1324]:   File "/usr/lib/ceph/mgr/dashboard/module.py", line 1007, in _osd_summary
ceph-mgr[1324]:     result['stats'][s.split(".")[1]] = global_instance().get_rate('osd', osd_spec, s)
ceph-mgr[1324]:   File "/usr/lib/ceph/mgr/dashboard/module.py", line 268, in get_rate
ceph-mgr[1324]:     return (data[-1][1] - data[-2][1]) / float(data[-1][0] - data[-2][0])
ceph-mgr[1324]: ZeroDivisionError: float division by zero
I'm opening this as a ceph-mgr bug rather than in a specific module, because it strikes me as odd that we had two datapoints with the same timestamp (hence zero delta and the exception) to begin with.
I wonder whether something during a failure is triggering a too-quick resend of stats, for example when sessions bounce?
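Independent of the root cause, the division in get_rate could be guarded so that two samples with the same timestamp do not take down the HTTP handler. The following is only a sketch of that idea, not the actual dashboard code or a proposed patch: the function name safe_rate is hypothetical, it takes the sample list directly instead of looking it up as the real get_rate does, and returning 0.0 for a zero or negative delta is an assumed policy.

```python
def safe_rate(data):
    """Per-second rate from the last two (timestamp, value) samples.

    Hypothetical helper, not part of ceph-mgr. Assumes `data` is a
    sequence of (timestamp, value) pairs, oldest first, which is the
    shape implied by the expression in the traceback.
    """
    # Fewer than two samples: no rate can be computed.
    if len(data) < 2:
        return 0.0

    (t_prev, v_prev), (t_last, v_last) = data[-2], data[-1]
    delta_t = t_last - t_prev

    # Duplicate (or out-of-order) timestamps: assumed policy is to
    # report 0.0 instead of raising ZeroDivisionError.
    if delta_t <= 0:
        return 0.0

    return (v_last - v_prev) / float(delta_t)
```

With this guard, safe_rate([(10, 100), (10, 150)]) yields 0.0 where the original expression raises. It only hides the symptom; why two datapoints carried the same timestamp still needs explaining.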