Bug #53538
closed
mgr/stats: ZeroDivisionError
100%
Description
root@service-01-08020:~# ceph osd status storage-01-08002
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1623, in _handle_command
    return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 416, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/status/module.py", line 338, in handle_osd_status
    wr_ops_rate = (self.get_rate("osd", osd_id.__str__(), "osd.op_w") +
  File "/usr/share/ceph/mgr/status/module.py", line 28, in get_rate
    return (data[-1][1] - data[-2][1]) // int(data[-1][0] - data[-2][0])
ZeroDivisionError: integer division or modulo by zero
Since those PRs:
- https://github.com/ceph/ceph/pull/25337
- https://github.com/ceph/ceph/pull/26270
- https://github.com/ceph/ceph/pull/26270/files#diff-dc6485f717f4dce4863733896375af75963412ebb2abc4b62fcd1f5233eee07dR44
- https://github.com/ceph/ceph/pull/28603
- https://tracker.ceph.com/issues/43224#note-11
no one had the patience to look into this all over again.
Updated by Sebastian Wagner over 2 years ago
- Related to Feature #40365: mgr: Add get_rates_from_data from the dashboard to the mgr_util.py added
Updated by Neha Ojha over 2 years ago
- Priority changed from Normal to Urgent
[ubuntu@gibba001 ~]$ sudo ceph osd status|grep gibba043|wc -l
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1648, in _handle_command
    return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 434, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/status/module.py", line 338, in handle_osd_status
    wr_ops_rate = (self.get_rate("osd", osd_id.__str__(), "osd.op_w") +
  File "/usr/share/ceph/mgr/status/module.py", line 28, in get_rate
    return (data[-1][1] - data[-2][1]) // int(data[-1][0] - data[-2][0])
ZeroDivisionError: integer division or modulo by zero
Updated by Nitzan Mordechai about 2 years ago
From my understanding, the stats update interval is set by the mgr_stats_period configuration value. The division fails when int(data[-1][0] - data[-2][0]) == 0, where data[-1][0] and data[-2][0] hold the timestamps of the last two stats updates. The subtraction yields 0 in two cases: data[-1][0] == data[-2][0], or data[-1][0] - data[-2][0] < 1. We only check for the first condition, and both cases can only happen when the stats are updated fast enough: either mgr_stats_period = 1, or something else caused the stats to be updated less than 1 second apart.
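The two cases described above can be reproduced outside the mgr. The following is only a minimal sketch: the function name get_rate_unguarded and the sample values are hypothetical, and only the division expression is taken from the traceback.

```python
def get_rate_unguarded(data):
    """Rate between the last two (timestamp, value) samples.

    Uses the same expression as the traceback: the time delta is
    truncated with int() before the integer division.
    """
    return (data[-1][1] - data[-2][1]) // int(data[-1][0] - data[-2][0])


def raises_zero_division(data):
    """Return True if computing the rate raises ZeroDivisionError."""
    try:
        get_rate_unguarded(data)
    except ZeroDivisionError:
        return True
    return False


# Hypothetical samples: (timestamp in seconds, counter value).
same_timestamp = [(1000.0, 50), (1000.0, 80)]   # delta == 0
sub_second_gap = [(1000.2, 50), (1000.6, 80)]   # 0 < delta < 1, int() gives 0
five_second_gap = [(1000.0, 50), (1005.0, 80)]  # delta == 5

print(raises_zero_division(same_timestamp))
print(raises_zero_division(sub_second_gap))
print(get_rate_unguarded(five_second_gap))
```

A check that only compares the two timestamps for equality would catch the first sample pair but not the second, because the sub-second delta becomes zero only after int() truncates it.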
Updated by Sebastian Wagner about 2 years ago
Nitzan Mordechai wrote:
From my understanding, the stats update interval is set by the mgr_stats_period configuration value. The division fails when int(data[-1][0] - data[-2][0]) == 0, where data[-1][0] and data[-2][0] hold the timestamps of the last two stats updates. The subtraction yields 0 in two cases: data[-1][0] == data[-2][0], or data[-1][0] - data[-2][0] < 1. We only check for the first condition, and both cases can only happen when the stats are updated fast enough: either mgr_stats_period = 1, or something else caused the stats to be updated less than 1 second apart.
Yes, and we already had this problem in the dashboard. Our solution was https://github.com/ceph/ceph/pull/28603. If you just use the same function in the stats module as well, things should work properly.
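For illustration, one way to guard the rate calculation is sketched here. This is not the code from the pull request above: the helper name safe_rate is hypothetical, and returning 0.0 for a non-positive time delta is an assumption about the desired behavior.

```python
from typing import Sequence, Tuple

# (timestamp in seconds, counter value); the shape is assumed from the traceback.
Sample = Tuple[float, int]


def safe_rate(data: Sequence[Sample]) -> float:
    """Rate between the last two samples, without truncating the time delta.

    Returns 0.0 when fewer than two samples exist or when the timestamps
    do not advance, instead of raising ZeroDivisionError.
    """
    if len(data) < 2:
        return 0.0
    (t_prev, v_prev), (t_last, v_last) = data[-2], data[-1]
    dt = t_last - t_prev  # kept as a float, no int() truncation
    if dt <= 0:
        return 0.0
    return (v_last - v_prev) / dt


print(safe_rate([(1000.0, 50), (1000.5, 80)]))  # sub-second gap -> 60.0
print(safe_rate([(1000.0, 50), (1000.0, 80)]))  # identical timestamps -> 0.0
```

Keeping the time delta as a float means a sub-second gap produces a real rate rather than a zero divisor, and the explicit dt <= 0 check covers identical or out-of-order timestamps.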
Updated by Neha Ojha about 2 years ago
- Status changed from New to Fix Under Review
- Backport set to pacific,quincy
- Pull request ID set to 44752
Updated by Neha Ojha about 2 years ago
- Has duplicate Bug #54213: ceph osd status - ZeroDivisionError: integer division or modulo by zero added
Updated by Neha Ojha about 2 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot about 2 years ago
- Copied to Backport #54281: pacific: mgr/stats: ZeroDivisionError added
Updated by Backport Bot about 2 years ago
- Copied to Backport #54282: quincy: mgr/stats: ZeroDivisionError added
Updated by Backport Bot over 1 year ago
- Tags changed from low-hanging-fruit to low-hanging-fruit backport_processed
Updated by Konstantin Shalygin 4 months ago
- Status changed from Pending Backport to Resolved
- % Done changed from 0 to 100