Bug #44186
Module 'pg_autoscaler' has failed: division by zero
Description
2020-02-18T15:25:54.452+0000 7fe0f37fa700 10 module pg_autoscaler health checks: { "severity": "HEALTH_WARN", "summary": { "message": "1 pools have both target_size_bytes and target_size_ratio set", "count": 1 }, "detail": [ { "message": "Pool a has target_size_bytes and target_size_ratio set" } ] }
2020-02-18T15:25:54.452+0000 7fe0f37fa700 -1 log_channel(cluster) log [ERR] : Unhandled exception from module 'pg_autoscaler' while running on mgr.x: division by zero
2020-02-18T15:25:54.452+0000 7fe0f37fa700 -1 pg_autoscaler.serve:
2020-02-18T15:25:54.452+0000 7fe0f37fa700 -1 ZeroDivisionError: division by zero
/a/sage-2020-02-18_14:47:43-rados-wip-sage2-testing-2020-02-17-2124-distro-basic-smithi/4777440
description: rados/singleton/{all/pg-autoscaler.yaml msgr-failures/many.yaml msgr/async.yaml
objectstore/bluestore-avl.yaml rados.yaml supported-random-distro$/{ubuntu_latest.yaml}}
Related issues
History
#1 Updated by Sage Weil about 4 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 33402
#2 Updated by Sage Weil about 4 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport set to nautilus
#3 Updated by Konstantin Shalygin about 4 years ago
- Copied to Backport #44219: nautilus: Module 'pg_autoscaler' has failed: division by zero added
#4 Updated by Nathan Cutler almost 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
#5 Updated by Sage Weil almost 4 years ago
- Status changed from Resolved to Need More Info
hrm, another instance: /a/sage-2020-04-01_21:31:45-rados-wip-sage3-testing-2020-04-01-1428-distro-basic-smithi/4914981
#6 Updated by Josh Durgin almost 4 years ago
- Tags set to low-hanging-fruit
#7 Updated by Kefu Chai almost 4 years ago
- Status changed from Need More Info to New
- Backport changed from nautilus to nautilus,octopus
2020-04-13T03:24:57.574 INFO:tasks.ceph.mgr.x.smithi052.stderr: File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 207, in serve
2020-04-13T03:24:57.574 INFO:tasks.ceph.mgr.x.smithi052.stderr: self._update_progress_events()
2020-04-13T03:24:57.574 INFO:tasks.ceph.mgr.x.smithi052.stderr: File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 415, in _update_progress_events
2020-04-13T03:24:57.574 INFO:tasks.ceph.mgr.x.smithi052.stderr: ev.update(self, (ev.pg_num - pool_data['pg_num']) / (ev.pg_num - ev.pg_num_target))
2020-04-13T03:24:57.575 INFO:tasks.ceph.mgr.x.smithi052.stderr:ZeroDivisionError: division by zero
/a/kchai-2020-04-10_10:07:46-rados-wip-kefu-testing-2020-04-10-1430-distro-basic-smithi/4942579
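For reference, the denominator in the traceback above, (ev.pg_num - ev.pg_num_target), is zero whenever an event's starting pg_num equals its target. The following is only a minimal sketch of one way to guard that expression, not the actual fix that was merged. The function name progress_fraction and its parameters are hypothetical stand-ins for ev.pg_num, pool_data['pg_num'] and ev.pg_num_target.

```python
from typing import Optional


def progress_fraction(pg_num_start: int,
                      pg_num_now: int,
                      pg_num_target: int) -> Optional[float]:
    """Return how far a pool has moved from its starting pg_num to its target.

    Hypothetical helper: mirrors the expression from the traceback,
    (ev.pg_num - pool_data['pg_num']) / (ev.pg_num - ev.pg_num_target),
    but returns None instead of raising when start and target are equal.
    """
    span = pg_num_start - pg_num_target
    if span == 0:
        # Nothing to scale, so there is no meaningful progress value.
        return None
    return (pg_num_start - pg_num_now) / span
```

A caller would skip the progress update when the helper returns None, for example `progress_fraction(32, 32, 32)`, and pass the returned float to the event otherwise.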
#8 Updated by Neha Ojha almost 4 years ago
/a/teuthology-2020-06-05_07:01:02-rados-master-distro-basic-smithi/5119405
#9 Updated by Brad Hubbard almost 4 years ago
/a/yuriw-2020-05-29_15:51:00-rados-wip-yuri-testing-2020-05-28-2238-octopus-distro-basic-smithi/5103378
#10 Updated by Neha Ojha almost 4 years ago
- Assignee set to Neha Ojha
/a/nojha-2020-06-17_16:38:44-rados:singleton-master-distro-basic-smithi/5158406
#11 Updated by Neha Ojha almost 4 years ago
- Status changed from New to Fix Under Review
#12 Updated by Kefu Chai almost 4 years ago
- Status changed from Fix Under Review to Pending Backport
#13 Updated by Nathan Cutler almost 4 years ago
- Copied to Backport #46196: octopus: Module 'pg_autoscaler' has failed: division by zero added
#14 Updated by Nathan Cutler over 3 years ago
- Status changed from Pending Backport to Resolved
#15 Updated by Nathan Cutler over 3 years ago
- Related to Bug #46487: pybind/mgr/pg_autoscaler/module.py: do not update event if ev.pg_num== ev.pg_num_target added
#16 Updated by Nathan Cutler over 3 years ago
@Neha - reopening an old ticket that has already been backported makes it difficult to backport the second round of fixes. It's better to open a new ticket and mark it as "Related" to the old one. I went ahead and did that in this case so your fix can be backported via the usual backporting workflows.