Bug #47272
mgr/pg_autoscaler: Module 'pg_autoscaler' has failed: 'op' (Closed)
Description
When we try to enable the pg_autoscaler module, it fails with a KeyError exception:
2020-09-02 17:57:32.903 7f1c1d551700 -1 Traceback (most recent call last):
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 197, in serve
self._maybe_adjust()
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 399, in _maybe_adjust
ps, root_map, pool_root = self._get_pool_status(osdmap, pools)
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 293, in _get_pool_status
root_map, pool_root = self.get_subtree_resource_status(osdmap, crush_map)
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 231, in get_subtree_resource_status
root_id = int(crush.get_rule_root(cr_name))
File "/usr/share/ceph/mgr/mgr_module.py", line 289, in get_rule_root
first_take = [s for s in rule['steps'] if s['op'] == 'take'][0]
File "/usr/share/ceph/mgr/mgr_module.py", line 289, in <listcomp>
first_take = [s for s in rule['steps'] if s['op'] == 'take'][0]
KeyError: 'op'
2020-09-02 17:57:32.907 7f1c1d551700 20 mgr ~Gil Destroying new thread state 0x5613fa17a900
and the cluster goes into HEALTH_ERR:
- ceph -s
cluster:
id: 7f8b3389-5759-4798-8cd8-6fad4a9760a1
health: HEALTH_ERR
Module 'pg_autoscaler' has failed: 'op'
too few PGs per OSD (4 < min 30)
This issue is reproducible if we have a pool whose crush_rule is set to a replicated rule containing 'step set_chooseleaf_vary_r 1'.
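The failure mode in the traceback can be reduced to a small Python sketch: the list comprehension in get_rule_root indexes each step dict with s['op'], so a single step lacking an 'op' key raises KeyError before the filter can skip it. The step dicts below are illustrative stand-ins, not real output of 'ceph osd crush rule dump':

```python
# Hypothetical shape of a dumped CRUSH rule; only the 'op' keys matter here.
rule = {
    'steps': [
        {'op': 'take', 'item': -1},
        {'op': 'chooseleaf_firstn', 'num': 0},
        {'op': 'emit'},
    ],
}

# Mirrors the comprehension from the traceback; fine while every step has 'op'.
first_take = [s for s in rule['steps'] if s['op'] == 'take'][0]
print(first_take['item'])

# If any step dict is dumped without an 'op' key (as the traceback suggests
# happens with this rule), s['op'] raises KeyError: 'op' immediately:
broken_rule = {'steps': [{'num': 1}, {'op': 'take', 'item': -1}]}
try:
    [s for s in broken_rule['steps'] if s['op'] == 'take']
except KeyError as e:
    print('KeyError:', e)

# A defensive variant with dict.get tolerates such steps instead of crashing:
first_take = [s for s in broken_rule['steps'] if s.get('op') == 'take'][0]
print(first_take['item'])
```

This only demonstrates the crash mechanism; whether the proper fix is a tolerant lookup like s.get('op') or correcting the rule dump itself is up to the pg_autoscaler maintainers.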