Bug #47272
mgr/pg_autoscaler: Module 'pg_autoscaler' has failed: 'op' (Closed)
Description
When we try to enable the pg_autoscaler module, it fails with a KeyError exception:
2020-09-02 17:57:32.903 7f1c1d551700 -1 Traceback (most recent call last):
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 197, in serve
self._maybe_adjust()
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 399, in _maybe_adjust
ps, root_map, pool_root = self._get_pool_status(osdmap, pools)
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 293, in _get_pool_status
root_map, pool_root = self.get_subtree_resource_status(osdmap, crush_map)
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 231, in get_subtree_resource_status
root_id = int(crush.get_rule_root(cr_name))
File "/usr/share/ceph/mgr/mgr_module.py", line 289, in get_rule_root
first_take = [s for s in rule['steps'] if s['op'] == 'take'][0]
File "/usr/share/ceph/mgr/mgr_module.py", line 289, in <listcomp>
first_take = [s for s in rule['steps'] if s['op'] == 'take'][0]
KeyError: 'op'
2020-09-02 17:57:32.907 7f1c1d551700 20 mgr ~Gil Destroying new thread state 0x5613fa17a900
and the cluster goes into HEALTH_ERR:
- ceph -s
cluster:
id: 7f8b3389-5759-4798-8cd8-6fad4a9760a1
health: HEALTH_ERR
Module 'pg_autoscaler' has failed: 'op'
too few PGs per OSD (4 < min 30)
This issue is reproducible if we have a pool whose crush_rule is set to a replicated rule containing 'step set_chooseleaf_vary_r 1'.
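The failure mode in the traceback can be reduced to a small Python sketch: the list comprehension in get_rule_root indexes each step dict with s['op'], so a single step lacking an 'op' key raises KeyError before the filter can skip it. The step dicts below are illustrative stand-ins, not real output of 'ceph osd crush rule dump':

```python
# Hypothetical shape of a dumped CRUSH rule; only the 'op' keys matter here.
rule = {
    'steps': [
        {'op': 'take', 'item': -1},
        {'op': 'chooseleaf_firstn', 'num': 0},
        {'op': 'emit'},
    ],
}

# Mirrors the comprehension from the traceback; fine while every step has 'op'.
first_take = [s for s in rule['steps'] if s['op'] == 'take'][0]
print(first_take['item'])

# If any step dict is dumped without an 'op' key (as the traceback suggests
# happens with this rule), s['op'] raises KeyError: 'op' immediately:
broken_rule = {'steps': [{'num': 1}, {'op': 'take', 'item': -1}]}
try:
    [s for s in broken_rule['steps'] if s['op'] == 'take']
except KeyError as e:
    print('KeyError:', e)

# A defensive variant with dict.get tolerates such steps instead of crashing:
first_take = [s for s in broken_rule['steps'] if s.get('op') == 'take'][0]
print(first_take['item'])
```

This only demonstrates the crash mechanism; whether the proper fix is a tolerant lookup like s.get('op') or correcting the rule dump itself is up to the pg_autoscaler maintainers.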