Bug #49576

closed

mgr/balancer: KeyError messages in balancer module

Added by David Zafman about 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: High
Assignee:
Category: balancer module
Target version:
% Done: 0%
Source:
Tags:
Backport: pacific, octopus, nautilus
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We've hit a problem with the balancer on two of our clusters.
ceph health suddenly reports:
MGR_MODULE_ERROR Module 'balancer' has failed: (40,)

The manager log then shows the following:

2019-11-01 14:57:44.112 7f497f642700 -1 balancer.serve:
2019-11-01 14:57:44.112 7f497f642700 -1 Traceback (most recent call last):
  File "/usr/lib64/ceph/mgr/balancer/module.py", line 425, in serve
    r, detail = self.optimize(plan)
  File "/usr/lib64/ceph/mgr/balancer/module.py", line 693, in optimize
    return self.do_crush_compat(plan)
  File "/usr/lib64/ceph/mgr/balancer/module.py", line 839, in do_crush_compat
    weight = best_ws[osd]
KeyError: (40,)

We're using 13.2.6 on CentOS 7. We don't see this problem on multiple other clusters running the same version.

If I can provide further details, please let me know.
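
For readers unfamiliar with this failure mode, the traceback above ends in a plain dictionary lookup, weight = best_ws[osd], that raises KeyError when the OSD id has no entry in the map. The following is a minimal sketch of that pattern and of one defensive way to detect missing ids before the lookup. It is not the balancer module's code and not the fix that was merged for this bug; the function collect_weights and the sample data are hypothetical, and only the names best_ws and osd are taken from the traceback.

```python
# Illustrative sketch only; this is not the code in balancer/module.py.

def collect_weights(osds, best_ws):
    """Split OSD ids into those with a known weight and those without.

    collect_weights is a hypothetical helper, not a Ceph API.
    """
    weights = {}
    missing = []
    for osd in osds:
        if osd in best_ws:
            weights[osd] = best_ws[osd]
        else:
            # Record the id instead of letting the lookup raise KeyError.
            missing.append(osd)
    return weights, missing


# Hypothetical data: OSD 40 has no entry in the weight map.
best_ws = {0: 1.0, 1: 0.8}
osds = [0, 1, 40]

failed_key = None
try:
    for osd in osds:
        weight = best_ws[osd]  # same lookup pattern as in the traceback
except KeyError as exc:
    failed_key = exc.args[0]   # the key that was not found

weights, missing = collect_weights(osds, best_ws)
```

With this sample data the unguarded loop fails on id 40, while the guarded helper returns the two known weights and reports 40 as missing, which a caller could log or skip rather than abort the whole optimization pass.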


Files

map.gz (3.88 KB), Nikola Ciprich, 11/14/2019 07:49 PM

Related issues 5 (0 open, 5 closed)

Has duplicate: mgr - Bug #49535: nautilus: mgr/balancer: KeyError messages in balancer module (Duplicate)
Copied from: mgr - Bug #42721: mgr/balancer: KeyError messages in balancer module (Resolved, Sage Weil)
Copied to: mgr - Backport #49759: nautilus: mgr/balancer: KeyError messages in balancer module (Resolved, Neha Ojha)
Copied to: mgr - Backport #49760: pacific: mgr/balancer: KeyError messages in balancer module (Resolved, Neha Ojha)
Copied to: mgr - Backport #49761: octopus: mgr/balancer: KeyError messages in balancer module (Resolved, Neha Ojha)
