Bug #23276
balancer/osd: segfault in calc_pg_upmaps
Status: Resolved
Priority: Normal
Assignee: -
Category: balancer module
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
This cluster has upmapped PGs and the upmap balancer enabled. We moved two hosts from root=default to a new root=drain, which would remove all PGs from those hosts.
The mgr started asserting in src/osd/OSDMap.cc on line 3943:
0> 2018-03-08 16:58:51.920750 7f1981c61700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.4/rpm/el7/BUILD/ceph-12.2.4/src/osd/OSDMap.cc: In function 'int OSDMap::calc_pg_upmaps(CephContext*, float, int, const std::set<long int>&, OSDMap::Incremental*)' thread 7f1981c61700 time 2018-03-08 16:58:51.917824
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.4/rpm/el7/BUILD/ceph-12.2.4/src/osd/OSDMap.cc: 3943: FAILED assert(target > 0)

 ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x561e30846b50]
 2: (OSDMap::calc_pg_upmaps(CephContext*, float, int, std::set<long, std::less<long>, std::allocator<long> > const&, OSDMap::Incremental*)+0x1d9c) [0x561e3093a66c]
 3: (()+0x2e41ba) [0x561e306ff1ba]
 4: (PyEval_EvalFrameEx()+0x6df0) [0x7f19a73e5bb0]
 5: (PyEval_EvalCodeEx()+0x7ed) [0x7f19a73e7efd]
 6: (PyEval_EvalFrameEx()+0x663c) [0x7f19a73e53fc]
 7: (PyEval_EvalFrameEx()+0x67bd) [0x7f19a73e557d]
 8: (PyEval_EvalFrameEx()+0x67bd) [0x7f19a73e557d]
 9: (PyEval_EvalCodeEx()+0x7ed) [0x7f19a73e7efd]
 10: (()+0x70858) [0x7f19a7371858]
 11: (PyObject_Call()+0x43) [0x7f19a734c9a3]
 12: (()+0x5a995) [0x7f19a735b995]
 13: (PyObject_Call()+0x43) [0x7f19a734c9a3]
 14: (()+0x4ba85) [0x7f19a734ca85]
 15: (PyObject_CallMethod()+0xbb) [0x7f19a734cdbb]
 16: (PyModuleRunner::serve()+0x5c) [0x561e306fcdac]
 17: (PyModuleRunner::PyModuleRunnerThread::entry()+0x6f) [0x561e306fd42f]
 18: (()+0x7e25) [0x7f19a5603e25]
 19: (clone()+0x6d) [0x7f19a46e634d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
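One plausible reading of the failed assert(target > 0), offered only as a sketch: if the balancer derives each OSD's target PG count from its weight under the pool's CRUSH rule, then an OSD that has left the rule's root but still holds PGs (for example through existing upmap entries) would get a target of zero. The Python below is a simplified, hypothetical model of that arithmetic. The function names, the dictionaries, and the assumption that the target is weight times PGs-per-weight are illustrative and are not taken from OSDMap.cc.

```python
# Hypothetical model only: this is NOT the Ceph implementation.
# Assumption: target PG count per OSD = CRUSH weight * (total PGs / total weight).

def pg_targets(crush_weights, pgs_by_osd):
    """Return {osd: target PG count}, proportional to each OSD's CRUSH weight.

    crush_weights: {osd_id: weight} for OSDs reachable by the pool's rule.
    pgs_by_osd:    {osd_id: number of PGs currently mapped to that OSD}.
    """
    total_weight = sum(crush_weights.values())
    if total_weight <= 0:
        raise ValueError("no weighted OSDs under this rule")
    pgs_per_weight = sum(pgs_by_osd.values()) / total_weight
    # An OSD that still holds PGs but is absent from the rule's weight map
    # is treated as weight 0, so its target comes out as 0.
    return {osd: crush_weights.get(osd, 0.0) * pgs_per_weight
            for osd in pgs_by_osd}


def osds_with_nonpositive_target(crush_weights, pgs_by_osd):
    """List the OSDs for which a check like assert(target > 0) would fail."""
    targets = pg_targets(crush_weights, pgs_by_osd)
    return sorted(osd for osd, target in targets.items() if target <= 0)


# osd.2 sits on a host moved to root=drain (no weight under the rule),
# yet 20 PGs are still mapped to it.
weights = {0: 1.0, 1: 1.0}
pgs = {0: 40, 1: 40, 2: 20}
print(osds_with_nonpositive_target(weights, pgs))
```

In this toy model, skipping OSDs whose weight under the rule is zero before computing a target would avoid the failing condition. Whether the actual fix takes that approach is not stated in this report.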
History
#1 Updated by xie xingguo about 6 years ago
#2 Updated by Dan van der Ster about 6 years ago
Great, thanks.
Could someone please add the backport tag for luminous?
#3 Updated by xie xingguo about 6 years ago
Hi Dan,
There is already a pending backport for Luminous, see https://github.com/ceph/ceph/pull/20840
#4 Updated by Sage Weil almost 6 years ago
- Status changed from New to Resolved