Bug #23276

balancer/osd: segfault in calc_pg_upmaps

Added by Dan van der Ster about 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: balancer module
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This cluster has upmapped PGs and the upmap balancer enabled. We moved two hosts from root=default to a new root=drain, which should move all PGs off those hosts.

The mgr started asserting in src/osd/OSDMap.cc on line 3943:

     0> 2018-03-08 16:58:51.920750 7f1981c61700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.4/rpm/el7/BUILD/ceph-12.2.4/src/osd/OSDMap.cc: In function 'int OSDMap::calc_pg_upmaps(CephContext*, float, int, const std::set<long int>&, OSDMap::Incremental*)' thread 7f1981c61700 time 2018-03-08 16:58:51.917824
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.4/rpm/el7/BUILD/ceph-12.2.4/src/osd/OSDMap.cc: 3943: FAILED assert(target > 0)

 ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x561e30846b50]
 2: (OSDMap::calc_pg_upmaps(CephContext*, float, int, std::set<long, std::less<long>, std::allocator<long> > const&, OSDMap::Incremental*)+0x1d9c) [0x561e3093a66c]
 3: (()+0x2e41ba) [0x561e306ff1ba]
 4: (PyEval_EvalFrameEx()+0x6df0) [0x7f19a73e5bb0]
 5: (PyEval_EvalCodeEx()+0x7ed) [0x7f19a73e7efd]
 6: (PyEval_EvalFrameEx()+0x663c) [0x7f19a73e53fc]
 7: (PyEval_EvalFrameEx()+0x67bd) [0x7f19a73e557d]
 8: (PyEval_EvalFrameEx()+0x67bd) [0x7f19a73e557d]
 9: (PyEval_EvalCodeEx()+0x7ed) [0x7f19a73e7efd]
 10: (()+0x70858) [0x7f19a7371858]
 11: (PyObject_Call()+0x43) [0x7f19a734c9a3]
 12: (()+0x5a995) [0x7f19a735b995]
 13: (PyObject_Call()+0x43) [0x7f19a734c9a3]
 14: (()+0x4ba85) [0x7f19a734ca85]
 15: (PyObject_CallMethod()+0xbb) [0x7f19a734cdbb]
 16: (PyModuleRunner::serve()+0x5c) [0x561e306fcdac]
 17: (PyModuleRunner::PyModuleRunnerThread::entry()+0x6f) [0x561e306fd42f]
 18: (()+0x7e25) [0x7f19a5603e25]
 19: (clone()+0x6d) [0x7f19a46e634d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
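
The ticket does not state the root cause, so the following is only an illustrative sketch of how an assertion like `assert(target > 0)` can fail in a weight-proportional balancer. It assumes (this is an assumption, not taken from the Ceph source) that each OSD's target PG count is its weight under the selected CRUSH root times a PGs-per-weight factor, and that an OSD moved out of that root can still hold PGs while contributing no weight. The function names `per_osd_targets` and `overfull_deviation` are hypothetical.

```python
# Simplified, hypothetical model of a weight-proportional PG target
# calculation. This is NOT the Ceph implementation of calc_pg_upmaps.


def per_osd_targets(osd_weight, total_pgs):
    """Return each OSD's target PG count, proportional to its weight.

    osd_weight maps OSD id -> weight under the root being balanced.
    """
    total_weight = sum(osd_weight.values())
    if total_weight <= 0:
        raise ValueError("no weight under the selected root")
    pgs_per_weight = total_pgs / total_weight
    return {osd: w * pgs_per_weight for osd, w in osd_weight.items()}


def overfull_deviation(pgs_by_osd, osd_weight, total_pgs):
    """Return actual-minus-target PG counts for every OSD holding PGs.

    pgs_by_osd maps OSD id -> list of PG ids currently mapped to it.
    """
    targets = per_osd_targets(osd_weight, total_pgs)
    deviations = {}
    for osd, pgs in pgs_by_osd.items():
        # An OSD that is no longer under the root has no weight entry,
        # so its target falls back to zero.
        target = targets.get(osd, 0.0)
        # Same shape as the failing check in the log above: an OSD that
        # still holds PGs is expected to have a positive target.
        assert target > 0, f"osd.{osd} holds {len(pgs)} PGs but target is {target}"
        deviations[osd] = len(pgs) - target
    return deviations
```

In this model, calling `overfull_deviation` with an OSD that appears in `pgs_by_osd` but not in `osd_weight` raises `AssertionError`, which is the kind of state the host move described above could plausibly produce. A more defensive design would skip or separately handle zero-target OSDs instead of asserting.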

History

#2 Updated by Dan van der Ster about 6 years ago

Great, thanks.

Could someone please add the backport tag for luminous?

#3 Updated by xie xingguo about 6 years ago

Hi Dan,
There is already a pending backport for Luminous; see https://github.com/ceph/ceph/pull/20840

#4 Updated by Sage Weil almost 6 years ago

  • Status changed from New to Resolved
