Bug #37501
Mgr: OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first))
Description
2000-01-04 08:27:16.877315 7f9306dcb700 1 mgr[balancer] Handling command: '{'prefix': 'balancer optimize', 'plan': 'plan1', 'target': ['mgr', '']}'
2000-01-04 08:27:16.929349 7f9306dcb700 -1 /sds_env/ningtao/packages-rpms/BUILD/ceph-12.2.7-176-gd9d01ce/src/osd/OSDMap.cc: In function 'int OSDMap::calc_pg_upmaps(CephContext*, float, int, const std::set<long int>&, OSDMap::Incremental*)' thread 7f9306dcb700 time 2000-01-04 08:27:16.925063
/sds_env/ningtao/packages-rpms/BUILD/ceph-12.2.7-176-gd9d01ce/src/osd/OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first))
ceph version 12.2.7-176-gd9d01ce (d9d01ce2176a7a5af5c4bc843b1978530c0b582f) luminous (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x5582eee62440]
2: (OSDMap::calc_pg_upmaps(CephContext*, float, int, std::set<long, std::less<long>, std::allocator<long> > const&, OSDMap::Incremental*)+0xf3f) [0x5582eef5520f]
3: (()+0x2ea004) [0x5582eed0e004]
4: (PyEval_EvalFrameEx()+0x6df0) [0x7f931a0f0cf0]
5: (PyEval_EvalCodeEx()+0x7ed) [0x7f931a0f303d]
6: (PyEval_EvalFrameEx()+0x663c) [0x7f931a0f053c]
7: (PyEval_EvalFrameEx()+0x67bd) [0x7f931a0f06bd]
8: (PyEval_EvalFrameEx()+0x67bd) [0x7f931a0f06bd]
9: (PyEval_EvalCodeEx()+0x7ed) [0x7f931a0f303d]
10: (()+0x70978) [0x7f931a07c978]
11: (PyObject_Call()+0x43) [0x7f931a057a63]
12: (()+0x5aa55) [0x7f931a066a55]
13: (PyObject_Call()+0x43) [0x7f931a057a63]
14: (()+0x4bb45) [0x7f931a057b45]
15: (PyObject_CallMethod()+0xbb) [0x7f931a057e7b]
16: (ActivePyModule::handle_command(std::map<std::string, boost::variant<std::string<bool, long, double, std::vector<std::string, std::allocator<std::string> >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > > >, std::less<std::string>, std::allocator<std::pair<std::string const, std::string<bool, long, double, std::vector<std::string, std::allocator<std::string> >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > > > > > const&, std::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >*, std::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >)+0x2ed) [0x5582eed1949d]
17: (()+0x2ab4b4) [0x5582eeccf4b4]
18: (FunctionContext::finish(int)+0x2a) [0x5582eece288a]
19: (Context::complete(int)+0x9) [0x5582eecddc79]
20: (Finisher::finisher_thread_entry()+0x198) [0x5582eee613d8]
21: (()+0x7e25) [0x7f9318181e25]
22: (clone()+0x6d) [0x7f931726434d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
History
#1 Updated by tao ning over 5 years ago
Steps to reproduce:
1. Mark an OSD down and out.
2. Execute the balancer plan.
#2 Updated by xie xingguo over 5 years ago
I guess it should be fixed by https://github.com/ceph/ceph/pull/22630, can you try that?
#3 Updated by xie xingguo over 5 years ago
- Status changed from New to 4
#4 Updated by tao ning over 5 years ago
xie xingguo wrote:
> I guess it should be fixed by https://github.com/ceph/ceph/pull/22630, can you try that?
I have included this PR, otherwise how could such an assert appear?
#5 Updated by xie xingguo over 5 years ago
tao ning wrote:
> xie xingguo wrote:
>> I guess it should be fixed by https://github.com/ceph/ceph/pull/22630, can you try that?
> I have included this PR, otherwise how could such an assert appear?
I guess the pool-type is EC? Can you post a complete log? (With log level 20, probably)
#6 Updated by tao ning over 5 years ago
xie xingguo wrote:
> tao ning wrote:
>> xie xingguo wrote:
>>> I guess it should be fixed by https://github.com/ceph/ceph/pull/22630, can you try that?
>> I have included this PR, otherwise how could such an assert appear?
> I guess the pool-type is EC? Can you post a complete log?
Yeah, the log might not be there, so let me try to reproduce that
#7 Updated by tao ning over 5 years ago
- File Mgr balancer相关PR(From 12.2.7).txt added
- File osd.dump.txt added
- File osd.tree.txt added
tao ning wrote:
> xie xingguo wrote:
>> tao ning wrote:
>>> xie xingguo wrote:
>>>> I guess it should be fixed by https://github.com/ceph/ceph/pull/22630, can you try that?
>>> I have included this PR, otherwise how could such an assert appear?
>> I guess the pool-type is EC? Can you post a complete log?
> Yeah, the log might not be there, so let me try to reproduce that
I found a way to reproduce it: I only mark the OSD out, without taking it down.
#8 Updated by tao ning over 5 years ago
#9 Updated by tao ning over 5 years ago
Please check whether there is any problem with this modification:
@@ -4212,10 +4216,12 @@ int OSDMap::calc_pg_upmaps(
           if (q.second == osd) {
             ldout(cct, 10) << " dropping pg_upmap_items " << pg
                            << " " << p->second << dendl;
-            for (auto i : p->second) {
-              pgs_by_osd[i.second].erase(pg);
-              pgs_by_osd[i.first].insert(pg);
-            }
+            for (auto i : p->second) {
+              if (pgs_by_osd.count(i.second)) {
+                pgs_by_osd[i.second].erase(pg);
+              }
+              pgs_by_osd[i.first].insert(pg);
+            }
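For anyone reviewing the change, here is a minimal standalone sketch of what the added `count()` guard appears to protect against. It is not Ceph code: `PgId`, `PgsByOsd`, `drop_unguarded`, `drop_guarded`, and `all_keys_weighted` are hypothetical names, and the link to the failed assert is an assumption based on the assert text and the reproduction steps. The only fact relied on is that `std::map::operator[]` default-constructs an entry for a missing key, so calling `erase()` through it can leave an empty set behind for an OSD that has no entry in the weight map.

```cpp
#include <cstdint>
#include <map>
#include <set>

// Hypothetical stand-ins for the real Ceph types (pg_t and friends).
using PgId = int;
using PgsByOsd = std::map<int32_t, std::set<PgId>>;

// Old behaviour: operator[] creates an empty entry for `to` if it is missing.
void drop_unguarded(PgsByOsd& pgs_by_osd, int32_t from, int32_t to, PgId pg) {
  pgs_by_osd[to].erase(pg);     // side effect: may insert key `to`
  pgs_by_osd[from].insert(pg);
}

// Proposed behaviour: only touch `to` if it is already tracked.
void drop_guarded(PgsByOsd& pgs_by_osd, int32_t from, int32_t to, PgId pg) {
  if (pgs_by_osd.count(to)) {
    pgs_by_osd[to].erase(pg);
  }
  pgs_by_osd[from].insert(pg);
}

// Same shape as the failing check: every key in pgs_by_osd needs a weight.
bool all_keys_weighted(const PgsByOsd& pgs_by_osd,
                       const std::map<int32_t, float>& osd_weight) {
  for (const auto& i : pgs_by_osd) {
    if (!osd_weight.count(i.first)) {
      return false;
    }
  }
  return true;
}
```

With OSDs 0 and 1 weighted and an upmap item pointing at an unweighted OSD 2, the unguarded version leaves a key for OSD 2 in `pgs_by_osd` and the check fails, while the guarded version keeps the two maps consistent.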
#10 Updated by xie xingguo over 5 years ago
#11 Updated by xie xingguo over 5 years ago
- Status changed from 4 to Pending Backport
- Backport set to luminous,mimic
#12 Updated by Nathan Cutler over 5 years ago
- Subject changed from luminous:Mgr: OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first)) to Mgr: OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first))
#13 Updated by Nathan Cutler over 5 years ago
- Copied to Backport #37743: luminous: Mgr: OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first)) added
#14 Updated by Nathan Cutler over 5 years ago
- Copied to Backport #37744: mimic: Mgr: OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first)) added
#15 Updated by Nathan Cutler about 5 years ago
- Status changed from Pending Backport to Resolved