Bug #37501

Mgr: OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first))

Added by tao ning 3 months ago. Updated 23 days ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
OSDMap
Target version:
-
Start date:
12/02/2018
Due date:
% Done:

0%

Source:
Tags:
Backport:
luminous,mimic
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

2000-01-04 08:27:16.877315 7f9306dcb700 1 mgr[balancer] Handling command: '{'prefix': 'balancer optimize', 'plan': 'plan1', 'target': ['mgr', '']}'
2000-01-04 08:27:16.929349 7f9306dcb700 -1 /sds_env/ningtao/packages-rpms/BUILD/ceph-12.2.7-176-gd9d01ce/src/osd/OSDMap.cc: In function 'int OSDMap::calc_pg_upmaps(CephContext*, float, int, const std::set<long int>&, OSDMap::Incremental*)' thread 7f9306dcb700 time 2000-01-04 08:27:16.925063
/sds_env/ningtao/packages-rpms/BUILD/ceph-12.2.7-176-gd9d01ce/src/osd/OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first))

ceph version 12.2.7-176-gd9d01ce (d9d01ce2176a7a5af5c4bc843b1978530c0b582f) luminous (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x5582eee62440]
2: (OSDMap::calc_pg_upmaps(CephContext*, float, int, std::set<long, std::less<long>, std::allocator<long> > const&, OSDMap::Incremental*)+0xf3f) [0x5582eef5520f]
3: (()+0x2ea004) [0x5582eed0e004]
4: (PyEval_EvalFrameEx()+0x6df0) [0x7f931a0f0cf0]
5: (PyEval_EvalCodeEx()+0x7ed) [0x7f931a0f303d]
6: (PyEval_EvalFrameEx()+0x663c) [0x7f931a0f053c]
7: (PyEval_EvalFrameEx()+0x67bd) [0x7f931a0f06bd]
8: (PyEval_EvalFrameEx()+0x67bd) [0x7f931a0f06bd]
9: (PyEval_EvalCodeEx()+0x7ed) [0x7f931a0f303d]
10: (()+0x70978) [0x7f931a07c978]
11: (PyObject_Call()+0x43) [0x7f931a057a63]
12: (()+0x5aa55) [0x7f931a066a55]
13: (PyObject_Call()+0x43) [0x7f931a057a63]
14: (()+0x4bb45) [0x7f931a057b45]
15: (PyObject_CallMethod()+0xbb) [0x7f931a057e7b]
16: (ActivePyModule::handle_command(std::map<std::string, boost::variant<std::string<bool, long, double, std::vector<std::string, std::allocator<std::string> >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > > >, std::less<std::string>, std::allocator<std::pair<std::string const, std::string<bool, long, double, std::vector<std::string, std::allocator<std::string> >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > > > > > const&, std::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >*, std::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >)+0x2ed) [0x5582eed1949d]
17: (()+0x2ab4b4) [0x5582eeccf4b4]
18: (FunctionContext::finish(int)+0x2a) [0x5582eece288a]
19: (Context::complete(int)+0x9) [0x5582eecddc79]
20: (Finisher::finisher_thread_entry()+0x198) [0x5582eee613d8]
21: (()+0x7e25) [0x7f9318181e25]
22: (clone()+0x6d) [0x7f931726434d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Mgr balancer相关PR(From 12.2.7).txt View (684 Bytes) tao ning, 12/06/2018 09:52 AM

osd.tree.txt View (2.17 KB) tao ning, 12/06/2018 09:52 AM

osd.dump.txt View (5.09 KB) tao ning, 12/06/2018 09:52 AM

mgr.x.log View (49.6 KB) tao ning, 12/06/2018 09:53 AM


Related issues

Copied to Ceph - Backport #37743: luminous: Mgr: OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first)) Resolved
Copied to Ceph - Backport #37744: mimic: Mgr: OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first)) Resolved

History

#1 Updated by tao ning 3 months ago

Steps to reproduce:
1. Mark an OSD down and out.
2. Execute a balancer plan.

#2 Updated by xie xingguo 3 months ago

I guess it should be fixed by https://github.com/ceph/ceph/pull/22630, can you try that?

#3 Updated by xie xingguo 3 months ago

  • Status changed from New to Feedback

#4 Updated by tao ning 3 months ago

xie xingguo wrote:

I guess it should be fixed by https://github.com/ceph/ceph/pull/22630, can you try that?

I have included this PR, otherwise how could such an assert appear?

#5 Updated by xie xingguo 3 months ago

tao ning wrote:

I have included this PR, otherwise how could such an assert appear?

I guess the pool-type is EC? Can you post a complete log? (With log level 20, probably)

#6 Updated by tao ning 3 months ago

xie xingguo wrote:

I guess the pool-type is EC? Can you post a complete log?

Yeah, the log might not be there, so let me try to reproduce that

#7 Updated by tao ning 3 months ago

tao ning wrote:

Yeah, the log might not be there, so let me try to reproduce that

I found a way to reproduce it: just mark the OSD out, without taking it down.

#8 Updated by tao ning 3 months ago

#9 Updated by tao ning 3 months ago

Please see if there is any problem with this modification:

@@ -4212,10 +4216,12 @@ int OSDMap::calc_pg_upmaps(
         if (q.second == osd) {
           ldout(cct, 10) << " dropping pg_upmap_items " << pg
                          << " " << p->second << dendl;
-          for (auto i : p->second) {
-            pgs_by_osd[i.second].erase(pg);
-            pgs_by_osd[i.first].insert(pg);
-          }
+          for (auto i : p->second) {
+            if (pgs_by_osd.count(i.second)) {
+              pgs_by_osd[i.second].erase(pg);
+            }
+            pgs_by_osd[i.first].insert(pg);
+          }
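
For readers following the patch, here is a minimal standalone sketch of why the extra `count()` check could matter. It relies on one certain fact, that `std::map::operator[]` inserts a default-constructed value for a missing key, and on one assumption not confirmed in this ticket: that the failing `assert(osd_weight.count(i.first))` runs while iterating over `pgs_by_osd`. The types `PgId`, `OsdId`, `PgsByOsd` and `UpmapItems` and the three helper functions are hypothetical simplifications, not Ceph code.

```cpp
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical simplified stand-ins for the structures named in the patch.
using PgId = int;
using OsdId = int;
using PgsByOsd = std::map<OsdId, std::set<PgId>>;
using UpmapItems = std::vector<std::pair<OsdId, OsdId>>;  // (from, to) pairs

// Shape of the loop before the patch: operator[] silently creates an
// empty entry for i.second when that OSD is not yet a key in the map.
void drop_items_unguarded(PgsByOsd& pgs_by_osd, const UpmapItems& items, PgId pg) {
  for (const auto& i : items) {
    pgs_by_osd[i.second].erase(pg);
    pgs_by_osd[i.first].insert(pg);
  }
}

// Shape of the loop after the patch: only touch i.second if it is
// already tracked, so no new key is created for an untracked OSD.
void drop_items_guarded(PgsByOsd& pgs_by_osd, const UpmapItems& items, PgId pg) {
  for (const auto& i : items) {
    if (pgs_by_osd.count(i.second)) {
      pgs_by_osd[i.second].erase(pg);
    }
    pgs_by_osd[i.first].insert(pg);
  }
}

// Assumed form of the failing check: every OSD that appears as a key
// in pgs_by_osd must also have an entry in osd_weight.
bool all_osds_weighted(const PgsByOsd& pgs_by_osd,
                       const std::map<OsdId, float>& osd_weight) {
  for (const auto& i : pgs_by_osd) {
    if (!osd_weight.count(i.first)) {
      return false;
    }
  }
  return true;
}
```

Under these assumptions, dropping an upmap item that points at an OSD with no weight entry (for example one that was marked out) leaves a stray key in `pgs_by_osd` in the unguarded version, so `all_osds_weighted` returns false. The guarded version leaves the map's key set consistent with `osd_weight`.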

#11 Updated by xie xingguo 2 months ago

  • Status changed from Feedback to Pending Backport
  • Backport set to luminous,mimic

#12 Updated by Nathan Cutler 2 months ago

  • Subject changed from luminous:Mgr: OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first)) to Mgr: OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first))

#13 Updated by Nathan Cutler 2 months ago

  • Copied to Backport #37743: luminous: Mgr: OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first)) added

#14 Updated by Nathan Cutler 2 months ago

  • Copied to Backport #37744: mimic: Mgr: OSDMap.cc: 4140: FAILED assert(osd_weight.count(i.first)) added

#15 Updated by Nathan Cutler 23 days ago

  • Status changed from Pending Backport to Resolved
