Project

General

Profile

Bug #22056

segv in OSDMap::calc_pg_upmaps from balancer

Added by Sage Weil about 1 year ago. Updated 3 months ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
-
Start date:
11/06/2017
Due date:
% Done:

0%

Source:
Tags:
Backport:
mimic, luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

(gdb) bt
#0  0x00007fb53adfc269 in raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/pt-raise.c:35
#1  0x00005637937d2ece in reraise_fatal (signum=11) at /build/ceph-12.2.1-648-g3f7eadb/src/global/signal_handler.cc:74
#2  handle_fatal_signal (signum=11) at /build/ceph-12.2.1-648-g3f7eadb/src/global/signal_handler.cc:138
#3  <signal handler called>
#4  0x00005637938fff8f in OSDMap::_apply_upmap (this=this@entry=0x7fb52bf36d10, pi=..., raw_pg=..., raw=raw@entry=0x7fb52bf368a0) at /build/ceph-12.2.1-648-g3f7eadb/src/osd/OSDMap.cc:1981
#5  0x0000563793904987 in OSDMap::_pg_to_up_acting_osds (this=0x7fb52bf36d10, pg=..., up=up@entry=0x7fb52bf36bd0, up_primary=up_primary@entry=0x0, acting=acting@entry=0x0, acting_primary=acting_primary@entry=0x0, raw_pg_to_pg=true)
    at /build/ceph-12.2.1-648-g3f7eadb/src/osd/OSDMap.cc:2164
#6  0x0000563793915ed5 in OSDMap::pg_to_up_acting_osds (acting_primary=0x0, acting=0x0, up_primary=0x0, up=0x7fb52bf36bd0, pg=..., this=0x7fb52bf36d10) at /build/ceph-12.2.1-648-g3f7eadb/src/osd/OSDMap.h:1128
#7  OSDMap::calc_pg_upmaps (this=<optimized out>, cct=<optimized out>, max_deviation_ratio=<optimized out>, max=<optimized out>, only_pools_orig=std::set with 0 elements, pending_inc=<optimized out>) at /build/ceph-12.2.1-648-g3f7eadb/src/osd/OSDMap.cc:3859
#8  0x00005637936ba022 in osdmap_calc_pg_upmaps (self=0x7fb52a6fc200, args=<optimized out>) at /build/ceph-12.2.1-648-g3f7eadb/src/mgr/PyOSDMap.cc:135
#9  0x00007fb53bee5751 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#10 0x00007fb53c00e01c in PyEval_EvalCodeEx () from /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#11 0x00007fb53bee3cfd in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#12 0x00007fb53bee3e24 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#13 0x00007fb53bee3e24 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#14 0x00007fb53c00e01c in PyEval_EvalCodeEx () from /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#15 0x00007fb53bf642e0 in ?? () from /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#16 0x00007fb53bf371e3 in PyObject_Call () from /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#17 0x00007fb53bfab31c in ?? () from /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#18 0x00007fb53bf371e3 in PyObject_Call () from /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0

the log has
    -9> 2017-11-06 15:18:49.455111 7fb52bf39700  4 mgr[balancer] Mode upmap, max misplaced 0.050000
    -8> 2017-11-06 15:18:49.455175 7fb52bf39700 20 mgr[balancer] unknown 0.000000 degraded 0.000000 inactive 0.000000 misplaced 0
    -7> 2017-11-06 15:18:49.455221 7fb52bf39700  4 mgr[balancer] do_upmap
    -6> 2017-11-06 15:18:49.455234 7fb52bf39700  4 mgr get_config get_configkey: mgr/balancer/upmap_max_iterations
    -5> 2017-11-06 15:18:49.455237 7fb52bf39700  4 ceph_config_get upmap_max_iterations not found 
    -4> 2017-11-06 15:18:49.455242 7fb52bf39700  4 mgr get_config get_configkey: mgr/balancer/upmap_max_deviation
    -3> 2017-11-06 15:18:49.455243 7fb52bf39700  4 ceph_config_get upmap_max_deviation not found 
    -2> 2017-11-06 15:18:49.455300 7fb52bf39700  4 mgr[balancer] pools ['unique_pool_0', 'rbd']
    -1> 2017-11-06 15:18:49.455315 7fb52bf39700 10 osdmap_calc_pg_upmaps osdmap 0x56379f69d100 inc 0x56379f8e9000 max_deviation 0.0
1 max_iterations 10
     0> 2017-11-06 15:18:49.457405 7fb52bf39700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fb52bf39700 thread_name:balancer

but unfortunately debug osd was 1/5 on the mgr, so i can't see what calc_pg_upmaps was doing.

/a/sage-2017-11-06_04:25:20-rados-wip-sage2-testing-2017-11-05-0042-distro-basic-smithi/1820262


Related issues

Copied to Ceph - Backport #26933: mimic: segv in OSDMap::calc_pg_upmaps from balancer Resolved
Copied to Ceph - Backport #26934: luminous: segv in OSDMap::calc_pg_upmaps from balancer Resolved

History

#1 Updated by Bryan Stillwell 7 months ago

I believe I'm seeing this problem pretty consistently on my home cluster. The ceph-mgr keeps dying with a very similar backtrace at least:

-10> 2018-05-17 15:38:15.571835 7fcc48d69700  4 mgr[balancer] Optimize plan auto_2018-05-17_21:38:15
-9> 2018-05-17 15:38:15.571887 7fcc48d69700 4 mgr get_config get_configkey: mgr/balancer/mode
-8> 2018-05-17 15:38:15.571903 7fcc48d69700 4 mgr get_config get_configkey: mgr/balancer/max_misplaced
-7> 2018-05-17 15:38:15.571985 7fcc48d69700 4 mgr[balancer] Mode upmap, max misplaced 0.800000
-6> 2018-05-17 15:38:15.572414 7fcc48d69700 4 mgr[balancer] do_upmap
-5> 2018-05-17 15:38:15.572441 7fcc48d69700 4 mgr get_config get_configkey: mgr/balancer/upmap_max_iterations
-4> 2018-05-17 15:38:15.572448 7fcc48d69700 4 ceph_config_get upmap_max_iterations not found
-3> 2018-05-17 15:38:15.572454 7fcc48d69700 4 mgr get_config get_configkey: mgr/balancer/upmap_max_deviation
-2> 2018-05-17 15:38:15.572459 7fcc48d69700 4 ceph_config_get upmap_max_deviation not found
-1> 2018-05-17 15:38:15.572597 7fcc48d69700 4 mgr[balancer] pools ['cephfs_data_ec42', 'cephfs_metadata', 'rbd', 'cephfs_data']
0> 2018-05-17 15:38:15.577347 7fcc48d69700 -1 ** Caught signal (Segmentation fault) *
in thread 7fcc48d69700 thread_name:balancer
ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)
1: (()+0x40bfa4) [0x5627ec0c0fa4]
2: (()+0x12890) [0x7fcc635e2890]
3: (OSDMap::_apply_upmap(pg_pool_t const&, pg_t, std::vector&lt;int, std::allocator&lt;int&gt; >) const+0x1af) [0x5627ec1eeb8f]
4: (OSDMap::_pg_to_up_acting_osds(pg_t const&, std::vector&lt;int, std::allocator&lt;int&gt; >
, int*, std::vector&lt;int, std::allocator&lt;int&gt; >, int, bool) const+0x257) [0x5627ec1f3607]
5: (OSDMap::calc_pg_upmaps(CephContext*, float, int, std::set&lt;long, std::less&lt;long&gt;, std::allocator&lt;long&gt; > const&, OSDMap::Incremental*)+0x367) [0x5627ec204bd7]
6: (()+0x2ef858) [0x5627ebfa4858]
7: (PyEval_EvalFrameEx()+0x8010) [0x7fcc646b61d0]
8: (PyEval_EvalCodeEx()+0x7d8) [0x7fcc647e6278]
9: (PyEval_EvalFrameEx()+0x5bf6) [0x7fcc646b3db6]
10: (PyEval_EvalFrameEx()+0x8b5b) [0x7fcc646b6d1b]
11: (PyEval_EvalFrameEx()+0x8b5b) [0x7fcc646b6d1b]
12: (PyEval_EvalCodeEx()+0x7d8) [0x7fcc647e6278]
13: (()+0x1645f9) [0x7fcc6476b5f9]
14: (PyObject_Call()+0x43) [0x7fcc6465b333]
15: (()+0x1abd1c) [0x7fcc647b2d1c]
16: (PyObject_Call()+0x43) [0x7fcc6465b333]
17: (PyObject_CallMethod()+0xc8) [0x7fcc6477fc78]
18: (PyModuleRunner::serve()+0x5c) [0x5627ebfa1a3c]
19: (PyModuleRunner::PyModuleRunnerThread::entry()+0x1b8) [0x5627ebfa2248]
20: (()+0x76db) [0x7fcc635d76db]
21: (clone()+0x3f) [0x7fcc6258988f]

I'm also seeing behavior like this on a cluster with just 13 OSDs:

  1. ceph balancer execute run20180517-2
    Error ENOENT: osd.1924087472 does not exist

#2 Updated by Bryan Stillwell 7 months ago

Turned up the debugging and now I'm seeing:

-9> 2018-05-17 15:41:54.498553 7f5016e80700 10  osd.12 move 14
-8> 2018-05-17 15:41:54.498564 7f5016e80700 20 already remapped 2.4
-7> 2018-05-17 15:41:54.498566 7f5016e80700 20 already remapped 2.b
-6> 2018-05-17 15:41:54.498567 7f5016e80700 20 already remapped 2.f
-5> 2018-05-17 15:41:54.498568 7f5016e80700 20 already remapped 2.16
-4> 2018-05-17 15:41:54.498569 7f5016e80700 20 already remapped 2.1c
-3> 2018-05-17 15:41:54.498570 7f5016e80700 10 trying 2.20
-2> 2018-05-17 15:41:54.498602 7f5016e80700 10 2.20 [9,12,6] -> [8,-832587808,22020]
-1> 2018-05-17 15:41:54.498606 7f5016e80700 10 2.20 pg_upmap_items [9,8,12,-832587808,6,22020]
0> 2018-05-17 15:41:54.502708 7f5016e80700 -1 ** Caught signal (Segmentation fault) *
in thread 7f5016e80700 thread_name:mgr-fin

#3 Updated by Brad Hubbard 6 months ago

  • Assignee set to Brad Hubbard

Taking ownership of this since I own a downstream BZ and trying to reproduce.

#4 Updated by Brad Hubbard 4 months ago

(gdb) f 
#4  0x00005630c4885bfd in OSDMap::_apply_upmap (this=this@entry=0x7fc69bd24b90, pi=..., raw_pg=..., raw=raw@entry=0x7fc69bd24730) at /usr/src/debug/ceph-12.2.4/src/osd/OSDMap.cc:1981
1981                  osd_weight[r.second] == 0)) {
(gdb) l
1976            }
1977            // ignore mapping if target is marked out (or invalid osd id)
1978            if (osd == r.first &&
1979                pos < 0 &&
1980                !(r.second != CRUSH_ITEM_NONE && r.second < max_osd &&
1981                  osd_weight[r.second] == 0)) {
1982              pos = i;
1983            }
1984          }
1985          if (!exists && pos >= 0) {

So the assumption is we crashed due to an invalid offset into osd_weight which is defined as follows.

mempool::osdmap::vector<__u32>   osd_weight;

Let's look at that offset.

(gdb) x/i $pc
=> 0x5630c4885bfd <OSDMap::_apply_upmap(pg_pool_t const&, pg_t, std::vector<int, std::allocator<int> >*) const+381>:    mov    (%rdx,%r12,1),%edx
(gdb) i reg r12
r12            0xffffffff3ba07c00       -3294594048
(gdb) p sizeof(__u32)
$65 = 4
(gdb) !echo -3294594048/4|bc -iq
-3294594048/4
-823648512

So where did 'r' come from?

(gdb) l 1963,1967
1963      auto q = pg_upmap_items.find(pg);
1964      if (q != pg_upmap_items.end()) {
1965        // NOTE: this approach does not allow a bidirectional swap,
1966        // e.g., [[1,2],[2,1]] applied to [0,1,2] -> [0,2,1].
1967        for (auto& r : q->second) {
(gdb) p pg
$58 = {
  m_pool = 11,
  m_seed = 17,
  m_preferred = -1,
  static calc_name_buf_size = 36 '$'
}

So we want to look at the object in pg_upmap_items that has m_seed = 17.

(gdb) p pg_upmap_items
...
  [{
    m_pool = 11, 
    m_seed = 17, 
    m_preferred = -1, 
    static calc_name_buf_size = 36 '$'
  }] = std::vector of length 3, capacity 4 = {{
      first = 24, 
      second = 31
    }, {
      first = 25, 
      second = -823648512
    }, {
      first = 0, 
      second = 22064
    }},
...

Bingo! There's the magic number -823648512 and that value is what caused the crash. The fix for the crash in calc_pg_upmaps should be simply to test for a negative number and I'll create a separate issue for the balancer sending negative pg_upmap_item values.

#5 Updated by Brad Hubbard 4 months ago

I can create a couple of almost identical crashes in master by changing unittest_osdmap (I'll change the test so it tests a negative number in the future).

Running that segfaults and the core shows the following.

(gdb) f 4
#4  0x00007fee1de3396f in OSDMap::_apply_upmap (this=this@entry=0x559dd4dfe3a0, pi=..., raw_pg=..., raw=raw@entry=0x7fffd8687cc0) at /home/brad/working/src/ceph4/src/osd/OSDMap.cc:2211
2211          if (osd != CRUSH_ITEM_NONE && osd < max_osd && osd_weight[osd] == 0) {
(gdb) x/i $pc
=> 0x7fee1de3396f <OSDMap::_apply_upmap(pg_pool_t const&, pg_t, std::vector<int, std::allocator<int> >*) const+143>:    mov    (%rcx,%rdx,4),%ecx
(gdb) i reg rdx
rdx            0xffffffffcee81f00       -823648512
(gdb) p pg_upmap
$1 = std::map with 1 element = {[{m_pool = 3, m_seed = 0, static calc_name_buf_size = 36 '$'}] = std::vector of length 2, capacity 2 = {3, -823648512}}

Once I fix that code to check for a negative I get the original segfault seen in this tracker.

So there are two places in _apply_upmap we need to check for negative values.

#6 Updated by Brad Hubbard 4 months ago

  • Status changed from Need More Info to In Progress
  • Priority changed from Normal to High
  • Backport set to mimic, luminous

#7 Updated by Sage Weil 4 months ago

  • Status changed from In Progress to Pending Backport

#8 Updated by Nathan Cutler 4 months ago

  • Copied to Backport #26933: mimic: segv in OSDMap::calc_pg_upmaps from balancer added

#9 Updated by Nathan Cutler 4 months ago

  • Copied to Backport #26934: luminous: segv in OSDMap::calc_pg_upmaps from balancer added

#10 Updated by Nathan Cutler 3 months ago

  • Status changed from Pending Backport to Resolved

Also available in: Atom PDF