Bug #46266

open

Monitor crashed in creating pool in CrushTester::test_with_fork()

Added by Seena Fallah almost 4 years ago. Updated about 2 years ago.

Status: Need More Info
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):

275307d556e4147d2ad5a7a74064a399e0012d97a865818d4bc17099de946b81
28a321ddf7d0469a896f796efb595606803be806190be11dc931244282ea557f
3a900e112239b9f0ee9a5c0f2accebd4132ce2613d139114c86865629f7b5817
45a2f7bde520435b99e6b83ee775e35de18732b48ed56c14dd86c9d17403afb0
53ae5a7747df303d95f8f28697a631851e080413cebe024867883f015f182b06
59a9b122a0a1e02a2dce55c98c55e70a35d400e2dbc3b0e31d579ca6ae0b1566
5d341195e849d1b8fc568366d6076c61e65a91823896ea2fd1e408ed36f3f529
7a8bd7fbc442ee0139b78523e37d7d9450b5af67da77b679cfccf88299426d41
d6b6f43e0c31315c6493798edbb349f4cfb759ecb7984e8bf203ce12d7d3e312
f23756652603548a024c95c1877e753effe7bdd1473676104f3207deaac74488
fb70f56d305d105eaab078b0d53042c75054e5d4ac05cbf436ded29e7b18efea


Description

Hi. I was creating a new pool and one of my monitors crashed.

Jun 30 01:10:29 afra-mon2 ceph-mon[231520]: /build/ceph-14.2.9/src/common/fork_function.h: In function 'int fork_function(int, std::ostream&, std::function<signed char()>)' thread 7fb044366700 time 2020-06-30 01
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]: /build/ceph-14.2.9/src/common/fork_function.h: 34: FAILED ceph_assert((*__errno_location ()) == 4)
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7fb052913e22]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7fb052913ffd]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  3: (CrushTester::test_with_fork(int)+0x81d) [0x7fb052e8e3dd]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  4: (OSDMonitor::prepare_new_pool(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, int, std::__cxx11::basic_string<char, std::char_tra
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  5: (OSDMonitor::prepare_command_impl(boost::intrusive_ptr<MonOpRequest>, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, boo
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  6: (OSDMonitor::prepare_command(boost::intrusive_ptr<MonOpRequest>)+0x122) [0x55c389edccb2]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  7: (OSDMonitor::prepare_update(boost::intrusive_ptr<MonOpRequest>)+0x193) [0x55c389ee0fc3]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  8: (PaxosService::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x9be) [0x55c389e63d3e]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  9: (PaxosService::C_RetryMessage::_finish(int)+0x64) [0x55c389dd79c4]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  10: (C_MonOp::finish(int)+0x45) [0x55c389d7d0d5]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  11: (Context::complete(int)+0x9) [0x55c389d7a449]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  12: (void finish_contexts<std::__cxx11::list<Context*, std::allocator<Context*> > >(CephContext*, std::__cxx11::list<Context*, std::allocator<Context*> >&, int)+0xa8)
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  13: (Paxos::finish_round()+0x9b) [0x55c389e5b38b]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  14: (Paxos::handle_last(boost::intrusive_ptr<MonOpRequest>)+0xbff) [0x55c389e5c56f]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  15: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x24b) [0x55c389e5cffb]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  16: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x15c5) [0x55c389d743c5]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  17: (Monitor::_ms_dispatch(Message*)+0x4ca) [0x55c389d74a0a]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  18: (Monitor::ms_dispatch(Message*)+0x26) [0x55c389da4bd6]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  19: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x26) [0x55c389da0bb6]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  20: (DispatchQueue::entry()+0x1219) [0x7fb052b3e539]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  21: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fb052bee93d]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  22: (()+0x76db) [0x7fb05178b6db]
Jun 30 01:10:29 afra-mon2 ceph-mon[231520]:  23: (clone()+0x3f) [0x7fb05097188f]


Related issues 2 (1 open, 1 closed)

Related to RADOS - Bug #57782: [mon] high cpu usage by fn_monstore thread (Fix Under Review)
Has duplicate RADOS - Bug #51877: crash: int fork_function(int, std::ostream&, std::function<signed char()>): assert((*__errno_location ()) == 4) (Duplicate)

Actions #1

Updated by Seena Fallah over 3 years ago

This happened again :(

Actions #2

Updated by Greg Farnum almost 3 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (Monitor)
Actions #3

Updated by Neha Ojha over 2 years ago

  • Has duplicate Bug #51877: crash: int fork_function(int, std::ostream&, std::function<signed char()>): assert((*__errno_location ()) == 4) added
Actions #4

Updated by Neha Ojha over 2 years ago

  • Subject changed from Mnitor crashed in creating pool to Mnitor crashed in creating pool in CrushTester::test_with_fork()
Actions #5

Updated by Neha Ojha over 2 years ago

Can you share your crushmap? The crash is in CrushTester::test_with_fork(), which could mean there is an issue with the crushmap.

Actions #6

Updated by Neha Ojha over 2 years ago

  • Status changed from New to Need More Info
Actions #7

Updated by Neha Ojha over 2 years ago

  • Subject changed from Mnitor crashed in creating pool in CrushTester::test_with_fork() to Monitor crashed in creating pool in CrushTester::test_with_fork()
Actions #8

Updated by Telemetry Bot over 2 years ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)
  • Affected Versions v15.2.1, v15.2.13, v15.2.4, v15.2.5, v15.2.6, v15.2.7, v15.2.8, v15.2.9 added

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=d6b6f43e0c31315c6493798edbb349f4cfb759ecb7984e8bf203ce12d7d3e312

Assert condition: (*__errno_location ()) == 4
Assert function: int fork_function(int, std::ostream&, std::function<signed char()>)
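
A note on reading the assert condition: `(*__errno_location ())` is how glibc expands `errno`, and on Linux errno 4 is `EINTR`. The assertion therefore fails when a system call inside `fork_function` fails for some reason other than being interrupted by a signal. This report does not show which call or which errno was involved. The sketch below (Python for brevity, not Ceph's code and not a diagnosis) decodes the number and shows one hypothetical way a wait on a child process can fail with a different errno, `ECHILD`. The helper name `wait_errno_for_reaped_child` is an illustration only.

```python
import errno
import os

# The value the assertion compares errno against.
ASSERTED_ERRNO = 4
errno_name = errno.errorcode[ASSERTED_ERRNO]  # symbolic name for errno 4


def wait_errno_for_reaped_child():
    """Fork a child, reap it, then wait on it a second time.

    The second wait fails because the child has already been reaped.
    Returns the errno of that failure, or None if no error occurred.
    """
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately
    os.waitpid(pid, 0)  # first wait succeeds and reaps the child
    try:
        os.waitpid(pid, 0)  # second wait: the child no longer exists
    except ChildProcessError as exc:
        return exc.errno
    return None


second_wait_errno = wait_errno_for_reaped_child()
print(errno_name, errno.errorcode.get(second_wait_errno))
```

A check of the form `errno == EINTR` would reject the second failure here, because its errno is `ECHILD` rather than `EINTR`. Whether anything similar happened on the affected monitors is an open question in this ticket.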

Sanitized backtrace:

    CrushTester::test_with_fork(int)
    OSDMonitor::prepare_new_pool(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, unsigned int, unsigned int, unsigned long, unsigned long, float, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, unsigned long, OSDMonitor::FastReadType, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::ostream*)
    OSDMonitor::prepare_command_impl(boost::intrusive_ptr<MonOpRequest>, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, boost::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > >, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, boost::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > > > > > const&)
    OSDMonitor::prepare_command(boost::intrusive_ptr<MonOpRequest>)
    OSDMonitor::prepare_update(boost::intrusive_ptr<MonOpRequest>)
    PaxosService::dispatch(boost::intrusive_ptr<MonOpRequest>)
    Monitor::handle_command(boost::intrusive_ptr<MonOpRequest>)
    Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)
    Monitor::_ms_dispatch(Message*)
    Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)
    DispatchQueue::entry()
    DispatchQueue::DispatchThread::entry()
    clone()

Crash dump sample:
{
    "assert_condition": "(*__errno_location ()) == 4",
    "assert_file": "common/fork_function.h",
    "assert_func": "int fork_function(int, std::ostream&, std::function<signed char()>)",
    "assert_line": 36,
    "assert_msg": "common/fork_function.h: In function 'int fork_function(int, std::ostream&, std::function<signed char()>)' thread 7fcbbdf3d700 time 2021-08-09T17:28:57.198441+0800\ncommon/fork_function.h: 36: FAILED ceph_assert((*__errno_location ()) == 4)",
    "assert_thread_name": "ms_dispatch",
    "backtrace": [
        "(()+0x12dd0) [0x7fcbca3c1dd0]",
        "(gsignal()+0x10f) [0x7fcbc902670f]",
        "(abort()+0x127) [0x7fcbc9010b25]",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7fcbcc8c1d61]",
        "(()+0x27af2a) [0x7fcbcc8c1f2a]",
        "(CrushTester::test_with_fork(int)+0x796) [0x7fcbcce38ea6]",
        "(OSDMonitor::prepare_new_pool(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, unsigned int, unsigned int, unsigned long, unsigned long, float, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, unsigned long, OSDMonitor::FastReadType, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::ostream*)+0x42d) [0x555ef317742d]",
        "(OSDMonitor::prepare_command_impl(boost::intrusive_ptr<MonOpRequest>, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, boost::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > >, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, boost::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > > > > > const&)+0x19d10) [0x555ef3199ae0]",
        "(OSDMonitor::prepare_command(boost::intrusive_ptr<MonOpRequest>)+0xf3) [0x555ef31a32d3]",
        "(OSDMonitor::prepare_update(boost::intrusive_ptr<MonOpRequest>)+0x373) [0x555ef31ab6f3]",
        "(PaxosService::dispatch(boost::intrusive_ptr<MonOpRequest>)+0xa6d) [0x555ef3132e0d]",
        "(Monitor::handle_command(boost::intrusive_ptr<MonOpRequest>)+0x4fc5) [0x555ef3032c85]",
        "(Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0xb36) [0x555ef3034e66]",
        "(Monitor::_ms_dispatch(Message*)+0x6a6) [0x555ef30361e6]",
        "(Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x5c) [0x555ef306514c]",
        "(DispatchQueue::entry()+0x126a) [0x7fcbccae273a]",
        "(DispatchQueue::DispatchThread::entry()+0x11) [0x7fcbccb85771]",
        "(()+0x82de) [0x7fcbca3b72de]",
        "(clone()+0x43) [0x7fcbc90eae83]" 
    ],
    "ceph_version": "15.2.13",
    "crash_id": "2021-08-09T09:28:57.221704Z_072951df-87c2-4d50-b0b6-c36e878a89a4",
    "entity_name": "mon.cc2ea5dd4e702b94d398013f71628774f641093f",
    "os_id": "centos",
    "os_name": "CentOS Linux",
    "os_version": "8 (Core)",
    "os_version_id": "8",
    "process_name": "ceph-mon",
    "stack_sig": "7a8bd7fbc442ee0139b78523e37d7d9450b5af67da77b679cfccf88299426d41",
    "timestamp": "2021-08-09T09:28:57.221704Z",
    "utsname_machine": "x86_64",
    "utsname_release": "4.18.0-193.6.3.el8_2.x86_64",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP Wed Jun 10 11:09:32 UTC 2020" 
}
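
To compare a raw dump like the one above with a sanitized backtrace, it can help to strip the offsets and addresses from each frame. The following is a minimal sketch under stated assumptions: `frame_symbol` and `FRAME_RE` are hypothetical names, the sample is a shortened copy of the dump above, and this is not the telemetry pipeline's actual sanitization logic.

```python
import json
import re

# Matches frames of the form "(symbol+0xOFFSET) [0xADDRESS]".
FRAME_RE = re.compile(r"^\((?P<symbol>.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")


def frame_symbol(frame):
    """Return the symbol of a backtrace frame, or None if it has none."""
    match = FRAME_RE.match(frame.strip())
    if match is None:
        return None
    symbol = match.group("symbol")
    # Frames such as "(()+0x12dd0) [...]" carry no resolved symbol.
    return None if symbol == "()" else symbol


# Shortened sample taken from the crash dump in this comment.
SAMPLE = """
{
    "assert_condition": "(*__errno_location ()) == 4",
    "assert_line": 36,
    "backtrace": [
        "(()+0x12dd0) [0x7fcbca3c1dd0]",
        "(gsignal()+0x10f) [0x7fcbc902670f]",
        "(CrushTester::test_with_fork(int)+0x796) [0x7fcbcce38ea6]",
        "(clone()+0x43) [0x7fcbc90eae83]"
    ],
    "ceph_version": "15.2.13"
}
"""

dump = json.loads(SAMPLE)
symbols = []
for frame in dump["backtrace"]:
    symbol = frame_symbol(frame)
    if symbol is not None:
        symbols.append(symbol)

for symbol in symbols:
    print(symbol)
```

The same loop can be pointed at a full dump loaded from a file; frames without a resolved symbol are skipped.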

Actions #9

Updated by Telemetry Bot about 2 years ago

  • Crash signature (v1) updated (diff)
  • Affected Versions v15.2.15 added
Actions #10

Updated by Telemetry Bot about 2 years ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)
Actions #11

Updated by Radoslaw Zarzynski over 1 year ago

  • Related to Bug #57782: [mon] high cpu usage by fn_monstore thread added