Bug #54655

open

crash: DispatchQueue::fast_dispatch(boost::intrusive_ptr<Message> const&)

Added by Telemetry Bot about 2 years ago. Updated about 2 years ago.

Status: New
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: Telemetry
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):

6069af4770da1299ba48e42d3994cdc175c6dd7dab381799f3d16abe17b16a06
f6e468d012d0fc9625d2f5cb53d8a20bed6281ec03638e9c8a057fae48d61c02


Description

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=8679e58416d08221eda3fb4867d684b533f7f5f071324c48d5cacb6750f02020

Sanitized backtrace:

    DispatchQueue::fast_dispatch(boost::intrusive_ptr<Message> const&)
    ProtocolV2::handle_message()
    ProtocolV2::handle_read_frame_dispatch()
    ProtocolV2::_handle_read_frame_epilogue_main()
    ProtocolV2::handle_read_frame_epilogue_main(std::unique_ptr<ceph::buffer::ptr_node, ceph::buffer::ptr_node::disposer>&&, int)
    ProtocolV2::run_continuation(Ct<ProtocolV2>&)
    AsyncConnection::process()
    EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)

Crash dump sample:
{
    "backtrace": [
        "/lib64/libpthread.so.0(+0x12c20) [0x7f2168e2cc20]",
        "/lib64/librados.so.2(+0x6a778) [0x7f217319e778]",
        "/lib64/librados.so.2(+0xf4e15) [0x7f2173228e15]",
        "/lib64/librados.so.2(+0xf5d0b) [0x7f2173229d0b]",
        "/lib64/librados.so.2(+0xf9a76) [0x7f217322da76]",
        "(DispatchQueue::fast_dispatch(boost::intrusive_ptr<Message> const&)+0x190) [0x7f21699dc4f0]",
        "(ProtocolV2::handle_message()+0x12b6) [0x7f2169abde76]",
        "(ProtocolV2::handle_read_frame_dispatch()+0x258) [0x7f2169acfab8]",
        "(ProtocolV2::_handle_read_frame_epilogue_main()+0x95) [0x7f2169acfbb5]",
        "(ProtocolV2::handle_read_frame_epilogue_main(std::unique_ptr<ceph::buffer::v15_2_0::ptr_node, ceph::buffer::v15_2_0::ptr_node::disposer>&&, int)+0x204) [0x7f2169ad1104]",
        "(ProtocolV2::run_continuation(Ct<ProtocolV2>&)+0x3c) [0x7f2169ab952c]",
        "(AsyncConnection::process()+0x789) [0x7f2169a81a39]",
        "(EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0xcb7) [0x7f2169adbb47]",
        "/usr/lib64/ceph/libceph-common.so.2(+0x5bf05c) [0x7f2169ae205c]",
        "/lib64/libstdc++.so.6(+0xc2ba3) [0x7f2167e52ba3]",
        "/lib64/libpthread.so.0(+0x817f) [0x7f2168e2217f]",
        "clone()" 
    ],
    "ceph_version": "16.2.7",
    "crash_id": "2022-03-11T17:02:35.478370Z_1de64a44-7bf8-4311-b62a-117ebc8e77f3",
    "entity_name": "client.347ac0c88fb6472a5e74d2afcd8246509b974553",
    "os_id": "centos",
    "os_name": "CentOS Stream",
    "os_version": "8",
    "os_version_id": "8",
    "process_name": "radosgw",
    "stack_sig": "6069af4770da1299ba48e42d3994cdc175c6dd7dab381799f3d16abe17b16a06",
    "timestamp": "2022-03-11T17:02:35.478370Z",
    "utsname_machine": "x86_64",
    "utsname_release": "5.4.0-100-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#113-Ubuntu SMP Thu Feb 3 18:43:29 UTC 2022" 
}
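
For comparing a raw crash dump against the sanitized backtrace above, a minimal Python sketch follows. It is not the telemetry pipeline's actual sanitizer; `symbol_frames` and both regular expressions are hypothetical names introduced here. It assumes frames shaped like the ones in this sample: it keeps only frames that carry a symbol name, drops the offset and address, and strips versioned inline namespaces such as `v15_2_0`.

```python
import re

# Hypothetical helper for illustration only; the real telemetry sanitizer may differ.
# Matches frames of the form "(<symbol>+0x<offset>) [0x<address>]".
_SYMBOL_FRAME = re.compile(r"^\((?P<sym>.+)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")
# Matches versioned inline namespaces such as "::v15_2_0" when followed by "::".
_VERSION_NS = re.compile(r"::v\d+_\d+_\d+(?=::)")


def symbol_frames(backtrace):
    """Return symbol names from a crash-dump backtrace, skipping unresolved frames."""
    names = []
    for frame in backtrace:
        match = _SYMBOL_FRAME.match(frame.strip())
        if match is None:
            # Library-only frames ("/lib64/librados.so.2(+0x6a778) [...]") and
            # bare entries such as "clone()" carry no symbol plus offset.
            continue
        names.append(_VERSION_NS.sub("", match.group("sym")))
    return names
```

Feeding it the `"backtrace"` array from the JSON above (for example via `json.load`) should leave only the `DispatchQueue`, `ProtocolV2`, `AsyncConnection` and `EventCenter` frames, which makes it easier to line up dumps from different builds.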

Actions #1

Updated by Telemetry Bot about 2 years ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)
  • Affected Versions v16.2.4, v16.2.6, v16.2.7 added
Actions #2

Updated by Mark Kogan about 2 years ago

  • Crash signature (v1) updated (diff)

In a log from an occurrence of a similar issue, there was an indication of a disconnection from the mon
(errno 107, ENOTCONN /* Transport endpoint is not connected */):

                         vv
   -17> 2022-04-11 10:17:01.208 7f91a77fe700 10 monclient: tick
   -16> 2022-04-11 10:17:04.776 7f919effd700 10 monclient: tick
   -15> 2022-04-11 10:17:09.780 7f91a4ff9700 10 monclient: _renew_subs
   -14> 2022-04-11 10:17:09.780 7f91a4ff9700 10 monclient: _send_mon_message to mon.dell-r630-002 at v2:10.1.8.84:3300/0
   -13> 2022-04-11 10:17:11.208 7f91a77fe700 10 monclient: tick
   -12> 2022-04-11 10:17:14.777 7f919effd700 10 monclient: tick
   -11> 2022-04-11 10:17:21.207 7f91a77fe700 10 monclient: tick
   -10> 2022-04-11 10:17:24.777 7f919effd700 10 monclient: tick
    -9> 2022-04-11 10:17:31.208 7f91a77fe700 10 monclient: tick
                         ^^  stuck 30 sec ?
    -8> 2022-04-11 10:17:33.061 7f91a57fa700 -1 RGWWatcher::handle_error cookie 94328575821408 err (107) Transport endpoint is not connected
#                                                                                                   ^^^
#                                                                       errno.h: #define    ENOTCONN    107    /* Transport endpoint is not connected */
    -7> 2022-04-11 10:17:33.061 7f91a57fa700  2 removed watcher, disabling cache
    -6> 2022-04-11 10:17:33.061 7f91a57fa700 -1 RGWWatcher::handle_error cookie 94328575831040 err (107) Transport endpoint is not connected
    -5> 2022-04-11 10:17:33.284 7f91c2c1ef80  5 asok(0x55ca95292050) unregister_commands sync trace active
    -4> 2022-04-11 10:17:33.284 7f91c2c1ef80  5 asok(0x55ca95292050) unregister_commands sync trace active_short
    -3> 2022-04-11 10:17:33.284 7f91c2c1ef80  5 asok(0x55ca95292050) unregister_commands sync trace history
    -2> 2022-04-11 10:17:33.284 7f91c2c1ef80  5 asok(0x55ca95292050) unregister_commands sync trace show
    -1> 2022-04-11 10:17:33.285 7f91c2c1ef80  5 asok(0x55ca95292050) unregister_command cr dump
     0> 2022-04-11 10:17:33.382 7f91af6f7700 -1 *** Caught signal (Aborted) **
 in thread 7f91af6f7700 thread_name:msgr-worker-1

 ceph version 14.2.11-184.8.2.hotfix.bz2037990.el7cp (4303acac0af62cf645fc49b978bc51daa92a4e66) nautilus (stable)
 1: (()+0xf630) [0x7f91b86dd630]
 2: (gsignal()+0x37) [0x7f91b55bc387]
 3: (abort()+0x148) [0x7f91b55bda78]
 4: (()+0x78ed7) [0x7f91b55feed7]
 5: (()+0x81299) [0x7f91b5607299]
 6: (()+0x94f9a) [0x7f91c2286f9a]
 7: (()+0x93e52) [0x7f91c2285e52]
 8: (()+0xc4b9a) [0x7f91c22b6b9a]
 9: (()+0xcc426) [0x7f91c22be426]
 10: (()+0xcd09b) [0x7f91c22bf09b]
 11: (()+0xea002) [0x7f91c22dc002]
 12: (DispatchQueue::fast_dispatch(boost::intrusive_ptr<Message> const&)+0x69a) [0x7f91b8d6f53a]
 13: (ProtocolV2::handle_message()+0xe99) [0x7f91b8e5abc9]
 14: (ProtocolV2::handle_read_frame_dispatch()+0x240) [0x7f91b8e6af90]
 15: (ProtocolV2::_handle_read_frame_epilogue_main()+0x95) [0x7f91b8e6b045]
 16: (ProtocolV2::handle_read_frame_epilogue_main(std::unique_ptr<ceph::buffer::v14_2_0::ptr_node, ceph::buffer::v14_2_0::ptr_node::disposer>&&, int)+0x90) [0x7f91b8e6c220]
 17: (ProtocolV2::run_continuation(Ct<ProtocolV2>&)+0x34) [0x7f91b8e569c4]
 18: (AsyncConnection::process()+0x186) [0x7f91b8e26d06]
 19: (EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0xa15) [0x7f91b8e76375]
 20: (()+0x592cd7) [0x7f91b8e7ccd7]
 21: (()+0x8338ff) [0x7f91b911d8ff]
 22: (()+0x7ea5) [0x7f91b86d5ea5]
 23: (clone()+0x6d) [0x7f91b568496d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
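
To check the "stuck 30 sec ?" question against the timestamps, a small Python sketch like the one below could compute the spacing between consecutive `monclient: tick` lines for each thread. `tick_intervals` and the regular expression are hypothetical names, and the line layout (`<timestamp> <thread> <level> monclient: tick`) is assumed from the excerpt above, not from any documented log format.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed line layout, taken from the log excerpt:
#   "-17> 2022-04-11 10:17:01.208 7f91a77fe700 10 monclient: tick"
_TICK = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<thread>[0-9a-f]+)\s+\d+ monclient: tick\s*$"
)


def tick_intervals(lines):
    """Map thread id -> seconds between its consecutive 'monclient: tick' lines."""
    stamps = defaultdict(list)
    for line in lines:
        match = _TICK.search(line)
        if match:
            stamps[match.group("thread")].append(
                datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            )
    # Pair each timestamp with the next one from the same thread.
    return {
        thread: [round((b - a).total_seconds(), 3) for a, b in zip(ts, ts[1:])]
        for thread, ts in stamps.items()
    }
```

On the excerpt above this gives intervals of about 10 seconds for each of the two ticking threads (7f91a77fe700 and 7f919effd700), so a gap would show up as a noticeably larger value in one of the lists.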