Bug #61874
mgr: DaemonServer::ms_handle_authentication acquires daemon locks
% Done:
100%
Source:
Development
Tags:
backport_processed
Backport:
reef,quincy,pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Description
This method can block while holding the entire EventCenter lock:
Thread 3 (Thread 0x7feeff18d700 (LWP 822150)):
#0 0x00007fef0339081d in __lll_lock_wait () from target:/lib64/libpthread.so.0
#1 0x00007fef03389ac9 in pthread_mutex_lock () from target:/lib64/libpthread.so.0
#2 0x00005610fa986d17 in std::mutex::lock() ()
#3 0x00005610fa9dee2c in DaemonServer::ms_handle_authentication(Connection*) ()
#4 0x00007fef04906e55 in MonClient::handle_auth_request(Connection*, AuthConnectionMeta*, bool, unsigned int, ceph::buffer::v15_2_0::list const&, ceph::buffer::v15_2_0::list*) () from target:/usr/lib64/ceph/libceph-common.so.2
#5 0x00007fef0489165f in ProtocolV2::_handle_auth_request(ceph::buffer::v15_2_0::list&, bool) () from target:/usr/lib64/ceph/libceph-common.so.2
#6 0x00007fef0489261e in ProtocolV2::handle_auth_request_more(ceph::buffer::v15_2_0::list&) () from target:/usr/lib64/ceph/libceph-common.so.2
#7 0x00007fef0489b0c3 in ProtocolV2::handle_frame_payload() () from target:/usr/lib64/ceph/libceph-common.so.2
#8 0x00007fef0489b380 in ProtocolV2::handle_read_frame_dispatch() () from target:/usr/lib64/ceph/libceph-common.so.2
#9 0x00007fef0489b575 in ProtocolV2::_handle_read_frame_epilogue_main() () from target:/usr/lib64/ceph/libceph-common.so.2
#10 0x00007fef0489b622 in ProtocolV2::_handle_read_frame_segment() () from target:/usr/lib64/ceph/libceph-common.so.2
#11 0x00007fef0489c781 in ProtocolV2::handle_read_frame_segment(std::unique_ptr<ceph::buffer::v15_2_0::ptr_node, ceph::buffer::v15_2_0::ptr_node::disposer>&&, int) () from target:/usr/lib64/ceph/libceph-common.so.2
#12 0x00007fef04884eec in ProtocolV2::run_continuation(Ct<ProtocolV2>&) () from target:/usr/lib64/ceph/libceph-common.so.2
#13 0x00007fef0484d3f9 in AsyncConnection::process() () from target:/usr/lib64/ceph/libceph-common.so.2
#14 0x00007fef048a7507 in EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*) () from target:/usr/lib64/ceph/libceph-common.so.2
#15 0x00007fef048ada1c in std::_Function_handler<void (), NetworkStack::add_thread(unsigned int)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from target:/usr/lib64/ceph/libceph-common.so.2
#16 0x00007fef027c2ba3 in execute_native_thread_routine () from target:/lib64/libstdc++.so.6
#17 0x00007fef033871cf in start_thread () from target:/lib64/libpthread.so.0
#18 0x00007fef01ddadd3 in clone () from target:/lib64/libc.so.6
If there is a weak deadlock on DaemonServer::lock, the entire messenger hangs. This can result in a real deadlock like #61869.
In general, these fast messenger methods (like ::ms_fast_dispatch) must not acquire any locks.
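To illustrate the rule above, here is a minimal sketch of one way a fast-path handler can avoid taking a lock. This is not the change from the linked pull request and not Ceph code: SketchServer, update_allowed_caps, handle_authentication and the capability bitmask are all hypothetical names made up for the example. The assumption is that the state the fast path needs is small enough to be published through an atomic by the slow path.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a daemon server; NOT Ceph's DaemonServer.
class SketchServer {
public:
  // Slow path: runs outside the messenger event loop, so it may take the lock.
  void update_allowed_caps(std::uint64_t caps) {
    std::lock_guard<std::mutex> guard(lock_);
    // ... other state guarded by lock_ would be updated here ...
    allowed_caps_.store(caps, std::memory_order_release);
  }

  // Fast path: called from the event-loop thread, so it must not block.
  // It reads only the atomically published snapshot and never touches lock_.
  bool handle_authentication(std::uint64_t requested_caps) const {
    const std::uint64_t allowed =
        allowed_caps_.load(std::memory_order_acquire);
    // Accept only if every requested bit is present in the allowed set.
    return (requested_caps & ~allowed) == 0;
  }

  // Exposed only so a caller can demonstrate that the fast path
  // still completes while the lock is held elsewhere.
  std::mutex& lock() { return lock_; }

private:
  std::mutex lock_;
  std::atomic<std::uint64_t> allowed_caps_{0};
};
```

With this shape, a thread that holds lock() for a long time delays only the slow path; handle_authentication still returns immediately, so the event loop keeps servicing other connections. Whether the real handler's state can be reduced to something publishable this way is a separate question that this sketch does not answer.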
Related issues
History
#1 Updated by Patrick Donnelly 5 months ago
- Category set to ceph-mgr
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 52292
#2 Updated by Patrick Donnelly 5 months ago
- Related to Bug #61869: pybind/cephfs: holds GIL during rmdir added
#3 Updated by Patrick Donnelly 3 months ago
- Status changed from Fix Under Review to Pending Backport
#4 Updated by Backport Bot 3 months ago
- Copied to Backport #62607: quincy: mgr: DaemonServer::ms_handle_authentication acquires daemon locks added
#5 Updated by Backport Bot 3 months ago
- Copied to Backport #62608: pacific: mgr: DaemonServer::ms_handle_authentication acquires daemon locks added
#6 Updated by Backport Bot 3 months ago
- Copied to Backport #62609: reef: mgr: DaemonServer::ms_handle_authentication acquires daemon locks added
#7 Updated by Backport Bot 3 months ago
- Tags set to backport_processed
#8 Updated by Konstantin Shalygin about 1 month ago
- Status changed from Pending Backport to Resolved
- % Done changed from 0 to 100