Bug #9096

Updated by Loïc Dachary over 9 years ago

The crash can be reproduced by running *qa/workunits/cephtool/test.sh -t mon_osd* a few times (fewer than 5). One or more OSDs eventually crash, all with the following stack trace:


  

 <pre> 
  ceph version 0.83-655-ga006fe4 (a006fe4a7df3e0020325a5d9cf2956545b4cac47) 
  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x95) [0x18e9cb1] 
  2: (Mutex::Lock(bool)+0x14e) [0x188eb78] 
  3: "sources":https://github.com/ceph/ceph/blob/0479db8c2b7e67d990fc3edac73a39cace1ff9f2/src/osd/OSD.cc#L6786 (OSD::require_same_peer_instance(std::tr1::shared_ptr<OpRequest>&, std::tr1::shared_ptr<OSDMap const>&)+0x396) [0x12e55f8] *acquire lock again* 
  4: "sources":https://github.com/ceph/ceph/blob/0479db8c2b7e67d990fc3edac73a39cace1ff9f2/src/osd/OSD.cc#L6805 (OSD::require_up_osd_peer(std::tr1::shared_ptr<OpRequest>&, std::tr1::shared_ptr<OSDMap const>&, unsigned int)+0x8c) [0x12e5760] 
  5: "sources":https://github.com/ceph/ceph/blob/0479db8c2b7e67d990fc3edac73a39cace1ff9f2/src/osd/OSD.cc#L8146 (void OSD::handle_replica_op<MOSDSubOp, 76>(std::tr1::shared_ptr<OpRequest>&, std::tr1::shared_ptr<OSDMap const>&)+0x349) [0x132eda3] 
  6: "sources":https://github.com/ceph/ceph/blob/0479db8c2b7e67d990fc3edac73a39cace1ff9f2/src/osd/OSD.cc#L5754 (OSD::dispatch_op_fast(std::tr1::shared_ptr<OpRequest>&, std::tr1::shared_ptr<OSDMap const>&)+0x198) [0x12db4e8] 
  7: "sources":https://github.com/ceph/ceph/blob/0479db8c2b7e67d990fc3edac73a39cace1ff9f2/src/osd/OSD.cc#L5415 (OSD::dispatch_session_waiting(OSD::Session*, std::tr1::shared_ptr<OSDMap const>)+0x11f) [0x12d9085] *assert lock is in place* 
  8: "sources":https://github.com/ceph/ceph/blob/0479db8c2b7e67d990fc3edac73a39cace1ff9f2/src/osd/OSD.cc#L5518 (OSD::ms_fast_dispatch(Message*)+0x14a) [0x12d9a76] *acquire lock* 
  9: (Messenger::ms_fast_dispatch(Message*)+0x74) [0x19e76a8] 
  10: (DispatchQueue::fast_dispatch(Message*)+0x3e) [0x19e6306] 
  11: (Pipe::reader()+0x1a5d) [0x1a07211] 
  12: (Pipe::Reader::entry()+0x1c) [0x1a0e63e] 
  13: (Thread::entry_wrapper()+0x79) [0x18d2451] 
  14: (Thread::_entry_func(void*)+0x18) [0x18d23ce] 
  15: (()+0x8182) [0x7fdf6d055182] 
  16: (clone()+0x6d) [0x7fdf6b64330d] 
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 

 
 </pre> 
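The annotations on frames 8, 7 and 3 suggest that one thread takes a lock, checks that it is held, and then tries to take it again further down the call chain. The C++ sketch below illustrates that pattern only. It is not Ceph code: @CheckedMutex@, @Session@ and the three functions are hypothetical stand-ins, and an exception replaces the assertion failure seen in @Mutex::Lock@.

```cpp
#include <atomic>
#include <mutex>
#include <stdexcept>
#include <thread>

// Hypothetical non-recursive mutex that detects a second lock() call
// from the thread that already owns it. Not Ceph's Mutex class.
class CheckedMutex {
 public:
  void lock() {
    if (owner_.load() == std::this_thread::get_id())
      throw std::logic_error("recursive lock on non-recursive mutex");
    m_.lock();
    owner_.store(std::this_thread::get_id());
  }
  void unlock() {
    owner_.store(std::thread::id());
    m_.unlock();
  }
  bool is_locked_by_me() const {
    return owner_.load() == std::this_thread::get_id();
  }

 private:
  std::mutex m_;
  std::atomic<std::thread::id> owner_{std::thread::id()};
};

// Hypothetical session object guarded by one lock.
struct Session {
  CheckedMutex lock;
};

// Plays the role of frame 3: takes the lock itself ("acquire lock again").
bool require_same_peer(Session& s) {
  std::lock_guard<CheckedMutex> guard(s.lock);
  return true;
}

// Plays the role of frame 7: expects the caller to hold the lock.
bool dispatch_waiting(Session& s) {
  if (!s.lock.is_locked_by_me())
    throw std::logic_error("lock not held by caller");
  return require_same_peer(s);
}

// Plays the role of frame 8: takes the lock, then dispatches.
bool fast_dispatch(Session& s) {
  std::lock_guard<CheckedMutex> guard(s.lock);
  return dispatch_waiting(s);
}
```

Calling @fast_dispatch(s)@ on a fresh @Session s@ throws @std::logic_error@ from the innermost function, because the outer guard still owns the lock when the inner guard tries to take it. In this sketch the error goes away if the inner function relies on the caller's lock, or if the outer function releases the lock before dispatching. Which of these, if either, applies to the actual OSD code is not stated in the report.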
