Bug #1635
Status: closed
osd hit suicide timeout in heartbeat_map thread
% Done: 0%
Description
This was while thrashing with radosbench, during peering, with osds 3 and 6 marked out.
From teuthology:~teuthworker/archive/nightly_coverage_2011-10-19/680/remote/ubuntu@sepia32.ceph.dreamhost.com/log/osd.0.log.gz:
2011-10-19 13:13:53.070817 7ff4bf2a7700 heartbeat_map is_healthy 'OSD::op_tp thread 0x7ff4b208b700' had suicide timed out after 300
common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, const char*, time_t)', in thread '0x7ff4bf2a7700'
common/HeartbeatMap.cc: 78: FAILED assert(0 == "hit suicide timeout")
 ceph version 0.36-327-g3e92aac (commit:3e92aace21ecc766f14ac5a2c6377570988f1a3b)
 1: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, char const*, long)+0x3ad) [0x7808cd]
 2: (ceph::HeartbeatMap::is_healthy()+0x8f) [0x78202f]
 3: (ceph::HeartbeatMap::check_touch_file()+0x28) [0x782438]
 4: (CephContextServiceThread::entry()+0x77) [0x672d57]
 5: (Thread::_entry_func(void*)+0x12) [0x615372]
 6: (()+0x7971) [0x7ff4c0d17971]
 7: (clone()+0x6d) [0x7ff4bf5a792d]
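For readers unfamiliar with the pattern behind this assert, here is a minimal sketch of a heartbeat watchdog. It is an illustration only, not the code in common/HeartbeatMap.cc: the names `HeartbeatHandle`, `HeartbeatWatchdog`, `touch`, and `clear` are hypothetical, and the sketch throws an exception where the real daemon asserts and aborts. The idea is that a worker thread (here, an `OSD::op_tp` thread) arms a deadline when it picks up work, and a service thread periodically checks all deadlines. An op thread that stays blocked past the suicide deadline (300 in the log above, presumably seconds) takes the whole process down.

```cpp
#include <cassert>
#include <ctime>
#include <list>
#include <stdexcept>
#include <string>

// Hypothetical per-thread handle; not the real ceph::heartbeat_handle_d.
struct HeartbeatHandle {
    std::string name;
    std::time_t grace_deadline;    // 0 means "not armed"
    std::time_t suicide_deadline;  // 0 means "not armed"
};

class HeartbeatWatchdog {
public:
    // Register a worker thread; std::list keeps the returned reference stable.
    HeartbeatHandle& add_worker(const std::string& name) {
        HeartbeatHandle h;
        h.name = name;
        h.grace_deadline = 0;
        h.suicide_deadline = 0;
        handles_.push_back(h);
        return handles_.back();
    }

    // A worker calls this when it starts, or makes progress on, a work item.
    void touch(HeartbeatHandle& h, std::time_t now,
               std::time_t grace, std::time_t suicide_grace) {
        assert(grace > 0 && suicide_grace >= grace);
        h.grace_deadline = now + grace;
        h.suicide_deadline = now + suicide_grace;
    }

    // A worker calls this when it goes idle, so idle threads are never flagged.
    void clear(HeartbeatHandle& h) {
        h.grace_deadline = 0;
        h.suicide_deadline = 0;
    }

    // The service thread calls this periodically.
    // Returns false if any worker is past its grace deadline.
    // Throws if any worker is past its suicide deadline
    // (assumption: the real daemon asserts and aborts here instead).
    bool is_healthy(std::time_t now) const {
        bool healthy = true;
        for (const HeartbeatHandle& h : handles_) {
            if (h.suicide_deadline != 0 && now > h.suicide_deadline) {
                throw std::runtime_error("'" + h.name + "' hit suicide timeout");
            }
            if (h.grace_deadline != 0 && now > h.grace_deadline) {
                healthy = false;
            }
        }
        return healthy;
    }

private:
    std::list<HeartbeatHandle> handles_;
};
```

Under this model, the backtrace shows the checker thread (`CephContextServiceThread`) rather than the stuck thread itself, so the interesting question for debugging is what the op thread at 0x7ff4b208b700 was blocked on during peering.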