Bug #13749

closed

possible osd messenger deadlock "hit suicide timeout" in rados-infernalis-distro-basic-multi

Added by Yuri Weinstein over 8 years ago. Updated over 7 years ago.

Status:
Can't reproduce
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rados
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/teuthology-2015-11-04_21:00:09-rados-infernalis-distro-basic-multi/
Job: 1138922
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-11-04_21:00:09-rados-infernalis-distro-basic-multi/1138922/teuthology.log

2015-11-09T01:03:30.989 INFO:tasks.ceph.osd.0.plana44.stderr:2015-11-09 01:03:30.980798 7fe932cdb700 -1 osd.0 54 heartbeat_check: no reply from osd.5 since back 2015-11-09 01:01:23.162857 front 2015-11-09 01:01:23.162857 (cutoff 2015-11-09 01:03:10.980796)
2015-11-09T01:03:32.782 INFO:teuthology.orchestra.run.plana44:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd dump --format=json'
2015-11-09T01:03:33.890 INFO:tasks.ceph.osd.0.plana44.stderr:2015-11-09 01:03:33.881198 7fe932cdb700 -1 osd.0 54 heartbeat_check: no reply from osd.5 since back 2015-11-09 01:01:23.162857 front 2015-11-09 01:01:23.162857 (cutoff 2015-11-09 01:03:13.881196)
2015-11-09T01:03:34.990 INFO:tasks.ceph.osd.0.plana44.stderr:2015-11-09 01:03:34.981612 7fe932cdb700 -1 osd.0 54 heartbeat_check: no reply from osd.5 since back 2015-11-09 01:01:23.162857 front 2015-11-09 01:01:23.162857 (cutoff 2015-11-09 01:03:14.981611)
2015-11-09T01:03:35.230 INFO:tasks.ceph.osd.0.plana44.stderr:common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(const ceph::heartbeat_handle_d*, const char*, time_t)' thread 7fe9507ea700 time 2015-11-09 01:03:35.218348
2015-11-09T01:03:35.230 INFO:tasks.ceph.osd.0.plana44.stderr:common/HeartbeatMap.cc: 81: FAILED assert(0 == "hit suicide timeout")
2015-11-09T01:03:35.243 INFO:tasks.ceph.osd.0.plana44.stderr: ceph version 9.2.0 (bb2ecea240f3a1d525bcb35670cb07bd1f0ca299)
2015-11-09T01:03:35.244 INFO:tasks.ceph.osd.0.plana44.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7fe955f8f99b]
2015-11-09T01:03:35.244 INFO:tasks.ceph.osd.0.plana44.stderr: 2: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char const*, long)+0x2a9) [0x7fe955ed5a79]
2015-11-09T01:03:35.244 INFO:tasks.ceph.osd.0.plana44.stderr: 3: (ceph::HeartbeatMap::is_healthy()+0xb6) [0x7fe955ed6236]
2015-11-09T01:03:35.244 INFO:tasks.ceph.osd.0.plana44.stderr: 4: (ceph::HeartbeatMap::check_touch_file()+0x17) [0x7fe955ed69f7]
2015-11-09T01:03:35.244 INFO:tasks.ceph.osd.0.plana44.stderr: 5: (CephContextServiceThread::entry()+0x14b) [0x7fe955fab0cb]
2015-11-09T01:03:35.244 INFO:tasks.ceph.osd.0.plana44.stderr: 6: (()+0x8182) [0x7fe9545af182]
2015-11-09T01:03:35.244 INFO:tasks.ceph.osd.0.plana44.stderr: 7: (clone()+0x6d) [0x7fe9528f738d]
2015-11-09T01:03:35.245 INFO:tasks.ceph.osd.0.plana44.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2015-11-09T01:03:35.245 INFO:tasks.ceph.osd.0.plana44.stderr:2015-11-09 01:03:35.234727 7fe9507ea700 -1 common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(const ceph::heartbeat_handle_d*, const char*, time_t)' thread 7fe9507ea700 time 2015-11-09 01:03:35.218348
2015-11-09T01:03:35.245 INFO:tasks.ceph.osd.0.plana44.stderr:common/HeartbeatMap.cc: 81: FAILED assert(0 == "hit suicide timeout")
2015-11-09T01:03:35.245 INFO:tasks.ceph.osd.0.plana44.stderr:
2015-11-09T01:03:35.245 INFO:tasks.ceph.osd.0.plana44.stderr: ceph version 9.2.0 (bb2ecea240f3a1d525bcb35670cb07bd1f0ca299)
2015-11-09T01:03:35.245 INFO:tasks.ceph.osd.0.plana44.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7fe955f8f99b]
2015-11-09T01:03:35.246 INFO:tasks.ceph.osd.0.plana44.stderr: 2: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char const*, long)+0x2a9) [0x7fe955ed5a79]
2015-11-09T01:03:35.246 INFO:tasks.ceph.osd.0.plana44.stderr: 3: (ceph::HeartbeatMap::is_healthy()+0xb6) [0x7fe955ed6236]
2015-11-09T01:03:35.246 INFO:tasks.ceph.osd.0.plana44.stderr: 4: (ceph::HeartbeatMap::check_touch_file()+0x17) [0x7fe955ed69f7]
2015-11-09T01:03:35.246 INFO:tasks.ceph.osd.0.plana44.stderr: 5: (CephContextServiceThread::entry()+0x14b) [0x7fe955fab0cb]
2015-11-09T01:03:35.246 INFO:tasks.ceph.osd.0.plana44.stderr: 6: (()+0x8182) [0x7fe9545af182]
2015-11-09T01:03:35.246 INFO:tasks.ceph.osd.0.plana44.stderr: 7: (clone()+0x6d) [0x7fe9528f738d]
2015-11-09T01:03:35.247 INFO:tasks.ceph.osd.0.plana44.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
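
For triage, the heartbeat_check lines above can be reduced to two numbers: how long the peer has been silent, and how far the cutoff lags behind the log timestamp. The following is a minimal sketch, not part of teuthology or Ceph. The regular expression and the helper names (`PATTERN`, `parse_ts`, `summarize`) are assumptions based only on the line format pasted in this ticket.

```python
import re
from datetime import datetime

# Timestamp format as it appears in the OSD log lines quoted in this ticket.
TS = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}"

# Hypothetical pattern, inferred from the heartbeat_check lines above.
PATTERN = re.compile(
    rf"(?P<now>{TS}) \S+ -1 (?P<osd>osd\.\d+) \d+ heartbeat_check: "
    rf"no reply from (?P<peer>osd\.\d+) since back (?P<back>{TS}) "
    rf"front (?P<front>{TS}) \(cutoff (?P<cutoff>{TS})\)"
)


def parse_ts(text):
    """Parse an OSD log timestamp into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def summarize(line):
    """Return silence and cutoff lag (seconds) for one heartbeat_check line.

    Returns None when the line is not a heartbeat_check message.
    """
    m = PATTERN.search(line)
    if m is None:
        return None
    now = parse_ts(m["now"])
    back = parse_ts(m["back"])
    cutoff = parse_ts(m["cutoff"])
    return {
        "osd": m["osd"],
        "peer": m["peer"],
        # Time since the last reply on the back interface.
        "silent_for_s": (now - back).total_seconds(),
        # Gap between the log timestamp and the reported cutoff.
        "cutoff_lag_s": (now - cutoff).total_seconds(),
    }


if __name__ == "__main__":
    sample = (
        "2015-11-09T01:03:30.989 INFO:tasks.ceph.osd.0.plana44.stderr:"
        "2015-11-09 01:03:30.980798 7fe932cdb700 -1 osd.0 54 heartbeat_check: "
        "no reply from osd.5 since back 2015-11-09 01:01:23.162857 "
        "front 2015-11-09 01:01:23.162857 (cutoff 2015-11-09 01:03:10.980796)"
    )
    print(summarize(sample))
```

Applied to the first heartbeat_check line in the description, this reports osd.5 as silent for about 128 seconds against a cutoff lag of about 20 seconds, so osd.0 had been without replies from osd.5 for roughly two minutes before the assert fired.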

Files

backtraces.txt (87.6 KB) Samuel Just, 11/10/2015 05:51 PM