Bug #4116

closed

common/HeartbeatMap.cc: 79: FAILED assert(0 == "hit suicide timeout")

Added by Evan Felix about 11 years ago. Updated about 11 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: OSD
Target version: -
% Done: 0%
Source: Community (user)
Tags: 0.56.2
Backport:
Regression:
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I have been experiencing slowness lately, with lots of messages in ceph -w about operations taking a long time. During this I keep losing OSDs to a "hit suicide timeout" assert.

ceph version 0.56.2 (586538e22afba85c59beda49789ec42024e7a061)
1: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, char const*, long)+0x307) [0x895387]
2: (ceph::HeartbeatMap::is_healthy()+0xa7) [0x895ba7]
3: (ceph::HeartbeatMap::check_touch_file()+0x28) [0x896098]
4: (CephContextServiceThread::entry()+0x5d) [0x7b871d]
5: (()+0x7851) [0x7ff2fbd26851]
6: (clone()+0x6d) [0x7ff2fa38511d]

That is the stack trace, but judging from other bugs, this assert is probably a symptom of other issues rather than the root cause.
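For anyone reading the trace without the source at hand, here is a minimal sketch of the watchdog pattern that the HeartbeatMap frames suggest: each worker thread arms a grace deadline and a longer suicide deadline, and a service thread periodically checks them. This is a simplified model written for illustration, not the Ceph code; the class layout, the injectable clock, and the 15 s / 150 s values are assumptions.

```python
import time


class HeartbeatHandle:
    """Per-worker record of the deadlines the worker has promised to meet."""

    def __init__(self, name):
        self.name = name
        self.grace_deadline = 0.0    # 0 means "no deadline armed"
        self.suicide_deadline = 0.0


class HeartbeatMap:
    """Simplified watchdog model (hypothetical, not the Ceph implementation)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock           # injectable so the sketch can be tested
        self.handles = []

    def add_worker(self, name):
        handle = HeartbeatHandle(name)
        self.handles.append(handle)
        return handle

    def reset_timeout(self, handle, grace, suicide_grace):
        # Called by a worker when it starts a unit of work.
        now = self.clock()
        handle.grace_deadline = now + grace
        handle.suicide_deadline = now + suicide_grace

    def clear_timeout(self, handle):
        # Called by a worker when the unit of work is finished.
        handle.grace_deadline = 0.0
        handle.suicide_deadline = 0.0

    def _check(self, handle, now):
        healthy = True
        if handle.grace_deadline and now > handle.grace_deadline:
            healthy = False          # slow, but still tolerated
        if handle.suicide_deadline and now > handle.suicide_deadline:
            # Stuck far too long: abort the whole process.
            raise AssertionError("hit suicide timeout")
        return healthy

    def is_healthy(self):
        now = self.clock()
        return all([self._check(h, now) for h in self.handles])


if __name__ == "__main__":
    fake_now = [100.0]
    hm = HeartbeatMap(clock=lambda: fake_now[0])
    worker = hm.add_worker("op_tp")
    hm.reset_timeout(worker, grace=15, suicide_grace=150)
    fake_now[0] = 120.0              # past the grace deadline only
    print(hm.is_healthy())
```

In this model the assert fires in the checking thread, not in the stuck worker, which is consistent with the trace above showing only the service thread. Finding the real cause means looking at what the worker threads were doing at the time.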

I am attaching logs and objdump files. If there is more debugging I can turn on that would help determine the issue, please advise.

osd.8 uses ~13 GB of memory before crashing, whereas the other OSDs normally use about 2 GB.


Files

osd.0.log.gz (170 KB): osd.0 log. Evan Felix, 02/13/2013 10:48 AM
osd.8.log.gz (205 KB): osd.8 log. Evan Felix, 02/13/2013 10:48 AM
cepd-osd.objdump.gz (10.1 MB): 0.56.2 objdump. Evan Felix, 02/13/2013 10:48 AM
osd.5.log.gz (2.65 MB): log for 5. Evan Felix, 02/14/2013 10:51 AM
osd.5.log.gz (2.65 MB): log for 5. Evan Felix, 02/14/2013 11:22 AM
gdb.txt (226 KB): pg dump, and backtrace. Evan Felix, 02/18/2013 09:58 AM
gdb.txt (51.7 KB): full back trace of all threads. Evan Felix, 02/18/2013 10:44 AM
osd.5.log.10.gz (3.16 MB): debug log with more debug. Evan Felix, 02/18/2013 02:16 PM
pg4.0.log (3.87 KB): pg 4.0 query. Evan Felix, 02/19/2013 09:43 AM
ceph-osd.37.log.gz (2.85 MB): osd.37 suicide timeout. Wido den Hollander, 03/29/2013 02:43 PM

Related issues 1 (0 open, 1 closed)

Related to Ceph - Fix #4192: osd: fix log trimming (Resolved, Sage Weil, 02/19/2013)
