Bug #14021

osd coredump on rhel7.1

Added by Clive Xu over 8 years ago. Updated about 8 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite: ceph-deploy
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We found two core dumps in the log file after the OSD had been running for some time. Can anyone help check this?
coredump1:
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: /usr/bin/ceph-osd() [0xac5642]
2: (()+0xf130) [0x7faee962d130]
3: (gsignal()+0x37) [0x7faee80475d7]
4: (abort()+0x148) [0x7faee8048cc8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7faee894b9b5]
6: (()+0x5e926) [0x7faee8949926]
7: (()+0x5e953) [0x7faee8949953]
8: (()+0x5eb73) [0x7faee8949b73]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x27a) [0xbc583a]
10: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, char const*, long)+0x2d9) [0xafb449]
11: (ceph::HeartbeatMap::reset_timeout(ceph::heartbeat_handle_d*, long, long)+0x89) [0xafb769]
12: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x39b) [0x694c5b]
13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x86f) [0xbb529f]
14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xbb73d0]
15: (()+0x7df5) [0x7faee9625df5]
16: (clone()+0x6d) [0x7faee81081ad]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

coredump2:
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xbc5645]
2: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, char const*, long)+0x2d9) [0xafb449]
3: (ceph::HeartbeatMap::is_healthy()+0xde) [0xafbd3e]
4: (ceph::HeartbeatMap::check_touch_file()+0x2c) [0xafc45c]
5: (CephContextServiceThread::entry()+0x15b) [0xbd587b]
6: (()+0x7df5) [0x7fe2727abdf5]
7: (clone()+0x6d) [0x7fe27128e1ad]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
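
The two traces can be compared mechanically to see which functions they share. The following is a minimal sketch, assuming frames keep the `N: (symbol+0xoffset) [0xaddress]` layout shown above; the helper names `parse_frames` and `common_functions` are hypothetical and not part of Ceph.

```python
import re

# Matches frames such as
#   "10: (ceph::HeartbeatMap::_check(...)+0x2d9) [0xafb449]"
#   "1: /usr/bin/ceph-osd() [0xac5642]"
FRAME_RE = re.compile(r"^\s*(\d+):\s+(.*?)\s+\[(0x[0-9a-f]+)\]\s*$")


def parse_frames(trace):
    """Return a list of (index, function_name, address) tuples.

    function_name is an empty string for unresolved frames like "(()+0xf130)".
    """
    frames = []
    for line in trace.splitlines():
        m = FRAME_RE.match(line)
        if not m:
            continue  # skip version banners, NOTE lines, blank lines
        body = m.group(2)
        # Drop the parentheses that wrap resolved symbols.
        if body.startswith("(") and body.endswith(")"):
            body = body[1:-1]
        # Drop the trailing "+0x..." offset, then the argument list.
        body = re.sub(r"\+0x[0-9a-f]+$", "", body)
        name = body.split("(")[0]
        frames.append((int(m.group(1)), name, m.group(3)))
    return frames


def common_functions(trace_a, trace_b):
    """Return the sorted function names that appear in both traces."""
    names_a = {name for _, name, _ in parse_frames(trace_a) if name}
    names_b = {name for _, name, _ in parse_frames(trace_b) if name}
    return sorted(names_a & names_b)
```

Applied to coredump1 and coredump2, the shared frames include `ceph::HeartbeatMap::_check` and `ceph::__ceph_assert_fail`, which suggests both aborts come from the same heartbeat check rather than from two unrelated faults.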


Files

coredump1.log (5.27 KB) Clive Xu, 12/08/2015 08:01 AM
coredump2.log (21 KB) Clive Xu, 12/08/2015 08:01 AM
Actions #1

Updated by Samuel Just over 8 years ago

This looks like a thread timed out. You'll want to reproduce with logging and determine which thread.
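
One way to find the offending thread after reproducing with more verbose logging (for example a higher `debug osd` level) is to scan the OSD log for heartbeat timeout messages and count them per thread. This is only a sketch: it assumes the log contains lines of the form `heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x...' had timed out after 15`, which should be checked against the real log, and `count_timeouts` is a hypothetical helper rather than a Ceph tool.

```python
import re
from collections import Counter

# Assumed message layout (verify against your own log):
#   heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f...' had timed out after 15
#   heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f...' had suicide timed out after 150
TIMEOUT_RE = re.compile(
    r"heartbeat_map \w+ '(?P<name>[^']+) thread (?P<tid>0x[0-9a-f]+)' "
    r"had (?P<kind>suicide timed out|timed out) after (?P<secs>\d+)"
)


def count_timeouts(lines):
    """Count timeout messages per (thread pool name, thread id, kind)."""
    counts = Counter()
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            counts[(m.group("name"), m.group("tid"), m.group("kind"))] += 1
    return counts


def report(path):
    """Print the threads with the most timeout messages in the log at path."""
    with open(path, errors="replace") as log:
        for (name, tid, kind), n in count_timeouts(log).most_common():
            print(f"{n:6d}  {name} thread {tid}  ({kind})")
```

Calling `report()` on the OSD log (the path depends on the deployment) lists the threads that timed out most often first. A thread that shows a run of ordinary timeouts followed by a suicide timeout is the likely one behind the abort.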

Actions #2

Updated by Samuel Just about 8 years ago

  • Status changed from New to Closed