Bug #14021
Closed
osd coredump on rhel7.1
Description
We found two core dumps in the log file after the OSD had been running for some time. Can anyone help check this?
coredump1:
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: /usr/bin/ceph-osd() [0xac5642]
2: (()+0xf130) [0x7faee962d130]
3: (gsignal()+0x37) [0x7faee80475d7]
4: (abort()+0x148) [0x7faee8048cc8]
5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7faee894b9b5]
6: (()+0x5e926) [0x7faee8949926]
7: (()+0x5e953) [0x7faee8949953]
8: (()+0x5eb73) [0x7faee8949b73]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x27a) [0xbc583a]
10: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, char const*, long)+0x2d9) [0xafb449]
11: (ceph::HeartbeatMap::reset_timeout(ceph::heartbeat_handle_d*, long, long)+0x89) [0xafb769]
12: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x39b) [0x694c5b]
13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x86f) [0xbb529f]
14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xbb73d0]
15: (()+0x7df5) [0x7faee9625df5]
16: (clone()+0x6d) [0x7faee81081ad]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
coredump2:
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0xbc5645]
2: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, char const*, long)+0x2d9) [0xafb449]
3: (ceph::HeartbeatMap::is_healthy()+0xde) [0xafbd3e]
4: (ceph::HeartbeatMap::check_touch_file()+0x2c) [0xafc45c]
5: (CephContextServiceThread::entry()+0x15b) [0xbd587b]
6: (()+0x7df5) [0x7fe2727abdf5]
7: (clone()+0x6d) [0x7fe27128e1ad]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Samuel Just over 8 years ago
This looks like a thread timed out. You'll want to reproduce with logging and determine which thread.
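
To illustrate why a timed-out thread ends in an abort, here is a minimal Python sketch of a heartbeat map. It is a simplified model, not Ceph's actual code: the class and method names only loosely mirror the symbols in the traces above (`HeartbeatMap::_check`, `reset_timeout`, `is_healthy`), and the injectable `clock` parameter and the grace values in the usage note are assumptions made for illustration.

```python
import time


class HeartbeatHandle:
    """Per-thread deadline record (hypothetical stand-in for heartbeat_handle_d)."""

    def __init__(self, name):
        self.name = name
        self.timeout = 0.0          # 0 means "not armed"
        self.suicide_timeout = 0.0  # 0 means "not armed"


class HeartbeatMap:
    def __init__(self, clock=time.monotonic):
        self.clock = clock          # injectable so the sketch can be tested
        self.handles = []

    def add_worker(self, name):
        handle = HeartbeatHandle(name)
        self.handles.append(handle)
        return handle

    def _check(self, handle, who, now):
        healthy = True
        # Soft deadline missed: report it and mark the thread unhealthy.
        if handle.timeout and handle.timeout < now:
            print(f"{who} '{handle.name}' had timed out")
            healthy = False
        # Hard deadline missed: fail an assertion, which in the real
        # daemon aborts the process and leaves a backtrace.
        if handle.suicide_timeout and handle.suicide_timeout < now:
            raise AssertionError(f"{who} '{handle.name}' hit suicide timeout")
        return healthy

    def reset_timeout(self, handle, grace, suicide_grace):
        # A worker calls this before each unit of work. The old deadlines
        # are checked first, so a worker that returns too late trips the
        # assertion itself.
        now = self.clock()
        self._check(handle, "reset_timeout", now)
        handle.timeout = now + grace
        handle.suicide_timeout = now + suicide_grace if suicide_grace else 0.0

    def is_healthy(self):
        # A separate service thread calls this periodically and inspects
        # every registered worker.
        now = self.clock()
        return all([self._check(h, "is_healthy", now) for h in self.handles])
```

In this model, a worker registers with `add_worker("op worker 0")` and calls `reset_timeout(handle, 15, 150)` before each request (illustrative values). The two traces then correspond to the two callers of `_check`: in coredump1 the late worker hits the assertion in its own `reset_timeout` call, and in coredump2 the service thread hits it from `is_healthy` while a worker is still stuck. In both cases the message passed to `_check` carries the name of the stalled thread, which is what logging during a reproduction should reveal.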