Bug #24835

closed

osd daemon spontaneous segfault

Added by Christian Schlittchen almost 6 years ago. Updated about 4 years ago.

Status: Can't reproduce
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We are seeing spontaneous segmentation faults of OSD daemons in our Mimic production cluster:

Jul  9 14:54:13 ceph-osd-a-02 ceph-osd[34879]: *** Caught signal (Segmentation fault) **
Jul  9 14:54:13 ceph-osd-a-02 kernel: [1211417.797932] show_signal_msg: 5 callbacks suppressed
Jul  9 14:54:13 ceph-osd-a-02 kernel: [1211417.797938] msgr-worker-2[34887]: segfault at 0 ip 00007ff3d231dbcb sp 00007ff3cc5a5680 error 4 in libtcmalloc.so.4.3.0[7ff3d22f9000+45000]
Jul  9 14:54:13 ceph-osd-a-02 ceph-osd[34879]:  in thread 7ff3cc5b0700 thread_name:msgr-worker-2
Jul  9 14:54:13 ceph-osd-a-02 systemd[1]: ceph-osd@20.service: Main process exited, code=killed, status=11/SEGV
Jul  9 14:54:13 ceph-osd-a-02 systemd[1]: ceph-osd@20.service: Failed with result 'signal'.
Jul  9 14:54:33 ceph-osd-a-02 systemd[1]: ceph-osd@20.service: Service hold-off time over, scheduling restart.
Jul  9 14:54:33 ceph-osd-a-02 systemd[1]: ceph-osd@20.service: Scheduled restart job, restart counter is at 1.
Jul  9 14:54:33 ceph-osd-a-02 systemd[1]: Stopped Ceph object storage daemon osd.20.
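
The kernel line in the log above already carries enough information for a first triage: the faulting address, the instruction pointer, and the base address of the mapped library. The following is a minimal sketch, not part of the original report, that decodes such a line in Python. The function name `decode_segfault` is hypothetical, the regular expression assumes the x86 Linux message format shown above, and the error-code interpretation assumes the x86 page-fault bits (bit 0: page present, bit 1: write access, bit 2: user mode).

```python
import re

# Matches the x86 Linux kernel "segfault at ..." message format seen in the log.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return a dict describing a kernel segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    err = int(m.group("err"), 16)
    return {
        "thread": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_addr": int(m.group("addr"), 16),
        "object": m.group("obj"),
        # Offset of the faulting instruction relative to the start of the mapping.
        "offset": ip - base,
        # Assumed x86 page-fault error-code bits.
        "page_present": bool(err & 0x1),
        "access": "write" if err & 0x2 else "read",
        "user_mode": bool(err & 0x4),
    }


if __name__ == "__main__":
    line = (
        "Jul  9 14:54:13 ceph-osd-a-02 kernel: [1211417.797938] "
        "msgr-worker-2[34887]: segfault at 0 ip 00007ff3d231dbcb "
        "sp 00007ff3cc5a5680 error 4 in libtcmalloc.so.4.3.0[7ff3d22f9000+45000]"
    )
    info = decode_segfault(line)
    print(info["object"], hex(info["offset"]), info["access"], hex(info["fault_addr"]))
```

For the line above, this gives an offset of 0x24bcb inside libtcmalloc.so.4.3.0 and a user-mode read of address 0 on a page that is not present, which is consistent with a null pointer dereference inside the allocator. If matching debug symbols are installed, and assuming the mapping shown starts at the beginning of the library file, the offset could be passed to a tool such as addr2line to find the function involved.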

Files

ceph-osd.20.log.1 (101 KB), Christian Schlittchen, 07/10/2018 12:13 PM