Bug #19490

closed

segfault when flushing journal

Added by Ryan Anstey about 7 years ago. Updated about 7 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
OSD
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

# service ceph-osd@9 stop
# ceph-osd -i 9 --flush-journal
*** Caught signal (Segmentation fault) **
 in thread 7f316ee1f700 thread_name:ceph-osd
 ceph version 10.2.4-211-g12b091b (12b091b4a40947aa43919e71a318ed0dcedc8734)
 1: (()+0x9072f2) [0x55e41aa1e2f2]
 2: (()+0x10b00) [0x7f3173882b00]
2017-04-04 09:15:45.696166 7f316ee1f700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f316ee1f700 thread_name:ceph-osd

 ceph version 10.2.4-211-g12b091b (12b091b4a40947aa43919e71a318ed0dcedc8734)
 1: (()+0x9072f2) [0x55e41aa1e2f2]
 2: (()+0x10b00) [0x7f3173882b00]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

     0> 2017-04-04 09:15:45.696166 7f316ee1f700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f316ee1f700 thread_name:ceph-osd

 ceph version 10.2.4-211-g12b091b (12b091b4a40947aa43919e71a318ed0dcedc8734)
 1: (()+0x9072f2) [0x55e41aa1e2f2]
 2: (()+0x10b00) [0x7f3173882b00]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Segmentation fault (core dumped)

If you need any other information, please let me know; I'm not very familiar with reporting issues.

Actions #1

Updated by Ryan Anstey about 7 years ago

I should mention that I'm working on a recurring issue with inconsistent PGs appearing since upgrading to Jewel. I found that one of the bad PGs is on this server, but I can't flush the journal, which is one of the steps for fixing the problem.

Actions #2

Updated by Ryan Anstey about 7 years ago

Unfortunately, this seems to be intermittent. I reproduced the problem multiple times before taking the time to submit this report, but after I started the OSD and stopped it again, the journal now flushes properly.

Actions #3

Updated by Greg Farnum about 7 years ago

  • Status changed from New to Can't reproduce

Unfortunately, there's not enough info in that backtrace to go on, and I haven't seen anything like this elsewhere. If you see it again, set "debug osd = 20" and "debug filestore = 20" in your OSD's ceph.conf and try it again; that should get us going.
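
For reference, the two debug settings suggested above might look like the following in ceph.conf. This is only a sketch: placing them under an [osd.9] section is an assumption based on the OSD id used in the description, and a general [osd] section would apply them to every OSD on the host instead.

```
[osd.9]
    ; raise OSD and FileStore logging to maximum verbosity before retrying the flush
    ; (the [osd.9] section name is assumed from the OSD id in the report)
    debug osd = 20
    debug filestore = 20
```

With these in place, rerunning the flush command from the description should produce a much more detailed log to attach to this ticket. The settings can be removed once the log has been captured, since level 20 output is very verbose.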
