Bug #19925

Bluestore osd crashes on start

Added by K Jarrett almost 7 years ago. Updated almost 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I'm seeing an issue very similar to #16278: a Bluestore OSD crashes on start with the following stack trace:

2017-05-14 01:41:22.076972 7fb7d500d8c0 -1 *** Caught signal (Aborted) **
 in thread 7fb7d500d8c0 thread_name:ceph-osd

 ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)
 1: (()+0x9770ae) [0x5584161fb0ae]
 2: (()+0x11390) [0x7fb7d3eca390]
 3: (gsignal()+0x38) [0x7fb7d1e67428]
 4: (abort()+0x16a) [0x7fb7d1e6902a]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x26b) [0x5584162fb54b]
 6: (OSDService::get_map(unsigned int)+0x5d) [0x558415c7252d]
 7: (OSD::init()+0x1f91) [0x558415c21161]
 8: (main()+0x2ea5) [0x558415b92dc5]
 9: (__libc_start_main()+0xf0) [0x7fb7d1e52830]
 10: (_start()+0x29) [0x558415bd4459]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
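As a first troubleshooting step, the named frames can be pulled out of the backtrace to see where the abort originates. The following is only a minimal sketch: the helper names `parse_frames` and `named_frames` are hypothetical, and the regular expression assumes the frame layout shown in the trace above.

```python
import re

# Frame lines are assumed to look like:
#  6: (OSDService::get_map(unsigned int)+0x5d) [0x558415c7252d]
FRAME_RE = re.compile(r"^\s*(\d+): \((.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]\s*$")

# A few frames copied from the trace in this report.
SAMPLE = """\
 1: (()+0x9770ae) [0x5584161fb0ae]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x26b) [0x5584162fb54b]
 6: (OSDService::get_map(unsigned int)+0x5d) [0x558415c7252d]
 7: (OSD::init()+0x1f91) [0x558415c21161]
"""

def parse_frames(text):
    """Return (frame_number, symbol) pairs; symbol is None when unresolved."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            symbol = m.group(2)
            # Unresolved frames are printed as "()", with no symbol name.
            frames.append((int(m.group(1)), None if symbol == "()" else symbol))
    return frames

def named_frames(text):
    """Keep only the frames that carry a symbol name."""
    return [(n, s) for n, s in parse_frames(text) if s is not None]

if __name__ == "__main__":
    for number, symbol in named_frames(SAMPLE):
        print(number, symbol)
```

Read this way, the trace shows an assertion failing inside `OSDService::get_map`, called from `OSD::init`, so the failure happens while the OSD is initialising rather than during normal I/O.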

I suspect this could be related to data corruption, but I don't know for sure. Shortly before this, I was observing PG degradation on the pool. Currently, 2 of my 3 OSD nodes fail to start with this stack trace, so only a single OSD is live:

ID WEIGHT  TYPE NAME        UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.00027 root default
-2 0.00009     host ceph-01
 0 0.00009         osd.0       down        0          1.00000
-3 0.00009     host ceph-02
 1 0.00009         osd.1         up  1.00000          1.00000
-4 0.00009     host ceph-03
 2 0.00009         osd.2       down        0          1.00000
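When checking this across more than a handful of OSDs, the down OSDs can be picked out of the tree listing mechanically. This is a rough sketch that assumes the plain-text column layout shown above (ID, WEIGHT, NAME, UP/DOWN, ...); the function name `down_osds` is hypothetical and not part of any Ceph tool.

```python
# Tree listing copied from this report.
SAMPLE_TREE = """\
ID WEIGHT  TYPE NAME        UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.00027 root default
-2 0.00009     host ceph-01
 0 0.00009         osd.0       down        0          1.00000
-3 0.00009     host ceph-02
 1 0.00009         osd.1         up  1.00000          1.00000
-4 0.00009     host ceph-03
 2 0.00009         osd.2       down        0          1.00000
"""

def down_osds(tree_text):
    """Return the names of OSDs whose UP/DOWN column reads 'down'."""
    result = []
    for line in tree_text.splitlines():
        fields = line.split()
        # OSD rows split into: ID, WEIGHT, osd.N, UP/DOWN, REWEIGHT, PRIMARY-AFFINITY.
        # Header, root and host rows do not have "osd." in the third field.
        if len(fields) >= 4 and fields[2].startswith("osd.") and fields[3] == "down":
            result.append(fields[2])
    return result

if __name__ == "__main__":
    print(down_osds(SAMPLE_TREE))
```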

As a result, ceph health detail is reporting every PG as undersized+degraded+peered, with only a single node acting.

How can I best troubleshoot the stack trace that the OSD is outputting? I've attached the relevant log output, but I can't spot anything that explains the crash.


Files

ceph-osd.2.log (102 KB) K Jarrett, 05/14/2017 01:55 AM
Actions #1

Updated by K Jarrett almost 7 years ago

I've just noticed that I'm running 10.2.7, and there have been a lot of enhancements to Bluestore since then. I thought I had installed 11.x.x, but it would appear not!

I understand that Bluestore can't be upgraded between 10.x and 11.x, so I may need to destroy my current setup.

Actions #2

Updated by Greg Farnum almost 7 years ago

  • Status changed from New to Closed