Bug #16400

closed

Ceph OSD crashes suddenly after restart when using bluestore

Added by Yuri Gorshkov almost 8 years ago. Updated over 7 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: OSD
Target version: -
% Done: 0%
Source: Community (user)
Tags: bluestore
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi there.

We've set up a small 4-node Bluestore cluster for testing, and I noticed that an OSD sometimes fails to start after a shutdown, hitting a failed assertion immediately. Here's a snippet from the OSD stdout; the full OSD log is attached:

2016-06-21 18:14:22.956887 7f376e0bd800 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
2016-06-21 18:14:22.957140 7f376e0bd800 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
2016-06-21 18:14:22.957229 7f376e0bd800 -1 WARNING: experimental feature 'bluestore' is enabled
Please be aware that this feature is experimental, untested,
unsupported, and may result in data corruption, data loss,
and/or irreparable damage to your cluster. Do not use
feature with important data.

starting osd.33 at :/0 osd_data /var/lib/ceph/osd/cephsml-33 /var/lib/ceph/osd/cephsml-33/journal
2016-06-21 18:14:22.979703 7f376e0bd800 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
2016-06-21 18:14:23.033441 7f376e0bd800 -1 WARNING: experimental feature 'rocksdb' is enabled
Please be aware that this feature is experimental, untested,
unsupported, and may result in data corruption, data loss,
and/or irreparable damage to your cluster. Do not use
feature with important data.

2016-06-21 18:14:24.024998 7f376e0bd800 -1 WARNING: experimental feature 'rocksdb' is enabled
Please be aware that this feature is experimental, untested,
unsupported, and may result in data corruption, data loss,
and/or irreparable damage to your cluster. Do not use
feature with important data.

osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f376e0bd800 time 2016-06-21 18:14:24.088885
osd/OSD.h: 885: FAILED assert(ret)
ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f376eaea5b5]
2: (OSDService::get_map(unsigned int)+0x3d) [0x7f376e4c893d]
3: (OSD::init()+0x1fe2) [0x7f376e47bdb2]
4: (main()+0x2c55) [0x7f376e3dfbe5]
5: (__libc_start_main()+0xf5) [0x7f376afceb15]
6: (()+0x353009) [0x7f376e42a009]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2016-06-21 18:14:24.090582 7f376e0bd800 -1 osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f376e0bd800 time 2016-06-21 18:14:24.088885
osd/OSD.h: 885: FAILED assert(ret)

ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f376eaea5b5]
2: (OSDService::get_map(unsigned int)+0x3d) [0x7f376e4c893d]
3: (OSD::init()+0x1fe2) [0x7f376e47bdb2]
4: (main()+0x2c55) [0x7f376e3dfbe5]
5: (__libc_start_main()+0xf5) [0x7f376afceb15]
6: (()+0x353009) [0x7f376e42a009]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
-665> 2016-06-21 18:14:22.956887 7f376e0bd800 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
-664> 2016-06-21 18:14:22.957140 7f376e0bd800 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
-660> 2016-06-21 18:14:22.957229 7f376e0bd800 -1 WARNING: experimental feature 'bluestore' is enabled
Please be aware that this feature is experimental, untested,
unsupported, and may result in data corruption, data loss,
and/or irreparable damage to your cluster. Do not use
feature with important data.
-650> 2016-06-21 18:14:22.979703 7f376e0bd800 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
-620> 2016-06-21 18:14:23.033441 7f376e0bd800 -1 WARNING: experimental feature 'rocksdb' is enabled
Please be aware that this feature is experimental, untested,
unsupported, and may result in data corruption, data loss,
and/or irreparable damage to your cluster. Do not use
feature with important data.
-162> 2016-06-21 18:14:24.024998 7f376e0bd800 -1 WARNING: experimental feature 'rocksdb' is enabled
Please be aware that this feature is experimental, untested,
unsupported, and may result in data corruption, data loss,
and/or irreparable damage to your cluster. Do not use
feature with important data.
0> 2016-06-21 18:14:24.090582 7f376e0bd800 -1 osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f376e0bd800 time 2016-06-21 18:14:24.088885
osd/OSD.h: 885: FAILED assert(ret)
ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f376eaea5b5]
2: (OSDService::get_map(unsigned int)+0x3d) [0x7f376e4c893d]
3: (OSD::init()+0x1fe2) [0x7f376e47bdb2]
4: (main()+0x2c55) [0x7f376e3dfbe5]
5: (__libc_start_main()+0xf5) [0x7f376afceb15]
6: (()+0x353009) [0x7f376e42a009]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
*** Caught signal (Aborted) **
    in thread 7f376e0bd800 thread_name:ceph-osd
    ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
    1: (()+0x91341a) [0x7f376e9ea41a]
    2: (()+0xf100) [0x7f376ca20100]
    3: (gsignal()+0x37) [0x7f376afe25f7]
    4: (abort()+0x148) [0x7f376afe3ce8]
    5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x267) [0x7f376eaea797]
    6: (OSDService::get_map(unsigned int)+0x3d) [0x7f376e4c893d]
    7: (OSD::init()+0x1fe2) [0x7f376e47bdb2]
    8: (main()+0x2c55) [0x7f376e3dfbe5]
    9: (__libc_start_main()+0xf5) [0x7f376afceb15]
    10: (()+0x353009) [0x7f376e42a009]
    2016-06-21 18:14:24.095919 7f376e0bd800 -1 *** Caught signal (Aborted) **
    in thread 7f376e0bd800 thread_name:ceph-osd

    ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
    1: (()+0x91341a) [0x7f376e9ea41a]
    2: (()+0xf100) [0x7f376ca20100]
    3: (gsignal()+0x37) [0x7f376afe25f7]
    4: (abort()+0x148) [0x7f376afe3ce8]
    5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x267) [0x7f376eaea797]
    6: (OSDService::get_map(unsigned int)+0x3d) [0x7f376e4c893d]
    7: (OSD::init()+0x1fe2) [0x7f376e47bdb2]
    8: (main()+0x2c55) [0x7f376e3dfbe5]
    9: (__libc_start_main()+0xf5) [0x7f376afceb15]
    10: (()+0x353009) [0x7f376e42a009]
    NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

    0> 2016-06-21 18:14:24.095919 7f376e0bd800 -1 *** Caught signal (Aborted) **
    in thread 7f376e0bd800 thread_name:ceph-osd

    ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
    1: (()+0x91341a) [0x7f376e9ea41a]
    2: (()+0xf100) [0x7f376ca20100]
    3: (gsignal()+0x37) [0x7f376afe25f7]
    4: (abort()+0x148) [0x7f376afe3ce8]
    5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x267) [0x7f376eaea797]
    6: (OSDService::get_map(unsigned int)+0x3d) [0x7f376e4c893d]
    7: (OSD::init()+0x1fe2) [0x7f376e47bdb2]
    8: (main()+0x2c55) [0x7f376e3dfbe5]
    9: (__libc_start_main()+0xf5) [0x7f376afceb15]
    10: (()+0x353009) [0x7f376e42a009]
    NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Aborted
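
For anyone triaging a log like the one above, here is a minimal sketch of pulling out the first failed assertion and its backtrace symbols. It is not part of Ceph: the function name `first_failed_assert` and both regular expressions are assumptions based only on the line shapes visible in this report, and other Ceph versions may format these lines differently.

```python
import re

# Assumed shape of the assertion line, e.g. "osd/OSD.h: 885: FAILED assert(ret)".
ASSERT_RE = re.compile(
    r"^\s*(?P<file>\S+): (?P<line>\d+): FAILED assert\((?P<expr>.*)\)\s*$"
)
# Assumed shape of a backtrace frame, e.g.
# " 2: (OSDService::get_map(unsigned int)+0x3d) [0x7f376e4c893d]".
FRAME_RE = re.compile(
    r"^\s*\d+: \((?P<sym>.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]\s*$"
)


def first_failed_assert(log_text):
    """Return (file, line, expr, frames) for the first failed assert, or None."""
    lines = log_text.splitlines()
    for i, raw in enumerate(lines):
        m = ASSERT_RE.match(raw)
        if not m:
            continue
        frames = []
        for follow in lines[i + 1:]:
            f = FRAME_RE.match(follow)
            if f:
                frames.append(f.group("sym"))
            elif frames:
                break  # the run of numbered frames has ended
        return m.group("file"), int(m.group("line")), m.group("expr"), frames
    return None


# A few lines copied from the report above.
sample = """osd/OSD.h: 885: FAILED assert(ret)
ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f376eaea5b5]
2: (OSDService::get_map(unsigned int)+0x3d) [0x7f376e4c893d]
3: (OSD::init()+0x1fe2) [0x7f376e47bdb2]
NOTE: a copy of the executable is needed to interpret this.
"""

result = first_failed_assert(sample)
print(result)
```

On the sample, the frames show the assertion being reached from `OSD::init()` through `OSDService::get_map(unsigned int)`, matching the trace in the description.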


Files

cephsml-osd.33.log.gz (21.7 KB), Yuri Gorshkov, 06/21/2016 03:13 PM
osdlog.tar (210 KB), Yuri Gorshkov, 06/22/2016 03:22 PM
cephsml-osd.1.log (311 KB), Yuri Gorshkov, 08/11/2016 12:21 PM