Bug #18389

closed

crash when opening bluefs superblock

Added by Bassam Tabbara over 7 years ago. Updated over 6 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We have a cluster based on 11.1.0 that we use for performance testing. The tests run for a whole day. We're seeing a number of core dumps generated with BlueStore. They all seem to occur when opening the superblock with BlueFS. The OSDs seem to recover, however, and start correctly after a few tries. Here are two backtraces that we see:

#0  raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:58
#1  0x000000000185c4ec in reraise_fatal (signum=11)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/global/signal_handler.cc:72
#2  handle_fatal_signal (signum=11) at /home/rook/go/src/github.com/rook/rook/ceph/src/global/signal_handler.cc:134
#3  <signal handler called>
#4  0x0000000001a7da33 in BlueFS::_replay (this=this@entry=0xcdd2000, noop=noop@entry=false)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueFS.cc:628
#5  0x0000000001a80991 in BlueFS::mount (this=0xcdd2000)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueFS.cc:364
#6  0x0000000001a17e0a in BlueStore::_open_db (this=this@entry=0xcb46000, create=create@entry=false)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueStore.cc:3324
#7  0x0000000001a495dc in BlueStore::mount (this=0xcb46000)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueStore.cc:3983
#8  0x00000000016d34e7 in OSD::init (this=0xccb2000)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/osd/OSD.cc:2042
#9  0x00000000014b9d67 in cephd_osd (argc=<optimized out>, argv=<optimized out>)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/ceph_osd.cc:611
#10 0x0000000000dfb215 in cephd_run_osd (argc=<optimized out>, argv=<optimized out>)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/libcephd/libcephd.cc:237
#11 0x0000000000c1213a in _cgo_49d41b80151a_Cfunc_cephd_run_osd (v=0xc4201a5b28)
    at /home/rook/go/src/github.com/rook/rook/pkg/cephmgr/cephd/cephd.go:204
#12 0x00000000004f36f0 in runtime.asmcgocall () at /usr/local/go/src/runtime/asm_amd64.s:590
#13 0x000000000049846d in runtime.cgocall (fn=0x7ffff67cc250, arg=0x325e580 <runtime.g0>, ~r2=-159595968)
    at /usr/local/go/src/runtime/cgocall.go:115
#14 0x000000000325e500 in runtime.work ()
#15 0x00007ffff67cc250 in ?? ()
#16 0x000000000325e580 in ?? ()
#17 0x00007ffff67cc240 in ?? ()
#18 0x00000000004c62e4 in runtime.mstart () at /usr/local/go/src/runtime/proc.go:1096
#19 0x00000000004f1845 in runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:156
#20 0x000000000000000b in ?? ()
#21 0x00007ffff67cc3a8 in ?? ()
#22 0x000000000000000b in ?? ()
#23 0x00007ffff67cc3a8 in ?? ()
#24 0x0000000000400598 in __rela_iplt_start ()
#25 0x00000000022bab16 in generic_start_main ()
#26 0x00000000022bad9e in __libc_start_main ()
#27 0x000000000049307a in _start ()

#0  raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58
#1  0x000000000185c4ec in reraise_fatal (signum=6)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/global/signal_handler.cc:72
#2  handle_fatal_signal (signum=6) at /home/rook/go/src/github.com/rook/rook/ceph/src/global/signal_handler.cc:134
#3  <signal handler called>
#4  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:58
#5  0x00000000022cbb9a in abort ()
#6  0x000000000226e31d in __gnu_cxx::__verbose_terminate_handler() ()
#7  0x00000000021dfcb6 in __cxxabiv1::__terminate(void (*)()) ()
#8  0x00000000021dfd01 in std::terminate() ()
#9  0x00000000021e05e9 in __cxa_throw ()
#10 0x0000000001da6ca3 in bluefs_super_t::decode (this=this@entry=0xcbb80e8, p=...)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/bluefs_types.cc:73
#11 0x0000000001a6a1e3 in decode (p=..., c=...)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/bluefs_types.h:77
#12 BlueFS::_open_super (this=this@entry=0xcbb8000)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueFS.cc:458
#13 0x0000000001a808a1 in BlueFS::mount (this=0xcbb8000)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueFS.cc:352
#14 0x0000000001a17e0a in BlueStore::_open_db (this=this@entry=0xc92c000, create=create@entry=false)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueStore.cc:3324
#15 0x0000000001a495dc in BlueStore::mount (this=0xc92c000)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueStore.cc:3983
#16 0x00000000016d34e7 in OSD::init (this=0xca98000)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/osd/OSD.cc:2042
#17 0x00000000014b9d67 in cephd_osd (argc=<optimized out>, argv=<optimized out>)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/ceph_osd.cc:611
#18 0x0000000000dfb215 in cephd_run_osd (argc=<optimized out>, argv=<optimized out>)
    at /home/rook/go/src/github.com/rook/rook/ceph/src/libcephd/libcephd.cc:237
#19 0x0000000000c1213a in _cgo_49d41b80151a_Cfunc_cephd_run_osd (v=0xc4201a5b28)
    at /home/rook/go/src/github.com/rook/rook/pkg/cephmgr/cephd/cephd.go:204
#20 0x00000000004f36f0 in runtime.asmcgocall () at /usr/local/go/src/runtime/asm_amd64.s:590
#21 0x000000000049846d in runtime.cgocall (fn=0x7ffe0ac80a80, arg=0x325e580 <runtime.g0>, ~r2=180882032)
    at /usr/local/go/src/runtime/cgocall.go:115
#22 0x000000000325e500 in runtime.work ()
#23 0x00007ffe0ac80a80 in ?? ()
#24 0x000000000325e580 in ?? ()
#25 0x00007ffe0ac80a70 in ?? ()
#26 0x00000000004c62e4 in runtime.mstart () at /usr/local/go/src/runtime/proc.go:1096
#27 0x00000000004f1845 in runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:156
#28 0x000000000000000b in ?? ()
#29 0x00007ffe0ac80bd8 in ?? ()
#30 0x000000000000000b in ?? ()
#31 0x00007ffe0ac80bd8 in ?? ()
#32 0x0000000000400598 in __rela_iplt_start ()
#33 0x00000000022bab16 in generic_start_main ()
#34 0x00000000022bad9e in __libc_start_main ()
#35 0x000000000049307a in _start ()
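
The second trace ends in std::terminate() right after __cxa_throw from bluefs_super_t::decode, which is the usual shape of a C++ exception that nothing caught. The sketch below only illustrates that pattern; it is not Ceph's code and does not claim to be the cause of this bug. ToySuper, DecodeError, decode_super and open_super are hypothetical names, and the -5 return value is an arbitrary EIO-style placeholder.

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an on-disk superblock; NOT Ceph's bluefs_super_t.
struct ToySuper {
    uint32_t version = 0;
    uint32_t block_size = 0;
};

// Thrown when the buffer ends before the structure is fully decoded.
struct DecodeError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Decodes a ToySuper from raw bytes and throws on a short buffer.
// If no caller catches the exception, the C++ runtime calls
// std::terminate(), which by default aborts the process (signal 6).
ToySuper decode_super(const std::vector<uint8_t>& buf) {
    if (buf.size() < 2 * sizeof(uint32_t)) {
        throw DecodeError("buffer too short for superblock");
    }
    ToySuper s;
    std::memcpy(&s.version, buf.data(), sizeof(uint32_t));
    std::memcpy(&s.block_size, buf.data() + sizeof(uint32_t), sizeof(uint32_t));
    return s;
}

// Wraps decode_super so a bad read becomes an error code instead of an abort.
// Returns 0 on success and -5 (illustrative value) on a decode failure.
int open_super(const std::vector<uint8_t>& buf, ToySuper* out, std::string* err) {
    try {
        *out = decode_super(buf);
        return 0;
    } catch (const DecodeError& e) {
        if (err) *err = e.what();
        return -5;
    }
}
```

With a wrapper like this, a caller could treat a nonzero return from open_super as a failed mount and log the message, rather than letting the exception reach std::terminate().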

Related issues 1 (0 open, 1 closed)

Related to bluestore - Bug #25098: Bluestore OSD failed to start with `bluefs_types.h: 54: FAILED assert(pos <= end)` | Resolved | Radoslaw Zarzynski | 07/25/2018

Actions #1

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category set to Correctness/Safety
  • Component(RADOS) BlueStore added
Actions #2

Updated by Greg Farnum over 6 years ago

  • Project changed from RADOS to bluestore
  • Category deleted (Correctness/Safety)
Actions #3

Updated by Sage Weil over 6 years ago

  • Status changed from New to Can't reproduce
Actions #4

Updated by Radoslaw Zarzynski over 5 years ago

  • Related to Bug #25098: Bluestore OSD failed to start with `bluefs_types.h: 54: FAILED assert(pos <= end)` added