Bug #18389
crash when opening bluefs superblock (closed)
Status: Can't reproduce
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
We have a cluster based on 11.1.0 that we use for performance testing. The tests run for a whole day, and we are seeing a number of core dumps generated with BlueStore. They all seem to occur when opening the superblock with BlueFS. The OSDs seem to recover, however, and start correctly after a few tries. Here are the two backtraces that we see:
#0 raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:58
#1 0x000000000185c4ec in reraise_fatal (signum=11) at /home/rook/go/src/github.com/rook/rook/ceph/src/global/signal_handler.cc:72
#2 handle_fatal_signal (signum=11) at /home/rook/go/src/github.com/rook/rook/ceph/src/global/signal_handler.cc:134
#3 <signal handler called>
#4 0x0000000001a7da33 in BlueFS::_replay (this=this@entry=0xcdd2000, noop=noop@entry=false) at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueFS.cc:628
#5 0x0000000001a80991 in BlueFS::mount (this=0xcdd2000) at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueFS.cc:364
#6 0x0000000001a17e0a in BlueStore::_open_db (this=this@entry=0xcb46000, create=create@entry=false) at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueStore.cc:3324
#7 0x0000000001a495dc in BlueStore::mount (this=0xcb46000) at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueStore.cc:3983
#8 0x00000000016d34e7 in OSD::init (this=0xccb2000) at /home/rook/go/src/github.com/rook/rook/ceph/src/osd/OSD.cc:2042
#9 0x00000000014b9d67 in cephd_osd (argc=<optimized out>, argv=<optimized out>) at /home/rook/go/src/github.com/rook/rook/ceph/src/ceph_osd.cc:611
#10 0x0000000000dfb215 in cephd_run_osd (argc=<optimized out>, argv=<optimized out>) at /home/rook/go/src/github.com/rook/rook/ceph/src/libcephd/libcephd.cc:237
#11 0x0000000000c1213a in _cgo_49d41b80151a_Cfunc_cephd_run_osd (v=0xc4201a5b28) at /home/rook/go/src/github.com/rook/rook/pkg/cephmgr/cephd/cephd.go:204
#12 0x00000000004f36f0 in runtime.asmcgocall () at /usr/local/go/src/runtime/asm_amd64.s:590
#13 0x000000000049846d in runtime.cgocall (fn=0x7ffff67cc250, arg=0x325e580 <runtime.g0>, ~r2=-159595968) at /usr/local/go/src/runtime/cgocall.go:115
#14 0x000000000325e500 in runtime.work ()
#15 0x00007ffff67cc250 in ?? ()
#16 0x000000000325e580 in ?? ()
#17 0x00007ffff67cc240 in ?? ()
#18 0x00000000004c62e4 in runtime.mstart () at /usr/local/go/src/runtime/proc.go:1096
#19 0x00000000004f1845 in runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:156
#20 0x000000000000000b in ?? ()
#21 0x00007ffff67cc3a8 in ?? ()
#22 0x000000000000000b in ?? ()
#23 0x00007ffff67cc3a8 in ?? ()
#24 0x0000000000400598 in __rela_iplt_start ()
#25 0x00000000022bab16 in generic_start_main ()
#26 0x00000000022bad9e in __libc_start_main ()
#27 0x000000000049307a in _start ()

#0 raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58
#1 0x000000000185c4ec in reraise_fatal (signum=6) at /home/rook/go/src/github.com/rook/rook/ceph/src/global/signal_handler.cc:72
#2 handle_fatal_signal (signum=6) at /home/rook/go/src/github.com/rook/rook/ceph/src/global/signal_handler.cc:134
#3 <signal handler called>
#4 raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:58
#5 0x00000000022cbb9a in abort ()
#6 0x000000000226e31d in __gnu_cxx::__verbose_terminate_handler() ()
#7 0x00000000021dfcb6 in __cxxabiv1::__terminate(void (*)()) ()
#8 0x00000000021dfd01 in std::terminate() ()
#9 0x00000000021e05e9 in __cxa_throw ()
#10 0x0000000001da6ca3 in bluefs_super_t::decode (this=this@entry=0xcbb80e8, p=...) at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/bluefs_types.cc:73
#11 0x0000000001a6a1e3 in decode (p=..., c=...) at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/bluefs_types.h:77
#12 BlueFS::_open_super (this=this@entry=0xcbb8000) at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueFS.cc:458
#13 0x0000000001a808a1 in BlueFS::mount (this=0xcbb8000) at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueFS.cc:352
#14 0x0000000001a17e0a in BlueStore::_open_db (this=this@entry=0xc92c000, create=create@entry=false) at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueStore.cc:3324
#15 0x0000000001a495dc in BlueStore::mount (this=0xc92c000) at /home/rook/go/src/github.com/rook/rook/ceph/src/os/bluestore/BlueStore.cc:3983
#16 0x00000000016d34e7 in OSD::init (this=0xca98000) at /home/rook/go/src/github.com/rook/rook/ceph/src/osd/OSD.cc:2042
#17 0x00000000014b9d67 in cephd_osd (argc=<optimized out>, argv=<optimized out>) at /home/rook/go/src/github.com/rook/rook/ceph/src/ceph_osd.cc:611
#18 0x0000000000dfb215 in cephd_run_osd (argc=<optimized out>, argv=<optimized out>) at /home/rook/go/src/github.com/rook/rook/ceph/src/libcephd/libcephd.cc:237
#19 0x0000000000c1213a in _cgo_49d41b80151a_Cfunc_cephd_run_osd (v=0xc4201a5b28) at /home/rook/go/src/github.com/rook/rook/pkg/cephmgr/cephd/cephd.go:204
#20 0x00000000004f36f0 in runtime.asmcgocall () at /usr/local/go/src/runtime/asm_amd64.s:590
#21 0x000000000049846d in runtime.cgocall (fn=0x7ffe0ac80a80, arg=0x325e580 <runtime.g0>, ~r2=180882032) at /usr/local/go/src/runtime/cgocall.go:115
#22 0x000000000325e500 in runtime.work ()
#23 0x00007ffe0ac80a80 in ?? ()
#24 0x000000000325e580 in ?? ()
#25 0x00007ffe0ac80a70 in ?? ()
#26 0x00000000004c62e4 in runtime.mstart () at /usr/local/go/src/runtime/proc.go:1096
#27 0x00000000004f1845 in runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:156
#28 0x000000000000000b in ?? ()
#29 0x00007ffe0ac80bd8 in ?? ()
#30 0x000000000000000b in ?? ()
#31 0x00007ffe0ac80bd8 in ?? ()
#32 0x0000000000400598 in __rela_iplt_start ()
#33 0x00000000022bab16 in generic_start_main ()
#34 0x00000000022bad9e in __libc_start_main ()
#35 0x000000000049307a in _start ()
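
In the second backtrace, __cxa_throw calls std::terminate() directly, which is what the C++ runtime does when a thrown exception finds no handler; the resulting abort() is the signal 6 at the top of the trace. The sketch below illustrates only that general mechanism. It is not Ceph code: toy_super_t and open_super_checked are hypothetical names, and the 12-byte layout is an assumption made for the example. It shows a decoder that throws on a short buffer, and a caller that turns the exception into an error code instead of letting it escape.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an on-disk superblock; NOT Ceph's bluefs_super_t.
struct toy_super_t {
  uint64_t version = 0;
  uint32_t block_size = 0;

  // Throws std::out_of_range when the buffer is shorter than the
  // assumed 12-byte encoding. If no caller catches this, the runtime
  // calls std::terminate(), which by default calls abort().
  void decode(const std::vector<uint8_t>& buf) {
    if (buf.size() < sizeof(version) + sizeof(block_size)) {
      throw std::out_of_range("toy_super_t::decode: buffer too short");
    }
    std::memcpy(&version, buf.data(), sizeof(version));
    std::memcpy(&block_size, buf.data() + sizeof(version), sizeof(block_size));
  }
};

// Hypothetical wrapper: returns 0 on success, or -EIO when the bytes
// cannot be decoded, so the caller can fail the mount and retry.
int open_super_checked(const std::vector<uint8_t>& buf, toy_super_t* out) {
  assert(out != nullptr);
  try {
    out->decode(buf);
  } catch (const std::exception&) {
    return -EIO;
  }
  return 0;
}
```

With this sketch, a 12-byte buffer decodes and returns 0, while a 4-byte buffer returns -EIO rather than aborting the process. Whether the real mount path should handle a bad superblock this way is a separate question that this issue does not settle.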
Updated by Greg Farnum almost 7 years ago
- Project changed from Ceph to RADOS
- Category set to Correctness/Safety
- Component(RADOS) BlueStore added
Updated by Greg Farnum over 6 years ago
- Project changed from RADOS to bluestore
- Category deleted (Correctness/Safety)
Updated by Sage Weil over 6 years ago
- Status changed from New to Can't reproduce
Updated by Radoslaw Zarzynski over 5 years ago
- Related to Bug #25098: Bluestore OSD failed to start with `bluefs_types.h: 54: FAILED assert(pos <= end)` added