Bug #45985

closed

crimson-osd mkfs segmentation fault on shard 0

Added by Honggang Yang almost 4 years ago. Updated about 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I want to set up a cluster with the following command:

# MGR=1 MON=1 OSD=1 MDS=0 RGW=0 bash -x ../src/vstart.sh -n -X --without-dashboard -b --crimson --nodaemon --redirect-output
...
+ /data/work/ceph-master/build/bin/crimson-osd -i 0 -c /data/work/ceph-master/build/ceph.conf --mkfs --key AQCtLONeMlkHLhAAPMD34Rd3QwcCOvuetC7ydw== --osd-uuid 0d0d3069-8bda-4e87-af0c-bb25d039a00d --smp 1 --cpuset 0
WARN  2020-06-12 15:20:15,920 [shard 0] seastar - Unable to set SCHED_FIFO scheduling policy for timer thread; latency impact possible. Try adding CAP_SYS_NICE
WARN  2020-06-12 15:20:15,934 [shard 0] osd - _load_class could not open class /data/work/ceph-master/build/lib/libcls_queue.so (dlopen failed): /data/work/ceph-master/build/lib/libcls_queue.so: undefined symbol: _Z18cls_cxx_write_zeroPvii
WARN  2020-06-12 15:20:15,934 [shard 0] osd - OSD warning: got an error loading one or more classes: (5) Input/output error
Segmentation fault on shard 0.
Backtrace:
  0x0000000001426048
  0x00000000013ecc6f
  0x00000000013ecd63
  0x00000000013ecdd5
  /lib64/libpthread.so.0+0x000000000000f5ef
  0x00000000013d60bd
  0x0000000001404c97
  0x00000000013e98a8
  0x00000000013e9bf3
  0x0000000001416835
  0x00000000013c0214
  0x00000000008a297a
  /lib64/libc.so.6+0x0000000000022504
  0x00000000008ece76
../src/vstart.sh: line 829: 3195510 Segmentation fault      (core dumped) $CEPH_BIN/$ceph_osd $extra_osd_args -i $osd $ARGS --mkfs --key $OSD_SECRET --osd-uuid $uuid $extra_seastar_args

The backtrace, decoded with seastar-addr2line, is:

# ../src/seastar/scripts/seastar-addr2line -e bin/crimson-osd
bin/crimson-osd: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=0d8d92bb77ab460e44d0eaa036bad31fd97d1580, not stripped

  0x0000000001426048
  0x00000000013ecc6f
  0x00000000013ecd63
  0x00000000013ecdd5
  /lib64/libpthread.so.0+0x000000000000f5ef
  0x00000000013d60bd
  0x0000000001404c97
  0x00000000013e98a8
  0x00000000013e9bf3
  0x0000000001416835
  0x00000000013c0214
  0x00000000008a297a
  /lib64/libc.so.6+0x0000000000022504
  0x00000000008ece76

[Backtrace #0]
void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /data/work/ceph-master/src/seastar/include/seastar/util/backtrace.hh:56
seastar::backtrace_buffer::append_backtrace() at /data/work/ceph-master/src/seastar/src/core/reactor.cc:741
 (inlined by) print_with_backtrace at /data/work/ceph-master/src/seastar/src/core/reactor.cc:762
seastar::print_with_backtrace(char const*) at /data/work/ceph-master/src/seastar/src/core/reactor.cc:769
sigsegv_action at /data/work/ceph-master/src/seastar/src/core/reactor.cc:3469
 (inlined by) operator() at /data/work/ceph-master/src/seastar/src/core/reactor.cc:3455
 (inlined by) _FUN at /data/work/ceph-master/src/seastar/src/core/reactor.cc:3451
_L_unlock_13 at funlockfile.c:?
std::__atomic_base<seastar::memory::cross_cpu_free_item*>::load(std::memory_order) const at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/atomic_base.h:713
 (inlined by) std::atomic<seastar::memory::cross_cpu_free_item*>::load(std::memory_order) const at /opt/rh/devtoolset-8/root/usr/include/c++/8/atomic:452
 (inlined by) seastar::memory::cpu_pages::free_cross_cpu(unsigned int, void*) at /data/work/ceph-master/src/seastar/src/core/memory.cc:803
 (inlined by) seastar::memory::cpu_pages::free_cross_cpu(unsigned int, void*) at /data/work/ceph-master/src/seastar/src/core/memory.cc:795
seastar::continuation<seastar::internal::promise_base_with_type<unsigned long>, seastar::future<>::then_impl_nrvo<seastar::readable_eventfd::wait()::{lambda()#1}, seastar::future<unsigned long> >(seastar::readable_eventfd::wait()::{lambda()#1}&&)::{lambda()#1}::operator()() const::{lambda(seastar::internal::promise_base_with_type<unsigned long>&, seastar::future_state<>&&)#1}>::run_and_dispose() at /data/work/ceph-master/src/seastar/include/seastar/core/future.hh:505
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /data/work/ceph-master/src/seastar/src/core/reactor.cc:2151
seastar::reactor::run_some_tasks() at /data/work/ceph-master/src/seastar/src/core/reactor.cc:2566
seastar::reactor::run_some_tasks() at /data/work/ceph-master/src/seastar/src/core/reactor.cc:2549
 (inlined by) seastar::reactor::run() at /data/work/ceph-master/src/seastar/src/core/reactor.cc:2721
seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at /data/work/ceph-master/src/seastar/src/core/app-template.cc:199
main at /data/work/ceph-master/src/crimson/osd/main.cc:148
__libc_start_main at ??:?
_start at ??:?

Backtrace from gdb:

Core was generated by `/data/work/ceph-master/build/bin/crimson-osd -i 0 -c /data/work/ceph-master/bui'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000013d60be in load (__m=std::memory_order_relaxed, this=<optimized out>) at /data/work/ceph-master/src/seastar/src/core/memory.cc:803
803        auto old = list.load(std::memory_order_relaxed);
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 cryptopp-5.6.2-10.el7.x86_64 elfutils-libelf-0.168-8.el7.x86_64 elfutils-libs-0.168-8.el7.x86_64 glibc-2.17-292.el7.x86_64 gmp-6.0.0-15.el7.x86_64 gnutls-3.3.29-9.el7_6.x86_64 leveldb-1.12.0-11.el7.x86_64 libaio-0.3.109-13.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libblkid-2.23.2-63.el7.x86_64 libcap-2.22-9.el7.x86_64 libffi-3.0.13-19.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libstdc++-4.8.5-39.el7.x86_64 libtasn1-4.10-1.el7.x86_64 libuuid-2.23.2-63.el7.x86_64 lksctp-tools-1.0.17-2.el7.x86_64 lttng-ust-2.4.1-4.el7.x86_64 lz4-1.7.5-3.el7.x86_64 nettle-2.7.1-8.el7.x86_64 numactl-libs-2.0.12-5.el7.x86_64 openssl-libs-1.0.2k-8.el7.x86_64 p11-kit-0.23.5-3.el7.x86_64 protobuf-2.5.0-8.el7.x86_64 snappy-1.1.0-3.el7.x86_64 systemd-libs-219-73.el7_8.5.x86_64 userspace-rcu-0.7.16-1.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 yaml-cpp-0.5.1-2.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  0x00000000013d60be in load (__m=std::memory_order_relaxed, this=<optimized out>) at /data/work/ceph-master/src/seastar/src/core/memory.cc:803
#1  load (__m=std::memory_order_relaxed, this=<optimized out>) at /opt/rh/devtoolset-8/root/usr/include/c++/8/atomic:452
#2  free_cross_cpu (this=0x7f65aad58300, ptr=0x602b7e0a2af0, cpu_id=<optimized out>) at /data/work/ceph-master/src/seastar/src/core/memory.cc:803
#3  seastar::memory::cpu_pages::free_cross_cpu (this=0x7f65aad58300, cpu_id=2, ptr=0x602b7e0a2af0) at /data/work/ceph-master/src/seastar/src/core/memory.cc:795
#4  0x0000000001404c98 in seastar::continuation<seastar::internal::promise_base_with_type<unsigned long>, seastar::future<unsigned long> seastar::future<>::then_impl_nrvo<seastar::readable_eventfd::wait()::{lambda()#1}, seastar::future<unsigned long> >(seastar::readable_eventfd::wait()::{lambda()#1}&&)::{lambda()#1}::operator()() const::{lambda(seastar::internal::promise_base_with_type<unsigned long>&, seastar::future_state<>&&)#1}>::run_and_dispose() () at /data/work/ceph-master/src/seastar/include/seastar/core/future.hh:337
#5  0x00000000013e98a9 in seastar::reactor::run_tasks (this=this@entry=0x600000496000, tq=...) at /data/work/ceph-master/src/seastar/src/core/reactor.cc:2151
#6  0x00000000013e9bf4 in seastar::reactor::run_some_tasks (this=this@entry=0x600000496000) at /data/work/ceph-master/src/seastar/src/core/reactor.cc:2566
#7  0x0000000001416836 in run_some_tasks (this=0x600000496000) at /data/work/ceph-master/src/seastar/src/core/reactor.cc:2721
#8  seastar::reactor::run() () at /data/work/ceph-master/src/seastar/src/core/reactor.cc:2721
#9  0x00000000013c0215 in seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) () at /data/work/ceph-master/src/seastar/include/seastar/core/reactor.hh:736
#10 0x00000000008a297b in main () at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_function.h:87
#11 0x00007f65a7223505 in __libc_start_main () from /lib64/libc.so.6
#12 0x00000000008ece77 in _start () at /opt/rh/devtoolset-8/root/usr/include/c++/8/system_error:205
Actions #1

Updated by Kefu Chai almost 4 years ago

It's a known issue. I'd suggest using -DCMAKE_BUILD_TYPE=Debug or -DCMAKE_BUILD_TYPE=Sanitize when using the alien bluestore backend.

Actions #2

Updated by Kefu Chai almost 4 years ago

https://github.com/ceph/ceph/pull/35561 was created to disable the combination of bluestore + builtin-allocator.

Actions #3

Updated by Kefu Chai almost 4 years ago

  • Project changed from RADOS to crimson
Actions #4

Updated by Kefu Chai almost 4 years ago

  • Status changed from New to Triaged
Actions #6

Updated by Kefu Chai about 3 years ago

  • Status changed from Triaged to Resolved

This should have been fixed, as we've picked up a fix in Seastar which now allows us to use the libc allocator in alien threads along with Seastar's lockless allocator in Seastar's reactors.
