Bug #36250
closed
ceph-osd process crashing
Description
The ceph-osd process crashes in a msgr-worker thread. This happens on all OSDs in the cluster, roughly once per day at peak frequency. It seems to happen more often during evening and overnight hours, when the cluster is under more load. Originally posted on the ceph-users mailing list: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-September/030040.html
Version 12.2.2
From the log:
Sep 28 00:30:10 sn02 ceph-osd[192103]: 2018-09-28 00:30:10.399237 7fb5031f6700 -1 *** Caught signal (Aborted) **
in thread 7fb5031f6700 thread_name:msgr-worker-0
Stack:
#0 0x00007f9e738764ab in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1 0x000055925e1edab6 in reraise_fatal (signum=6) at /usr/src/debug/ceph-12.2.2/src/global/signal_handler.cc:74
#2 handle_fatal_signal (signum=6) at /usr/src/debug/ceph-12.2.2/src/global/signal_handler.cc:138
#3 <signal handler called>
#4 0x00007f9e7289f1f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#5 0x00007f9e728a08e8 in __GI_abort () at abort.c:90
#6 0x00007f9e731a5ac5 in __gnu_cxx::__verbose_terminate_handler () at ../../../../libstdc++-v3/libsupc++/vterminate.cc:95
#7 0x00007f9e731a3a36 in __cxxabiv1::__terminate (handler=<optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:38
#8 0x00007f9e731a3a63 in std::terminate () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
#9 0x00007f9e731fa345 in std::(anonymous namespace)::execute_native_thread_routine (__p=<optimized out>) at ../../../../../libstdc++-v3/src/c++11/thread.cc:92
#10 0x00007f9e7386ee25 in start_thread (arg=0x7f9e6ff94700) at pthread_create.c:308
#11 0x00007f9e7296234d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
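Frames #5 to #9 show std::terminate being called directly from libstdc++'s execute_native_thread_routine, which is consistent with an exception escaping the thread's entry function. Which exception is thrown inside ceph-osd is not identified in this report. As a minimal sketch of how such an exception could be captured before the thread terminates, the following catches it inside the thread and returns its message. run_guarded and faulty_work are hypothetical names used for illustration only; this is not Ceph code.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical stand-in for work done on a worker thread (not Ceph code).
void faulty_work() {
    throw std::runtime_error("simulated failure");
}

// Runs fn on a new std::thread. Returns the message of any exception that
// fn throws, or an empty string if fn completes normally.
template <typename Fn>
std::string run_guarded(Fn fn) {
    std::exception_ptr err;  // written by the worker, read only after join()
    std::thread worker([&] {
        try {
            fn();
        } catch (...) {
            // Without this handler the exception would leave the thread
            // function and the runtime would call std::terminate().
            err = std::current_exception();
        }
    });
    worker.join();

    if (!err) {
        return "";
    }
    try {
        std::rethrow_exception(err);
    } catch (const std::exception& e) {
        return e.what();
    } catch (...) {
        return "unknown exception";
    }
}
```

With this sketch, run_guarded(faulty_work) returns "simulated failure" instead of aborting the process. Without the inner try/catch, the same throw would end in std::terminate and SIGABRT, as in the trace above.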
I've uploaded the log from a process that crashed; debug_ms had been set to 5 long before the crash happened. ID from ceph-post-file: 83aa1468-7dc5-401a-82fd-22c344322efe