Bug #532
OSD: repop_queue.front() == repop
Status: Closed
Priority: Normal
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Description
On two of my OSDs I had the following crash:
Core was generated by `/usr/bin/cosd -i 3 -c /etc/ceph/ceph.conf'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00000000005d97c1 in sigabrt_handler (signum=6) at config.cc:238
#2  <signal handler called>
#3  0x00007fce0c446a75 in raise () from /lib/libc.so.6
#4  0x00007fce0c44a5c0 in abort () from /lib/libc.so.6
#5  0x00007fce0ccfc8e5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#6  0x00007fce0ccfad16 in ?? () from /usr/lib/libstdc++.so.6
#7  0x00007fce0ccfad43 in std::terminate() () from /usr/lib/libstdc++.so.6
#8  0x00007fce0ccfae3e in __cxa_throw () from /usr/lib/libstdc++.so.6
#9  0x00000000005c7098 in ceph::__ceph_assert_fail (assertion=0x5f2e03 "repop_queue.front() == repop", file=<value optimized out>, line=2024, func=<value optimized out>) at common/assert.cc:30
#10 0x0000000000479e72 in ReplicatedPG::eval_repop (this=0x2585700, repop=0x2e80d20) at osd/ReplicatedPG.cc:2024
#11 0x000000000047ccda in ReplicatedPG::op_applied (this=0x2585700, repop=0x2e80d20) at osd/ReplicatedPG.cc:1914
#12 0x00000000004b7a61 in C_OSD_OpApplied::finish(int) ()
#13 0x00000000005c60b8 in Finisher::finisher_thread_entry (this=0xe125f8) at common/Finisher.cc:54
#14 0x000000000046e73a in Thread::_entry_func (arg=0x5125) at ./common/Thread.h:39
#15 0x00007fce0d5419ca in start_thread () from /lib/libpthread.so.0
#16 0x00007fce0c4f96fd in clone () from /lib/libc.so.6
#17 0x0000000000000000 in ?? ()
osd5 (node06) also went down, with a message about the repop_queue.front() == repop assertion.
I have no clue what could have triggered this. The cluster had just been freshly created with mkcephfs, so I have no idea how to reproduce it.
I used cdebugpack to gather the relevant information; both packs have been uploaded to logger.ceph.widodh.nl:/srv/ceph/issues/osd_crash_repop_queue
Restarting the OSDs goes fine; they don't crash again.
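As a reading aid for the assertion text in frame #9: taken literally, repop_queue.front() == repop says that the repop being evaluated was expected to be the oldest entry in repop_queue. The Python sketch below is only a toy model of that kind of FIFO invariant, not Ceph's code. ToyPG, RepOp, submit, finish and demo are hypothetical names invented for illustration, and the sketch makes no claim about what actually triggered the crash on these OSDs.

```python
from collections import deque


class RepOp:
    """Hypothetical stand-in for a replicated operation (not a Ceph type)."""

    def __init__(self, tid):
        self.tid = tid


class ToyPG:
    """Toy model of a queue whose entries must be retired oldest-first."""

    def __init__(self):
        self.repop_queue = deque()

    def submit(self, tid):
        # New operations are appended at the back of the queue.
        repop = RepOp(tid)
        self.repop_queue.append(repop)
        return repop

    def finish(self, repop):
        # Same shape as the failed check: the operation being retired
        # must be the one at the front of the queue.
        if self.repop_queue[0] is not repop:
            raise AssertionError("repop_queue.front() == repop")
        self.repop_queue.popleft()


def demo():
    """Retire the newer of two queued operations first and report the failure."""
    pg = ToyPG()
    pg.submit(1)
    second = pg.submit(2)
    try:
        pg.finish(second)  # the older operation is still at the front
    except AssertionError as err:
        return str(err)
    return "no failure"
```

In this toy model, calling demo() reports the violated check, while retiring the operations in submission order drains the queue without error.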