Bug #532

closed

OSD: repop_queue.front() == repop

Added by Wido den Hollander over 13 years ago. Updated over 13 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: OSD
Target version: -
% Done: 0%


Description

Two of my OSDs hit the following crash:

Core was generated by `/usr/bin/cosd -i 3 -c /etc/ceph/ceph.conf'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00000000005d97c1 in sigabrt_handler (signum=6) at config.cc:238
#2  <signal handler called>
#3  0x00007fce0c446a75 in raise () from /lib/libc.so.6
#4  0x00007fce0c44a5c0 in abort () from /lib/libc.so.6
#5  0x00007fce0ccfc8e5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#6  0x00007fce0ccfad16 in ?? () from /usr/lib/libstdc++.so.6
#7  0x00007fce0ccfad43 in std::terminate() () from /usr/lib/libstdc++.so.6
#8  0x00007fce0ccfae3e in __cxa_throw () from /usr/lib/libstdc++.so.6
#9  0x00000000005c7098 in ceph::__ceph_assert_fail (assertion=0x5f2e03 "repop_queue.front() == repop", 
    file=<value optimized out>, line=2024, func=<value optimized out>) at common/assert.cc:30
#10 0x0000000000479e72 in ReplicatedPG::eval_repop (this=0x2585700, repop=0x2e80d20) at osd/ReplicatedPG.cc:2024
#11 0x000000000047ccda in ReplicatedPG::op_applied (this=0x2585700, repop=0x2e80d20) at osd/ReplicatedPG.cc:1914
#12 0x00000000004b7a61 in C_OSD_OpApplied::finish(int) ()
#13 0x00000000005c60b8 in Finisher::finisher_thread_entry (this=0xe125f8) at common/Finisher.cc:54
#14 0x000000000046e73a in Thread::_entry_func (arg=0x5125) at ./common/Thread.h:39
#15 0x00007fce0d5419ca in start_thread () from /lib/libpthread.so.0
#16 0x00007fce0c4f96fd in clone () from /lib/libc.so.6
#17 0x0000000000000000 in ?? ()

osd5 (node06) also went down with a message about repop_queue.front() == repop.
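
For readers unfamiliar with this kind of check, here is a minimal sketch of what an assertion shaped like repop_queue.front() == repop typically guards: operations are queued in submission order, and a finished operation is expected to be the oldest one still in the queue. This is an illustration only, not the code in osd/ReplicatedPG.cc. The Repop and RepopQueue types and their methods are hypothetical names, and throwing std::logic_error stands in for the assert failure seen in frames #8 to #10 of the backtrace.

```cpp
#include <cstddef>
#include <deque>
#include <stdexcept>

// Hypothetical stand-in for one in-flight replicated operation.
// This is NOT Ceph's real repop type.
struct Repop {
    int id;
    bool applied;
    explicit Repop(int i) : id(i), applied(false) {}
};

// Hypothetical queue that retires operations strictly in submission order.
class RepopQueue {
public:
    // Record a newly submitted operation at the back of the queue.
    void submit(Repop* r) { queue_.push_back(r); }

    // Called when an operation has finished. The finished operation must be
    // the oldest one still queued; otherwise the invariant is violated and
    // an exception is thrown (standing in for the failed assert).
    void complete(Repop* r) {
        r->applied = true;
        if (queue_.empty() || queue_.front() != r) {
            throw std::logic_error("repop_queue.front() == repop");
        }
        queue_.pop_front();
    }

    // Number of operations that have not been retired yet.
    std::size_t pending() const { return queue_.size(); }

private:
    std::deque<Repop*> queue_;
};
```

In this sketch, completing the second submitted operation before the first one throws, which is the out-of-order situation the assertion text suggests. Whether that is what actually happened on these OSDs cannot be told from the backtrace alone.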

I don't know what triggered this. The cluster had just been freshly created with mkcephfs, so I have no idea how to reproduce it.

I've used cdebugpack to gather the relevant information; both packs have been uploaded to logger.ceph.widodh.nl:/srv/ceph/issues/osd_crash_repop_queue

Restarting the OSDs works fine; they have not crashed again.
