Bug #339 (closed): OSD crash: ReplicatedPG::sub_op_modify

Added by Wido den Hollander over 13 years ago. Updated about 13 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
Category: OSD
Target version: -
% Done: 0%

Description

Two OSDs (osd4 and osd5) were killed by the OOM killer. After restarting both, one crashed with the following message:

Core was generated by `/usr/bin/cosd -i 5 -c /etc/ceph/ceph.conf'.
Program terminated with signal 6, Aborted.
#0  0x00007f9bdc464a75 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00007f9bdc464a75 in raise () from /lib/libc.so.6
#1  0x00007f9bdc4685c0 in abort () from /lib/libc.so.6
#2  0x00007f9bdcd198e5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#3  0x00007f9bdcd17d16 in ?? () from /usr/lib/libstdc++.so.6
#4  0x00007f9bdcd17d43 in std::terminate() () from /usr/lib/libstdc++.so.6
#5  0x00007f9bdcd17e3e in __cxa_throw () from /usr/lib/libstdc++.so.6
#6  0x00000000005c02b8 in ceph::__ceph_assert_fail (assertion=0x5eb240 "!missing.is_missing(soid)", 
    file=<value optimized out>, line=2776, func=<value optimized out>) at common/assert.cc:30
#7  0x00000000004905e9 in ReplicatedPG::sub_op_modify (this=<value optimized out>, op=0x7f9bc402a810)
    at osd/ReplicatedPG.cc:2776
#8  0x00000000004d94a4 in OSD::dequeue_op (this=0xe64120, pg=0xfc8ba0) at osd/OSD.cc:4740
#9  0x00000000005c097f in ThreadPool::worker (this=0xe64600) at common/WorkQueue.cc:44
#10 0x00000000004f89ad in ThreadPool::WorkThread::entry() ()
#11 0x000000000046d32a in Thread::_entry_func (arg=0x4cc0) at ./common/Thread.h:39
#12 0x00007f9bdd2f79ca in start_thread () from /lib/libpthread.so.0
#13 0x00007f9bdc5176cd in clone () from /lib/libc.so.6
#14 0x0000000000000000 in ?? ()
(gdb) 
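
For context, the failed assertion ("!missing.is_missing(soid)" at ReplicatedPG.cc:2776) guards sub_op_modify on the replica side: a replica should never be asked to apply a modification to an object it still has marked as missing (i.e. not yet recovered). Below is a minimal, self-contained sketch of that pattern; MissingSet, the handler, and the object names are simplified illustrations, not Ceph's actual types:

#include <cassert>
#include <set>
#include <string>

// Stand-in for the PG's "missing" set: objects this replica knows it
// does not yet have an up-to-date copy of.
struct MissingSet {
    std::set<std::string> objects;
    bool is_missing(const std::string& soid) const {
        return objects.count(soid) > 0;
    }
};

// Stand-in for sub_op_modify: the handler asserts the target object is
// not in the missing set before applying the update. The crash above
// means a sub-op arrived for an object still marked missing.
void sub_op_modify(const MissingSet& missing, const std::string& soid) {
    assert(!missing.is_missing(soid));
    // ... apply the replicated update ...
}

int main() {
    MissingSet missing;
    missing.objects.insert("objA");  // replica still recovering objA
    sub_op_modify(missing, "objB");  // fine: objB is not missing
    sub_op_modify(missing, "objA");  // aborts, like the backtrace above
    return 0;
}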

During this the cluster was degraded, since the OSDs had been down for some time.

I've uploaded the logs, core, and binary to logger.ceph.widodh.nl under /srv/ceph/issues/osd_crash_ReplicatedPG_sub_op_modify.

After this crash I tried to start the OSD again with a higher log level (20), but it didn't crash again.
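
For reference, a sketch of how the OSD debug level can be raised via ceph.conf; I'm assuming it was set there rather than on the cosd command line, and the [osd] section applies to all OSDs on the host:

[osd]
    debug osd = 20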

