Bug #38124
closed
OSD down on snaptrim.
Description
All OSDs in the Ceph cluster crash when Ceph runs snaptrim.
The error an OSD logs before crashing is:
2019-01-31 10:46:01.310 7fbb2fd45700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.4/rpm/el7/BUILD/ceph-13.2.4/src/osd/PrimaryLogPG.h: In function 'PrimaryLogPG::Trimming::Trimming(boost::statechart::state<PrimaryLogPG::Trimming, PrimaryLogPG::SnapTrimmer, PrimaryLogPG::WaitReservation>::my_context)' thread 7fbb2fd45700 time 2019-01-31 10:46:01.306356
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.4/rpm/el7/BUILD/ceph-13.2.4/src/osd/PrimaryLogPG.h: 1571: FAILED assert(context< SnapTrimmer >().can_trim())
ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0xff) [0x7fbb59f3716f]
2: (()+0x25a337) [0x7fbb59f37337]
3: (PrimaryLogPG::NotTrimming::react(PrimaryLogPG::KickTrim const&)+0x783) [0x559a606efbc3]
4: (boost::statechart::simple_state<PrimaryLogPG::NotTrimming, PrimaryLogPG::SnapTrimmer, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na
, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0xa9) [0x559a6073b629]
5: (boost::statechart::state_machine<PrimaryLogPG::SnapTrimmer, PrimaryLogPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0xb3) [0x559a60715f23]
6: (boost::statechart::state_machine<PrimaryLogPG::SnapTrimmer, PrimaryLogPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x87) [0x559a60716187]
7: (_ZZN12PrimaryLogPG13WaitTrimTimerC4EN5boost10statechart5stateIS0_NS_8TrimmingENS1_3mpl4listIN4mpl_2naES8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_EELNS2_12history_modeE0EE10my_contextEEN7OnTimer6finishEi()+0xb2) [0x559a607163a2]
8: (Context::complete(int)+0x9) [0x559a60580e49]
9: (SafeTimer::timer_thread()+0x18b) [0x7fbb59f33a8b]
10: (SafeTimerThread::entry()+0xd) [0x7fbb59f3504d]
11: (()+0x7dd5) [0x7fbb56ab8dd5]
12: (clone()+0x6d) [0x7fbb55ba8ead]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
The rest of the log is in the attached file.
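For readers unfamiliar with the snap trimmer state machine, the control path implied by the backtrace can be sketched as follows. This is a minimal model and not Ceph code: it uses no Boost.Statechart, `SnapTrimmerSketch`, `enter_trimming`, `react_kick_trim` and `on_wait_trim_timer` are hypothetical names, and the real conditions behind `can_trim()` are not shown in this report, so a single flag stands in for them. The sketch throws an exception where the OSD would abort on the failed assert.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for PrimaryLogPG::SnapTrimmer. The real can_trim()
// inspects PG state; those conditions are not visible in the report, so one
// flag models the result.
struct SnapTrimmerSketch {
    bool trimmable = false;             // assumed outcome of can_trim()
    std::string state = "NotTrimming";  // current state name
    bool can_trim() const { return trimmable; }
};

// Models entry into the Trimming state. The real constructor runs
// assert(context< SnapTrimmer >().can_trim()) and aborts the OSD on failure;
// this sketch throws instead so the failure can be observed.
void enter_trimming(SnapTrimmerSketch& m) {
    if (!m.can_trim()) {
        throw std::logic_error(
            "FAILED assert(context< SnapTrimmer >().can_trim())");
    }
    m.state = "Trimming";
}

// Models frame 3, NotTrimming::react(KickTrim). Any guard the real react()
// applies before the transition is not visible in the trace, so none is
// modelled here.
void react_kick_trim(SnapTrimmerSketch& m) {
    if (m.state == "NotTrimming") {
        enter_trimming(m);
    }
}

// Models frames 5 to 7: the timer callback delivers KickTrim to the machine.
void on_wait_trim_timer(SnapTrimmerSketch& m) {
    react_kick_trim(m);
}
```

Calling `on_wait_trim_timer` on a machine whose `trimmable` flag is false reproduces the shape of the failure: the timer fires, the event reaches `NotTrimming`, and the precondition of `Trimming` no longer holds by the time the state is entered.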
Files