Bug #38124

closed

OSD down on snaptrim.

Added by Darius Kasparavičius over 5 years ago. Updated almost 5 years ago.

Status: Resolved
Priority: High
Assignee: David Zafman
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: mimic, nautilus
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

All OSDs in the Ceph cluster crash when Ceph runs snaptrim.

The error the OSD logs before crashing is:

2019-01-31 10:46:01.310 7fbb2fd45700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.4/rpm/el7/BUILD/ceph-13.2.4/src/osd/PrimaryLogPG.h: In function 'PrimaryLogPG::Trimming::Trimming(boost::statechart::state<PrimaryLogPG::Trimming, PrimaryLogPG::SnapTrimmer, PrimaryLogPG::WaitReservation>::my_context)' thread 7fbb2fd45700 time 2019-01-31 10:46:01.306356
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.4/rpm/el7/BUILD/ceph-13.2.4/src/osd/PrimaryLogPG.h: 1571: FAILED assert(context< SnapTrimmer >().can_trim())

ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0xff) [0x7fbb59f3716f]
2: (()+0x25a337) [0x7fbb59f37337]
3: (PrimaryLogPG::NotTrimming::react(PrimaryLogPG::KickTrim const&)+0x783) [0x559a606efbc3]
4: (boost::statechart::simple_state<PrimaryLogPG::NotTrimming, PrimaryLogPG::SnapTrimmer, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0xa9) [0x559a6073b629]
5: (boost::statechart::state_machine<PrimaryLogPG::SnapTrimmer, PrimaryLogPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0xb3) [0x559a60715f23]
6: (boost::statechart::state_machine<PrimaryLogPG::SnapTrimmer, PrimaryLogPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x87) [0x559a60716187]
7: (_ZZN12PrimaryLogPG13WaitTrimTimerC4EN5boost10statechart5stateIS0_NS_8TrimmingENS1_3mpl4listIN4mpl_2naES8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_S8_EELNS2_12history_modeE0EE10my_contextEEN7OnTimer6finishEi()+0xb2) [0x559a607163a2]
8: (Context::complete(int)+0x9) [0x559a60580e49]
9: (SafeTimer::timer_thread()+0x18b) [0x7fbb59f33a8b]
10: (SafeTimerThread::entry()+0xd) [0x7fbb59f3504d]
11: (()+0x7dd5) [0x7fbb56ab8dd5]
12: (clone()+0x6d) [0x7fbb55ba8ead]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

The rest of the log is in the attached file.
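
For triage across many OSD logs, the numbered backtrace frames can be extracted with a small parser. This is a minimal sketch, assuming frame lines keep the `N: (symbol+0xOFFSET) [0xADDRESS]` shape seen in the trace above; `Frame`, `parse_frame`, and `parse_frames` are hypothetical names for illustration, not part of Ceph.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed frame shape, taken from the trace in this report:
#   N: (symbol+0xOFFSET) [0xADDRESS]
# The greedy (.*) makes the pattern split on the LAST "+0x" before ") [",
# so symbols that themselves contain parentheses are kept whole.
FRAME_RE = re.compile(
    r"^\s*(\d+): \((.*)\+0x([0-9a-f]+)\) \[0x([0-9a-f]+)\]\s*$"
)


class Frame(NamedTuple):
    index: int    # frame number as printed in the log
    symbol: str   # demangled or mangled symbol text, "()" when unresolved
    offset: int   # offset inside the symbol
    address: int  # absolute return address


def parse_frame(line: str) -> Optional[Frame]:
    """Return a Frame for a backtrace line, or None for any other line."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    index, symbol, offset, address = m.groups()
    return Frame(int(index), symbol, int(offset, 16), int(address, 16))


def parse_frames(text: str) -> List[Frame]:
    """Collect every backtrace frame found in a block of log text."""
    frames = (parse_frame(line) for line in text.splitlines())
    return [f for f in frames if f is not None]


# Three frames copied from the trace in this report.
sample = """\
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0xff) [0x7fbb59f3716f]
2: (()+0x25a337) [0x7fbb59f37337]
3: (PrimaryLogPG::NotTrimming::react(PrimaryLogPG::KickTrim const&)+0x783) [0x559a606efbc3]
"""

for frame in parse_frames(sample):
    print(frame.index, frame.symbol, hex(frame.address))
```

Lines that do not match the assumed shape, such as the `NOTE:` line or the assert message, are skipped rather than raising, so the whole log file can be passed in unfiltered.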


Files

ceph-log.zip (282 KB), Darius Kasparavičius, 01/31/2019 11:58 AM
ceph-osd.tar.gz (754 KB), Darius Kasparavičius, 02/04/2019 10:19 PM

Related issues 2 (0 open, 2 closed)

Copied to RADOS - Backport #39698: mimic: OSD down on snaptrim. (Resolved, David Zafman)
Copied to RADOS - Backport #39699: nautilus: OSD down on snaptrim. (Resolved, David Zafman)