Bug #15241

closed

osd/osd_types.cc: 3106: FAILED assert(i.first <= i.last)

Added by Samuel Just about 8 years ago. Updated about 8 years ago.

Status: Resolved
Priority: Immediate
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

osd/osd_types.cc: 3106: FAILED assert(i.first <= i.last)

ceph version 10.0.5-2638-g7972c10 (7972c105f25fab739e5b0d58bab36837a94ec86e)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f0a74bc94cb]
2: (pg_interval_t::check_new_interval(int, int, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, int, int, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, unsigned int, unsigned int, std::shared_ptr<OSDMap const>, std::shared_ptr<OSDMap const>, pg_t, IsPGRecoverablePredicate*, std::map<unsigned int, pg_interval_t, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, pg_interval_t> > >, std::ostream)+0x786) [0x7f0a748057c6]
3: (PG::start_peering_interval(std::shared_ptr<OSDMap const>, std::vector<int, std::allocator<int> > const&, int, std::vector<int, std::allocator<int> > const&, int, ObjectStore::Transaction*)+0x404) [0x7f0a7464edf4]
4: (PG::RecoveryState::Reset::react(PG::AdvMap const&)+0x4f0) [0x7f0a74650230]
5: (boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x1fc) [0x7f0a74683cbc]
6: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x7f0a7466af8b]
7: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0xd8) [0x7f0a7466b0f8]
8: (PG::handle_advance_map(std::shared_ptr<OSDMap const>, std::shared_ptr<OSDMap const>, std::vector<int, std::allocator<int> >&, int, std::vector<int, std::allocator<int> >&, int, PG::RecoveryCtx*)+0x499) [0x7f0a74644d89]
9: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&, PG::RecoveryCtx*, std::set<boost::intrusive_ptr<PG>, std::less<boost::intrusive_ptr<PG> >, std::allocator<boost::intrusive_ptr<PG> > >)+0x2b9) [0x7f0a7457f799]
10: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x1a1) [0x7f0a74593111]
11: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x12) [0x7f0a745db7e2]
12: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) [0x7f0a74bbaa2e]
13: (ThreadPool::WorkThread::entry()+0x10) [0x7f0a74bbb910]
14: (()+0x8182) [0x7f0a72e8f182]
15: (clone()+0x6d) [0x7f0a70fbd47d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

sjust@teuthology:/a/samuelj-2016-03-21_13:58:56-rados-wip-sam-testing-distro-basic-smithi/78333/remote

Introduced with 17f810573c39a243a4175912fd6c42ebe9ceba41, I think.

The problem seems to be that normally history and orig_history in handle_pg_peering_evt must have the same same_interval_since (or we'd have bailed). The new same_primary param means history and orig_history may have different same_interval_since. check_new_interval then trips over itself since same_interval_since is already set ahead of our current map.
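
The failure mode described above can be illustrated with a minimal C++ sketch. This is not the Ceph code: `Interval`, `make_closed_interval`, and `interval_is_sane` are hypothetical names, and the rule that a closed interval ends one epoch before the new map is an assumption made for illustration. The sketch only shows why a same_interval_since that is already ahead of the current map produces first > last.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for pg_interval_t: a closed epoch range.
struct Interval {
  unsigned first;
  unsigned last;
};

// Assumption for illustration: when a map at new_epoch starts a new interval,
// the previous interval runs from same_interval_since to new_epoch - 1.
// Precondition: new_epoch >= 1 (avoids unsigned wrap-around).
Interval make_closed_interval(unsigned same_interval_since, unsigned new_epoch) {
  return Interval{same_interval_since, new_epoch - 1};
}

// Mirrors the condition in the failed assert: i.first <= i.last.
bool interval_is_sane(const Interval& i) {
  return i.first <= i.last;
}
```

With same_interval_since behind the map being processed (for example 10 and 15) the interval is 10..14 and the check holds. With same_interval_since already ahead of that map (for example 20 and 15) the interval is 20..14 and the check fails, which corresponds to the assert in the trace.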

#1 Updated by Samuel Just about 8 years ago
  • Priority changed from Normal to Urgent

#2 Updated by Samuel Just about 8 years ago
  • Status changed from In Progress to 7

#3 Updated by Sage Weil about 8 years ago
  • Priority changed from Urgent to Immediate

#4 Updated by Sage Weil about 8 years ago
  • Status changed from 7 to Resolved