Bug #15241
closed
osd/osd_types.cc: 3106: FAILED assert(i.first <= i.last)
% Done:
0%
Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
osd/osd_types.cc: 3106: FAILED assert(i.first <= i.last)
ceph version 10.0.5-2638-g7972c10 (7972c105f25fab739e5b0d58bab36837a94ec86e)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f0a74bc94cb]
2: (pg_interval_t::check_new_interval(int, int, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, int, int, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, unsigned int, unsigned int, std::shared_ptr<OSDMap const>, std::shared_ptr<OSDMap const>, pg_t, IsPGRecoverablePredicate*, std::map<unsigned int, pg_interval_t, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, pg_interval_t> > >, std::ostream)+0x786) [0x7f0a748057c6]
3: (PG::start_peering_interval(std::shared_ptr<OSDMap const>, std::vector<int, std::allocator<int> > const&, int, std::vector<int, std::allocator<int> > const&, int, ObjectStore::Transaction*)+0x404) [0x7f0a7464edf4]
4: (PG::RecoveryState::Reset::react(PG::AdvMap const&)+0x4f0) [0x7f0a74650230]
5: (boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x1fc) [0
x7f0a74683cbc]
6: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x7f0a7466af8b]
7: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0xd8) [0x7f0a7466b0f8]
8: (PG::handle_advance_map(std::shared_ptr<OSDMap const>, std::shared_ptr<OSDMap const>, std::vector<int, std::allocator<int> >&, int, std::vector<int, std::allocator<int> >&, int, PG::RecoveryCtx*)+0x499) [0x7f0a74644d89]
9: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&, PG::RecoveryCtx*, std::set<boost::intrusive_ptr<PG>, std::less<boost::intrusive_ptr<PG> >, std::allocator<boost::intrusive_ptr<PG> > >)+0x2b9) [0x7f0a7457f799]
10: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x1a1) [0x7f0a74593111]
11: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x12) [0x7f0a745db7e2]
12: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) [0x7f0a74bbaa2e]
13: (ThreadPool::WorkThread::entry()+0x10) [0x7f0a74bbb910]
14: (()+0x8182) [0x7f0a72e8f182]
15: (clone()+0x6d) [0x7f0a70fbd47d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
sjust@teuthology:/a/samuelj-2016-03-21_13:58:56-rados-wip-sam-testing-distro-basic-smithi/78333/remote
Introduced with 17f810573c39a243a4175912fd6c42ebe9ceba41, I think.
The problem seems to be that, normally, history and orig_history in handle_pg_peering_evt must have the same same_interval_since (otherwise we would have bailed). With the new same_primary param, history and orig_history may have different same_interval_since values. check_new_interval then trips the assert, since same_interval_since is already set ahead of our current map.
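To illustrate the failure mode described above, here is a minimal standalone sketch, not the actual Ceph code. It assumes that when an interval is closed, its first epoch is taken from same_interval_since and its last epoch is the epoch just before the map being processed. The names Interval, close_interval, is_well_formed and close_interval_checked are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Epochs are unsigned 32-bit values (the backtrace shows "unsigned int").
using epoch_t = std::uint32_t;

// Hypothetical stand-in for the [first, last] range of a past interval.
struct Interval {
  epoch_t first;
  epoch_t last;
};

// Assumed bookkeeping: the interval that just ended runs from
// same_interval_since up to the epoch before the map being processed.
Interval close_interval(epoch_t same_interval_since, epoch_t new_map_epoch) {
  return Interval{same_interval_since, new_map_epoch - 1};
}

// The condition checked by the failing assert.
bool is_well_formed(const Interval& i) {
  return i.first <= i.last;
}

// Mirrors assert(i.first <= i.last): aborts if same_interval_since
// is ahead of the map being processed.
Interval close_interval_checked(epoch_t same_interval_since, epoch_t new_map_epoch) {
  Interval i = close_interval(same_interval_since, new_map_epoch);
  assert(is_well_formed(i));
  return i;
}
```

With same_interval_since = 10 and a map at epoch 15, the closed interval is [10, 14] and the check passes. With same_interval_since = 20 and the same map, the interval would be [20, 14], which is the kind of state the assert rejects.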