Bug #13382
osd/PG.cc: 758: FAILED assert(info.history.same_interval_since == same_interval_since)
Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Q/A
Tags:
Backport:
hammer
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
-1655> 2015-10-05 22:36:47.385270 7fa2abfa8700 1 -- 10.214.130.57:6801/20854 <== osd.12 10.214.130.79:6828/13416 2 ==== pg_notify(1.9(2) epoch 109) v5 ==== 857+0+0 (2203112667 0 0) 0x7fa2c59d5c20 con 0x7fa2c61dde40
-1654> 2015-10-05 22:36:47.385277 7fa2abfa8700 10 osd.1 109 do_waiters -- start
-1653> 2015-10-05 22:36:47.385279 7fa2abfa8700 10 osd.1 109 do_waiters -- finish
-1652> 2015-10-05 22:36:47.385280 7fa2abfa8700 20 osd.1 109 _dispatch 0x7fa2c59d5c20 pg_notify(1.9(2) epoch 109) v5
-1651> 2015-10-05 22:36:47.385286 7fa2abfa8700 5 -- op tracker -- seq: 11, time: 2015-10-05 22:36:47.381874, event: header_read, op: pg_notify(1.9(2) epoch 109)
-1650> 2015-10-05 22:36:47.385290 7fa2abfa8700 5 -- op tracker -- seq: 11, time: 2015-10-05 22:36:47.381874, event: throttled, op: pg_notify(1.9(2) epoch 109)
-1649> 2015-10-05 22:36:47.385293 7fa2abfa8700 5 -- op tracker -- seq: 11, time: 2015-10-05 22:36:47.381896, event: all_read, op: pg_notify(1.9(2) epoch 109)
-1648> 2015-10-05 22:36:47.385296 7fa2abfa8700 5 -- op tracker -- seq: 11, time: 2015-10-05 22:36:47.385276, event: dispatched, op: pg_notify(1.9(2) epoch 109)
-1647> 2015-10-05 22:36:47.385300 7fa2abfa8700 7 osd.1 109 handle_pg_notify from osd.12
-1646> 2015-10-05 22:36:47.385302 7fa2abfa8700 15 osd.1 109 require_same_or_newer_map 109 (i am 109) 0x7fa2c59d5c20
-1645> 2015-10-05 22:36:47.385306 7fa2abfa8700 5 -- op tracker -- seq: 11, time: 2015-10-05 22:36:47.385305, event: started, op: pg_notify(1.9(2) epoch 109)
-1644> 2015-10-05 22:36:47.385324 7fa2abfa8700 15 osd.1 109 project_pg_history 1.9 from 109 to 109, start ec=4 les/c/f 21/21/0 109/109/109
-1643> 2015-10-05 22:36:47.385329 7fa2abfa8700 15 osd.1 109 project_pg_history end ec=4 les/c/f 21/21/0 109/109/109
-1642> 2015-10-05 22:36:47.385377 7fa2abfa8700 20 osd.1 109 _create_lock_pg pgid 1.9
-1641> 2015-10-05 22:36:47.385381 7fa2abfa8700 10 osd.1 109 _open_lock_pg 1.9
-1640> 2015-10-05 22:36:47.385385 7fa2abfa8700 10 osd.1 109 _get_pool 1
-1639> 2015-10-05 22:36:47.385429 7fa2abfa8700 5 osd.1 pg_epoch: 109 pg[1.9(unlocked)] enter Initial
-1638> 2015-10-05 22:36:47.385442 7fa2abfa8700 20 osd.1 pg_epoch: 109 pg[1.9(unlocked)] enter NotTrimming
-1637> 2015-10-05 22:36:47.385451 7fa2abfa8700 10 osd.1 pg_epoch: 109 pg[1.9( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0) [] r=0 lpr=0 crt=0'0 inactive NIBBLEWISE] init role 0 up [1,4] acting [1,4] history ec=4 les/c/f 21/21/0 109/109/109 2 past_intervals
-1636> 2015-10-05 22:36:47.385464 7fa2abfa8700 20 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=0 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] _on_new_interval resorting object_contexts
-1635> 2015-10-05 22:36:47.385498 7fa2abfa8700 5 write_log with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, dirty_divergent_priors: false, divergent_priors: 0, writeout_from: 4294967295'18446744073709551615, trimmed:
-1634> 2015-10-05 22:36:47.385512 7fa2abfa8700 7 osd.1 109 _create_lock_pg pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=0 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE]
-1633> 2015-10-05 22:36:47.385518 7fa2abfa8700 10 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=0 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] handle_create
-1632> 2015-10-05 22:36:47.385528 7fa2abfa8700 5 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=0 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] exit Initial 0.000099 0 0.000000
-1631> 2015-10-05 22:36:47.385538 7fa2abfa8700 5 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=0 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] enter Reset
-1630> 2015-10-05 22:36:47.385545 7fa2abfa8700 20 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=0 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] set_last_peering_reset 109
-1629> 2015-10-05 22:36:47.385551 7fa2abfa8700 10 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] Clearing blocked outgoing recovery messages
-1628> 2015-10-05 22:36:47.385558 7fa2abfa8700 10 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] Not blocking outgoing recovery messages
-1627> 2015-10-05 22:36:47.385567 7fa2abfa8700 10 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] update_heartbeat_peers -> 1,4
-1626> 2015-10-05 22:36:47.385575 7fa2abfa8700 20 osd.1 109 need_heartbeat_peer_update
-1625> 2015-10-05 22:36:47.385576 7fa2abfa8700 10 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] take_waiters
-1624> 2015-10-05 22:36:47.385584 7fa2abfa8700 5 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] exit Reset 0.000045 1 0.000039
-1623> 2015-10-05 22:36:47.385593 7fa2abfa8700 5 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] enter Started
-1622> 2015-10-05 22:36:47.385601 7fa2abfa8700 5 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] enter Start
-1621> 2015-10-05 22:36:47.385607 7fa2abfa8700 1 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] state<Start>: transitioning to Primary
-1620> 2015-10-05 22:36:47.385616 7fa2abfa8700 5 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] exit Start 0.000014 0 0.000000
-1619> 2015-10-05 22:36:47.385625 7fa2abfa8700 5 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] enter Started/Primary
-1618> 2015-10-05 22:36:47.385633 7fa2abfa8700 5 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] enter Started/Primary/Peering
-1617> 2015-10-05 22:36:47.385640 7fa2abfa8700 5 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 peering NIBBLEWISE] enter Started/Primary/Peering/GetInfo
-1616> 2015-10-05 22:36:47.385661 7fa2abfa8700 10 osd.1 pg_epoch: 109 pg[1.9( empty local-les=0 n=0 ec=4 les/c/f 21/21/0 109/109/109) [1,4] r=0 lpr=109 pi=99-108/2 crt=0'0 mlcod 0'0 peering NIBBLEWISE] generate_past_intervals over epochs 21-99
-224> 2015-10-05 22:36:47.463626 7fa2abfa8700 -1 osd/PG.cc: In function 'void PG::generate_past_intervals()' thread 7fa2abfa8700 time 2015-10-05 22:36:47.386050
osd/PG.cc: 758: FAILED assert(info.history.same_interval_since == same_interval_since)
ceph version 9.0.3-2031-g42a2c34 (42a2c340e1a97fffe4ac0d975948005ea5ebded9)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7fa2c16a8c7b]
2: (PG::generate_past_intervals()+0xbcc) [0x7fa2c12345dc]
3: (PG::RecoveryState::GetInfo::GetInfo(boost::statechart::state<PG::RecoveryState::GetInfo, PG::RecoveryState::Peering, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::my_context)+0x11d) [0x7fa2c12585cd]
4: (boost::statechart::state<PG::RecoveryState::Primary, PG::RecoveryState::Started, PG::RecoveryState::Peering, (boost::statechart::history_mode)0>::deep_construct(boost::intrusive_ptr<PG::RecoveryState::Started> const&, boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>&)+0xfa) [0x7fa2c129488a]
5: (boost::statechart::detail::safe_reaction_result boost::statechart::simple_state<PG::RecoveryState::Start, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::transit_impl<PG::RecoveryState::Primary, PG::RecoveryState::RecoveryMachine, boost::statechart::detail::no_transition_function>(boost::statechart::detail::no_transition_function const&)+0x70) [0x7fa2c129cee0]
6: (boost::statechart::simple_state<PG::RecoveryState::Start, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0xa2) [0x7fa2c129e502]
7: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x7fa2c1287adb]
8: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0xd8) [0x7fa2c1287c58]
9: (PG::handle_create(PG::RecoveryCtx*)+0x174) [0x7fa2c1236784]
10: (OSD::handle_pg_peering_evt(spg_t, pg_info_t const&, std::map<unsigned int, pg_interval_t, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, pg_interval_t> > >&, unsigned int, pg_shard_t, bool, std::shared_ptr<PG::CephPeeringEvt>)+0x119c) [0x7fa2c112ae3c]
11: (OSD::handle_pg_notify(std::shared_ptr<OpRequest>)+0x612) [0x7fa2c112bf72]
12: (OSD::dispatch_op(std::shared_ptr<OpRequest>)+0xa8) [0x7fa2c112d068]
13: (OSD::_dispatch(Message*)+0x20f) [0x7fa2c112e10f]
14: (OSD::ms_dispatch(Message*)+0x20f) [0x7fa2c112e79f]
15: (DispatchQueue::entry()+0x63a) [0x7fa2c175efda]
16: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fa2c168990d]
17: (()+0x8182) [0x7fa2bfa5f182]
18: (clone()+0x6d) [0x7fa2bdda647d]
/a/sage-2015-10-05_14:30:41-upgrade:hammer-x-infernalis---basic-vps/1089801
Related issues
History
#1 Updated by Sage Weil over 8 years ago
#2 Updated by Sage Weil over 8 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport set to hammer
#3 Updated by Loïc Dachary over 8 years ago
- Status changed from Pending Backport to Resolved
- Backport deleted (hammer)
This fixes a bug introduced after hammer by https://github.com/ceph/ceph/commit/70316541bbb115d9a35954bfba373cf1dc084b7e (see http://tracker.ceph.com/issues/10794), which has not been backported; therefore there is no need for a backport.
#4 Updated by David Zafman about 8 years ago
- Status changed from Resolved to Pending Backport
- Backport set to hammer
#5 Updated by David Zafman about 8 years ago
- Related to Bug #10794: ceph_test_rados: incorrect buffer at pos ... added
#6 Updated by David Zafman about 8 years ago
- Related to Backport #14979: hammer: ceph_test_rados: incorrect buffer at pos ... added
#7 Updated by David Zafman about 8 years ago
Combined into pull request https://github.com/ceph/ceph/pull/7917, together with http://tracker.ceph.com/issues/14979.
#8 Updated by Nathan Cutler about 8 years ago
- Copied to Backport #15207: hammer: osd/PG.cc: 758: FAILED assert(info.history.same_interval_since == same_interval_since) added
#9 Updated by Nathan Cutler about 8 years ago
#10 Updated by Nathan Cutler about 8 years ago
- Related to deleted (Backport #14979: hammer: ceph_test_rados: incorrect buffer at pos ...)
#11 Updated by Loïc Dachary almost 8 years ago
- Status changed from Pending Backport to Resolved