Project

General

Profile

Bug #13835

on_flushed: object 1/f9b8d3f7/burnupi0951139-337/head obc still alive

Added by Sage Weil over 8 years ago. Updated about 8 years ago.

Status:
Can't reproduce
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rados
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

   -39> 2015-11-18 19:01:07.786124 7f9694d91700 10 osd.1 pg_epoch: 32 pg[0.c( empty local-les=11 n=0 ec=1 les/c/f 11/11/0 32/32/3) [0,4,3] r=-1 lpr=32 pi=3-31/2 crt=0'0 inactive NOTIFY NIBBLEWISE] handle_peering_event: epoch_sent: 32 epoch_requested: 32 FlushedEvt
   -38> 2015-11-18 19:01:07.786136 7f9694d91700 15 osd.1 pg_epoch: 32 pg[0.c( empty local-les=11 n=0 ec=1 les/c/f 11/11/0 32/32/3) [0,4,3] r=-1 lpr=32 pi=3-31/2 crt=0'0 inactive NOTIFY NIBBLEWISE]  requeue_ops 
   -37> 2015-11-18 19:01:07.786146 7f9694d91700 10 log is not dirty
   -36> 2015-11-18 19:01:07.786148 7f9694d91700 10 osd.1 pg_epoch: 32 pg[0.3( empty local-les=25 n=0 ec=1 les/c/f 25/25/0 32/32/24) [0,2,4] r=-1 lpr=32 pi=3-31/3 crt=0'0 inactive NOTIFY] handle_peering_event: epoch_sent: 32 epoch_requested: 32 FlushedEvt
   -35> 2015-11-18 19:01:07.786156 7f9694d91700 15 osd.1 pg_epoch: 32 pg[0.3( empty local-les=25 n=0 ec=1 les/c/f 25/25/0 32/32/24) [0,2,4] r=-1 lpr=32 pi=3-31/3 crt=0'0 inactive NOTIFY]  requeue_ops 
   -34> 2015-11-18 19:01:07.786162 7f9694d91700 10 log is not dirty
   -33> 2015-11-18 19:01:07.786163 7f9694d91700 10 osd.1 pg_epoch: 32 pg[1.2( v 29'134 (0'0,29'134] local-les=11 n=95 ec=5 les/c/f 11/11/0 32/32/5) [0,4,1] r=2 lpr=32 pi=5-31/2 crt=28'130 lcod 29'133 inactive NOTIFY NIBBLEWISE] handle_peering_event: epoch_sent: 32 epoch_requested: 32 FlushedEvt
   -32> 2015-11-18 19:01:07.786171 7f9694d91700 15 osd.1 pg_epoch: 32 pg[1.2( v 29'134 (0'0,29'134] local-les=11 n=95 ec=5 les/c/f 11/11/0 32/32/5) [0,4,1] r=2 lpr=32 pi=5-31/2 crt=28'130 lcod 29'133 inactive NOTIFY NIBBLEWISE]  requeue_ops 
   -31> 2015-11-18 19:01:07.786178 7f9694d91700 10 log is not dirty
   -30> 2015-11-18 19:01:07.786179 7f9694d91700 10 osd.1 pg_epoch: 32 pg[2.3( v 31'493 (0'0,31'493] local-les=11 n=65 ec=6 les/c/f 11/11/0 32/32/32) [4,5,1] r=2 lpr=32 pi=10-31/1 crt=29'488 lcod 31'492 inactive NOTIFY] handle_peering_event: epoch_sent: 32 epoch_requested: 32 FlushedEvt
   -29> 2015-11-18 19:01:07.786186 7f9694d91700 15 osd.1 pg_epoch: 32 pg[2.3( v 31'493 (0'0,31'493] local-les=11 n=65 ec=6 les/c/f 11/11/0 32/32/32) [4,5,1] r=2 lpr=32 pi=10-31/1 crt=29'488 lcod 31'492 inactive NOTIFY]  requeue_ops 
   -28> 2015-11-18 19:01:07.786195 7f9694d91700 10 log is not dirty
   -27> 2015-11-18 19:01:07.786197 7f9694d91700 10 osd.1 pg_epoch: 32 pg[1.3( v 31'171 (0'0,31'171] local-les=31 n=110 ec=5 les/c/f 31/25/0 32/32/32) [4,1,2] r=1 lpr=32 pi=22-31/2 crt=28'164 lcod 28'164 inactive NOTIFY] handle_peering_event: epoch_sent: 32 epoch_requested: 32 FlushedEvt
   -26> 2015-11-18 19:01:07.786206 7f9694d91700 15 osd.1 pg_epoch: 32 pg[1.3( v 31'171 (0'0,31'171] local-les=31 n=110 ec=5 les/c/f 31/25/0 32/32/32) [4,1,2] r=1 lpr=32 pi=22-31/2 crt=28'164 lcod 28'164 inactive NOTIFY]  requeue_ops 
   -25> 2015-11-18 19:01:07.786214 7f9694d91700 -1 osd.1 pg_epoch: 32 pg[1.3( v 31'171 (0'0,31'171] local-les=31 n=110 ec=5 les/c/f 31/25/0 32/32/32) [4,1,2] r=1 lpr=32 pi=22-31/2 crt=28'164 lcod 28'164 inactive NOTIFY] on_flushed: object 1/f9b8d3f7/burnupi0951139-337/head obc still alive
     0> 2015-11-18 19:01:07.805950 7f9694d91700 -1 osd/ReplicatedPG.cc: In function 'virtual void ReplicatedPG::on_flushed()' thread 7f9694d91700 time 2015-11-18 19:01:07.786225
osd/ReplicatedPG.cc: 9547: FAILED assert(object_contexts.empty())

 ceph version 9.2.0-781-gb16fc2e (b16fc2eb9720ece54067a248bb67ba2516b73df0)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f96b2a80c8b]
 2: (ReplicatedPG::get_watchers(std::list<obj_watch_item_t, std::allocator<obj_watch_item_t> >&)+0) [0x7f96b25d39b0]
 3: (PG::RecoveryState::Started::react(PG::FlushedEvt const&)+0x3f) [0x7f96b254eecf]
 4: (boost::statechart::simple_state<PG::RecoveryState::Started, PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Start, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x1f0) [0x7f96b25ca3e0]
 5: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost:
:statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x1a9) [0x7f96b25c7879]
 6: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x7f96b25b1eeb]
 7: (PG::handle_peering_event(std::shared_ptr<PG::CephPeeringEvt>, PG::RecoveryCtx*)+0x1ce) [0x7f96b255b53e]
 8: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x267) [0x7f96b24e4e97]
 9: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x18) [0x7f96b2529ab8]
 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa56) [0x7f96b2a72706]
 11: (ThreadPool::WorkThread::entry()+0x10) [0x7f96b2a735d0]
 12: (()+0x8182) [0x7f96b1061182]
 13: (clone()+0x6d) [0x7f96af3a847d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

/a/sage-2015-11-18_06:22:44-rados:thrash-wip-bigbang---basic-multi/1154646

Related issues

Related to Ceph - Bug #15151: osd: failed assert(object_contexts.empty()) Resolved 03/15/2016
Related to Ceph - Bug #18993: osd/PrimaryLogPG.cc: 9888: FAILED assert(object_contexts.empty()) Resolved 02/20/2017

History

#1 Updated by Yuri Weinstein about 8 years ago

Looks similar in run:
http://pulpito.ceph.com/teuthology-2016-02-16_09:00:01-rados-hammer-distro-basic-vps/
Job: 11903
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2016-02-16_09:00:01-rados-hammer-distro-basic-vps/11903/teuthology.log

2016-02-16T09:47:51.270 INFO:tasks.ceph.osd.1.vpm078.stderr:osd/ReplicatedPG.cc: In function 'virtual void ReplicatedPG::on_flushed()' thread 7fa1b4c06700 time 2016-02-16 17:47:50.408926
2016-02-16T09:47:51.270 INFO:tasks.ceph.osd.1.vpm078.stderr:osd/ReplicatedPG.cc: 8753: FAILED assert(object_contexts.empty())
2016-02-16T09:47:51.304 INFO:tasks.rados.rados.0.vpm078.stdout:5113:  writing vpm07815633-5113 from 1771686 to 2257917 tid 2
2016-02-16T09:47:51.334 INFO:tasks.rados.rados.0.vpm078.stdout:5113:  writing vpm07815633-5113 from 3018992 to 3354196 tid 3
2016-02-16T09:47:51.336 INFO:tasks.rados.rados.0.vpm078.stdout: waiting on 16
2016-02-16T09:47:51.392 INFO:tasks.ceph.osd.1.vpm078.stderr: ceph version 0.94.5-305-g7abb6ae (7abb6ae8f3cba67009bd022aaeee0a87cdfc6477)
2016-02-16T09:47:51.392 INFO:tasks.ceph.osd.1.vpm078.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xba89eb]
2016-02-16T09:47:51.393 INFO:tasks.ceph.osd.1.vpm078.stderr: 2: (ReplicatedPG::on_flushed()+0x2d1) [0x827981]
2016-02-16T09:47:51.393 INFO:tasks.ceph.osd.1.vpm078.stderr: 3: (PG::RecoveryState::Started::react(PG::FlushedEvt const&)+0x3f) [0x7a5b6f]
2016-02-16T09:47:51.393 INFO:tasks.ceph.osd.1.vpm078.stderr: 4: (boost::statechart::simple_state<PG::RecoveryState::Started, PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Start, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x1f0) [0x81ddf0]
2016-02-16T09:47:51.393 INFO:tasks.ceph.osd.1.vpm078.stderr: 5: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x1a9) [0x81b119]
2016-02-16T09:47:51.394 INFO:tasks.ceph.osd.1.vpm078.stderr: 6: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x80686b]
2016-02-16T09:47:51.394 INFO:tasks.ceph.osd.1.vpm078.stderr: 7: (PG::handle_peering_event(std::tr1::shared_ptr<PG::CephPeeringEvt>, PG::RecoveryCtx*)+0x1ce) [0x7b3f7e]
2016-02-16T09:47:51.394 INFO:tasks.ceph.osd.1.vpm078.stderr: 8: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x2c0) [0x6a6c10]
2016-02-16T09:47:51.394 INFO:tasks.ceph.osd.1.vpm078.stderr: 9: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x18) [0x6ff7d8]
2016-02-16T09:47:51.395 INFO:tasks.ceph.osd.1.vpm078.stderr: 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa56) [0xb995d6]
2016-02-16T09:47:51.395 INFO:tasks.ceph.osd.1.vpm078.stderr: 11: (ThreadPool::WorkThread::entry()+0x10) [0xb9a680]
2016-02-16T09:47:51.395 INFO:tasks.ceph.osd.1.vpm078.stderr: 12: (()+0x8182) [0x7fa1cdaa2182]
2016-02-16T09:47:51.395 INFO:tasks.ceph.osd.1.vpm078.stderr: 13: (clone()+0x6d) [0x7fa1cc00d47d]
2016-02-16T09:47:51.396 INFO:tasks.ceph.osd.1.vpm078.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

#2 Updated by Yuri Weinstein about 8 years ago

  • Release set to hammer
  • ceph-qa-suite rados added

#3 Updated by Josh Durgin about 8 years ago

  • Related to Bug #15151: osd: failed assert(object_contexts.empty()) added

#4 Updated by Sage Weil about 8 years ago

  • Status changed from 12 to Need More Info
  • Assignee deleted (Sage Weil)
  • Priority changed from Urgent to High

fixed a similar crash in jewel, but it's probably not the same bug as these ones. need logs.

#5 Updated by Sage Weil about 8 years ago

  • Status changed from Need More Info to Can't reproduce

#6 Updated by Kefu Chai about 7 years ago

  • Related to Bug #18993: osd/PrimaryLogPG.cc: 9888: FAILED assert(object_contexts.empty()) added

Also available in: Atom PDF