Bug #18993

osd/PrimaryLogPG.cc: 9888: FAILED assert(object_contexts.empty())

Added by Kefu Chai over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

/build/ceph-12.0.0-538-g8ac3230/src/osd/PrimaryLogPG.cc: 9888: FAILED assert(object_contexts.empty())

 ceph version 12.0.0-538-g8ac3230 (8ac32304f36c0caa334059229791c70469004005)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x10e) [0x55f54c610e3e]
 2: (()+0x588b9e) [0x55f54c224b9e]
 3: (PG::RecoveryState::Started::react(PG::FlushedEvt const&)+0x3f) [0x55f54c1821df]
 4: (boost::statechart::simple_state<PG::RecoveryState::Started, PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Start, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x1f0) [0x55f54c208ee0]
 5: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x1a9) [0x55f54c206559]
 6: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x55f54c1e7c4b]
 7: (PG::handle_peering_event(std::shared_ptr<PG::CephPeeringEvt>, PG::RecoveryCtx*)+0x1ce) [0x55f54c1b707e]
 8: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x25c) [0x55f54c11d89c]
 9: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x17) [0x55f54c16bd37]
 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0xb65) [0x55f54c617915]
 11: (ThreadPool::WorkThread::entry()+0x10) [0x55f54c6188e0]
 12: (()+0x8184) [0x7f3179702184]
 13: (clone()+0x6d) [0x7f31787f237d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

So it seems the object_context for "mira11914383-561" is leaked.

See /a/kchai-2017-02-18_17:57:32-rados-wip-kefu-testing---basic-mira/831180/remote/mira084/log/ceph-osd.4.log.gz
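
To illustrate the kind of invariant the failed assert checks, here is a toy Python model. It is not Ceph's code: `ToyPG`, `ObjectContext`, and `live_contexts` are hypothetical names, and the weak-reference cache is only an assumption about the general pattern (a registry that should be empty once every holder has dropped its reference). Only the names `object_contexts`, `get_object_context`, and `on_flushed` are borrowed from the messages in this report.

```python
import gc
import weakref


class ObjectContext:
    """Hypothetical stand-in for a per-object context (obc)."""

    def __init__(self, oid: str) -> None:
        self.oid = oid


class ToyPG:
    """Toy placement group: the cache holds only weak references,
    so an entry survives exactly as long as someone else holds the obc."""

    def __init__(self) -> None:
        self.object_contexts = weakref.WeakValueDictionary()

    def get_object_context(self, oid: str) -> ObjectContext:
        obc = self.object_contexts.get(oid)
        if obc is None:
            obc = ObjectContext(oid)
            self.object_contexts[oid] = obc
        return obc

    def live_contexts(self) -> list:
        # Force a collection so the result does not depend on refcounting.
        gc.collect()
        return sorted(self.object_contexts.keys())

    def on_flushed(self) -> None:
        alive = self.live_contexts()
        for oid in alive:
            print(f"on_flushed: object {oid} obc still alive")
        # Analogue of assert(object_contexts.empty()) in this toy model.
        assert not alive, "object_contexts is not empty"
```

In this sketch, calling `on_flushed()` while any caller still holds a reference returned by `get_object_context()` trips the assertion; once that reference is dropped, the same call passes. A leaked reference in some in-flight operation would therefore show up as exactly this kind of failure.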


Related issues

Related to Ceph - Bug #13835: on_flushed: object 1/f9b8d3f7/burnupi0951139-337/head obc still alive (Can't reproduce, 11/19/2015)
Related to Ceph - Bug #18927: on_flushed: object ... obc still alive (Resolved, 02/14/2017)

History

#1 Updated by Kefu Chai over 3 years ago

-6945> 2017-02-19 00:35:28.425515 7f31585d3700 10 osd.4 pg_epoch: 285 pg[1.11( v 276'358 lc 265'344 (0'0,276'358] local-les=278 n=20 ec=11 les/c/f 278/244/0277/277/277) [4,0] r=0 lpr=277 pi=218-276/1 crt=276'358 mlcod 0'0 active+recovering+degraded m=4 snaptrimq=[72~1,b8~1,f1~3,f6~1]] recover_primary 1:8ab01bd0:::mira11914383-561 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:85 265'345 (missing)

-6944> 2017-02-19 00:35:28.425535 7f31585d3700 10 osd.4 pg_epoch: 285 pg[1.11( v 276'358 lc 265'344 (0'0,276'358] local-les=278 n=20 ec=11 les/c/f 278/244/0 277/277/277) [4,0] r=0 lpr=277 pi=218-276/1 crt=276'358 mlcod 0'0 active+recovering+degraded m=4 snaptrimq=[72~1,b8~1,f1~3,f6~1]] get_object_context: found obc in cache: 0x55f559bd7480

-1275> 2017-02-19 00:35:30.055275 7f315c5db700 10 osd.4 pg_epoch: 287 pg[1.21( v 280'463 (0'0,280'463] local-les=283 n=22 ec=17 les/c/f 283/230/0 287/287/287) [5,4] r=1 lpr=287 pi=212-286/2 crt=280'463 lcod 280'461 inactive] on_change

-954> 2017-02-19 00:35:30.056981 7f316dd2e700 15 filestore(/var/lib/ceph/osd/ceph-4) remove 1.11_head/#1:8ab01bd0:::mira11914383-561 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:85#

-931> 2017-02-19 00:35:30.057074 7f316dd2e700 20 filestore(/var/lib/ceph/osd/ceph-4) lfn_unlink: clearing omap on #1:8ab01bd0:::mira11914383-561 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:85# in cid 1.11_head

-922> 2017-02-19 00:35:30.057110 7f316dd2e700 10 filestore(/var/lib/ceph/osd/ceph-4) remove 1.11_head/#1:8ab01bd0:::mira11914383-561 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:85# = 0

-919> 2017-02-19 00:35:30.057122 7f316dd2e700 15 filestore(/var/lib/ceph/osd/ceph-4) _collection_move_rename 1.11_head/#1:8ab01bd0:::mira11914383-561 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:85# from 1.11_TEMP/#-3:88000000:::temp_recovering_1.11_265'345_277_85:head#

-812> 2017-02-19 00:35:30.057879 7f316dd2e700 10 filestore(/var/lib/ceph/osd/ceph-4) _collection_move_rename 1.11_head/#1:8ab01bd0:::mira11914383-561 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:85# from 1.11_TEMP/#-3:88000000:::temp_recovering_1.11_265'345_277_85:head# = 0

-127> 2017-02-19 00:35:30.068050 7f316ad28700 10 osd.4 pg_epoch: 287 pg[1.11( v 276'358 lc 265'346 (0'0,276'358] local-les=278 n=20 ec=11 les/c/f 278/244/0 287/287/287) [5,0] r=-1 lpr=287 pi=218-286/2 crt=276'358 lcod 265'344 inactive NOTIFY m=3] _applied_recovered_object obc(1:8ab01bd0:::mira11914383-561 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:85 rwstate(none n=0 w=0))

-109> 2017-02-19 00:35:30.068230 7f315c5db700 -1 osd.4 pg_epoch: 287 pg[1.11( v 276'358 lc 265'346 (0'0,276'358] local-les=278 n=20 ec=11 les/c/f 278/244/0 287/287/287) [5,0] r=-1 lpr=287 pi=218-286/2 crt=276'358 lcod 265'344 inactive NOTIFY m=3] on_flushed: object 1:8ab01bd0:::mira11914383-561 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:85 obc still alive
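
An excerpt like the one above can be pulled out of a full OSD log by filtering on the object name. The following is a minimal sketch that assumes every entry follows the layout visible in this report (`-<offset>> <date> <time> <thread> <level> <message>`); the names `Entry`, `parse_line`, and `entries_for_object` are hypothetical helpers, not part of any Ceph tool.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Layout assumed from the excerpt: "-109> 2017-02-19 00:35:30.068230 7f315c5db700 -1 <message>"
LINE_RE = re.compile(
    r"^\s*-(?P<offset>\d+)>\s+"
    r"(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>[0-9a-f]+)\s+"
    r"(?P<level>-?\d+)\s+"
    r"(?P<message>.*)$"
)


class Entry(NamedTuple):
    offset: int   # distance from the end of the dump, without the sign
    stamp: str    # timestamp as printed
    thread: str   # thread id in hex
    level: int    # debug level; -1 marks an error
    message: str  # everything after the level


def parse_line(line: str) -> Optional[Entry]:
    """Return an Entry, or None when the line does not match the assumed layout."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return Entry(int(m["offset"]), m["stamp"], m["thread"], int(m["level"]), m["message"])


def entries_for_object(lines: Iterable[str], needle: str) -> Iterator[Entry]:
    """Yield parsed entries whose message mentions the given object name."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and needle in entry.message:
            yield entry
```

Calling `entries_for_object(log_lines, "mira11914383-561")` on the decompressed log yields the entries that mention the object in their original order. Entries that concern the same PG but do not name the object, such as the `get_object_context` and `on_change` lines above, need a separate filter, for example on the PG id.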

#2 Updated by Kefu Chai over 3 years ago

  • Related to Bug #13835: on_flushed: object 1/f9b8d3f7/burnupi0951139-337/head obc still alive added

#3 Updated by Kefu Chai over 3 years ago

Also seen in /a/kchai-2017-02-18_17:57:32-rados-wip-kefu-testing---basic-mira/831320

#4 Updated by Kefu Chai over 3 years ago

  • Priority changed from Normal to Urgent

#6 Updated by Josh Durgin over 3 years ago

  • Status changed from New to Resolved

Never mind, I got my git log wrong: that PR was not included in that run, and it should fix this. Reopen if it appears again.

#7 Updated by Josh Durgin over 3 years ago

  • Related to Bug #18927: on_flushed: object ... obc still alive added
