Bug #38377

closed

OpTracker destruct assert when OSD destruct

Added by bing lin about 5 years ago. Updated almost 5 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
mimic
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Core dump backtrace:

(gdb) bt
#0  0x00007f506d3b623b in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1  0x0000563eafeebba6 in reraise_fatal (signum=6) at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/global/signal_handler.cc:74
#2  handle_fatal_signal (signum=6) at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/global/signal_handler.cc:138
#3  <signal handler called>
#4  0x00007f506c3e01d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#5  0x00007f506c3e18c8 in __GI_abort () at abort.c:90
#6  0x0000563eaff2ab14 in ceph::__ceph_assert_fail (assertion=assertion@entry=0x563eb04695b8 "(sharded_in_flight_list.back())->ops_in_flight_sharded.empty()",
    file=file@entry=0x563eb04694e0 "/root/rpmbuild/BUILD/ceph-12.2.10-469-g57b4c2d/src/common/TrackedOp.cc", line=line@entry=153,
    func=func@entry=0x563eb0469ee0 <OpTracker::~OpTracker()::__PRETTY_FUNCTION__> "OpTracker::~OpTracker()") at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/common/assert.cc:66
#7  0x0000563eafbd2ee8 in OpTracker::~OpTracker (this=0x563eba8f3150, __in_chrg=<optimized out>) at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/common/TrackedOp.cc:153
#8  0x0000563eaf981332 in OSD::~OSD (this=0x563eba8f2000, __in_chrg=<optimized out>) at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/osd/OSD.cc:2042
#9  0x0000563eaf9815c9 in OSD::~OSD (this=0x563eba8f2000, __in_chrg=<optimized out>) at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/osd/OSD.cc:2052
#10 0x0000563eaf870b68 in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/ceph_osd.cc:675

When the OpTracker is destroyed, ops_in_flight_sharded is not empty.


Related issues 3 (0 open, 3 closed)

Has duplicate RADOS - Bug #38592: mon,osd: src/common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_flight_list.back())->ops_in_flight_sharded.empty()) on shutdown (Duplicate, 03/05/2019)

Has duplicate RADOS - Bug #36546: common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_flight_list.back())->ops_in_flight_sharded.empty()) (Duplicate, 10/22/2018)

Copied to RADOS - Backport #38646: mimic: OpTracker destruct assert when OSD destruct (Resolved, Ashish Singh)
Actions #1

Updated by Greg Farnum about 5 years ago

Is this a custom build? Where did it come from?

Actions #2

Updated by bing lin about 5 years ago

Greg Farnum wrote:

Is this a custom build? Where did it come from?

The Ceph version is Luminous 12.2.10; see https://github.com/ceph/ceph/pull/26504

1. waiting_for_osdmap is a list of OpRequestRef, so pushing an OpRequest onto waiting_for_osdmap increments its reference count (++nref). On take_waiter, the op is moved from waiting_for_osdmap to finished. Only when the op is taken from finished and its nref drops to zero does the OpRequest unregister from the OpTracker.

2. When the OSD shuts down, finished is cleared, which drops the reference (--nref) on every op it holds. An op that has not yet been moved to finished is still in waiting_for_osdmap, however, and the OpTracker is destroyed before waiting_for_osdmap. That op is therefore still registered, which trips the assertion in OpTracker::~OpTracker: assert((sharded_in_flight_list.back())->ops_in_flight_sharded.empty());

Timeline:
t1 Op pushed to waiting_for_osdmap (++nref)
t2 OSD shutdown
t3 OSD deleted
t4 OpTracker deleted
t5 assertion fires
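
The sequence above can be modelled with a small standalone sketch. This is not Ceph code and not the change in the linked pull request: ToyTracker, ToyOp, ToyOpRef and ops_in_flight_at_teardown are hypothetical names, std::shared_ptr stands in for the intrusive nref counting, and the "teardown" point is just the moment the function reads the tracker's count. It only illustrates why an op left in waiting_for_osdmap keeps the tracker non-empty.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <set>

// Hypothetical stand-in for OpTracker: remembers which ops are registered.
class ToyTracker {
public:
  void register_op(const void* op) { in_flight_.insert(op); }
  void unregister_op(const void* op) {
    std::size_t erased = in_flight_.erase(op);
    assert(erased == 1);  // an op must have been registered exactly once
    (void)erased;
  }
  std::size_t in_flight() const { return in_flight_.size(); }

private:
  std::set<const void*> in_flight_;
};

// Hypothetical stand-in for OpRequest: registers on construction and
// unregisters only when the last reference to it is dropped.
class ToyOp {
public:
  explicit ToyOp(ToyTracker& t) : tracker_(t) { tracker_.register_op(this); }
  ~ToyOp() { tracker_.unregister_op(this); }
  ToyOp(const ToyOp&) = delete;
  ToyOp& operator=(const ToyOp&) = delete;

private:
  ToyTracker& tracker_;
};

using ToyOpRef = std::shared_ptr<ToyOp>;  // plays the role of OpRequestRef

// Returns how many ops are still registered at the point where the real
// OpTracker destructor would run its emptiness check (t4 in the timeline).
std::size_t ops_in_flight_at_teardown(bool drain_waiters_first) {
  // Declared first so that, in this sketch, it outlives both lists and
  // the toy ops never touch a destroyed tracker.
  ToyTracker tracker;
  std::list<ToyOpRef> waiting_for_osdmap;
  std::list<ToyOpRef> finished;

  // t1: one op is parked waiting for a map; the list holds a reference.
  waiting_for_osdmap.push_back(std::make_shared<ToyOp>(tracker));
  // A second op has already been moved on to finished.
  finished.push_back(std::make_shared<ToyOp>(tracker));

  // t2: shutdown clears finished, dropping the last reference to that op.
  finished.clear();
  if (drain_waiters_first) {
    // Assumed remedy for illustration: also drop the parked ops.
    waiting_for_osdmap.clear();
  }

  // t4: what the tracker would see when it checks for leftover ops.
  return tracker.in_flight();
}
```

Calling ops_in_flight_at_teardown(false) models the reported failure, with one op still registered when the tracker is checked; passing true models dropping the parked ops first, which leaves the tracker empty.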

Actions #3

Updated by Greg Farnum about 5 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 26504
Actions #4

Updated by Kefu Chai about 5 years ago

  • Backport set to nautilus,mimic
Actions #5

Updated by Sage Weil about 5 years ago

  • Has duplicate Bug #38592: mon,osd: src/common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_flight_list.back())->ops_in_flight_sharded.empty()) on shutdown added
Actions #6

Updated by Sage Weil about 5 years ago

  • Priority changed from Normal to Urgent
Actions #7

Updated by Sage Weil about 5 years ago

  • Has duplicate Bug #36546: common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_flight_list.back())->ops_in_flight_sharded.empty()) added
Actions #8

Updated by Sage Weil about 5 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Priority changed from Urgent to High
Actions #9

Updated by Nathan Cutler about 5 years ago

  • Copied to Backport #38646: mimic: OpTracker destruct assert when OSD destruct added
Actions #10

Updated by Nathan Cutler about 5 years ago

  • Backport changed from nautilus,mimic to mimic

master is still being merged into nautilus AFAICT

Actions #11

Updated by Nathan Cutler almost 5 years ago

  • Status changed from Pending Backport to Resolved