Bug #38377
OpTracker destruct assert when OSD destruct
Description
coredump
(gdb) bt
#0 0x00007f506d3b623b in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1 0x0000563eafeebba6 in reraise_fatal (signum=6) at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/global/signal_handler.cc:74
#2 handle_fatal_signal (signum=6) at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/global/signal_handler.cc:138
#3 <signal handler called>
#4 0x00007f506c3e01d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#5 0x00007f506c3e18c8 in __GI_abort () at abort.c:90
#6 0x0000563eaff2ab14 in ceph::__ceph_assert_fail (assertion=assertion@entry=0x563eb04695b8 "(sharded_in_flight_list.back())->ops_in_flight_sharded.empty()",
file=file@entry=0x563eb04694e0 "/root/rpmbuild/BUILD/ceph-12.2.10-469-g57b4c2d/src/common/TrackedOp.cc", line=line@entry=153,
func=func@entry=0x563eb0469ee0 <OpTracker::~OpTracker()::__PRETTY_FUNCTION__> "OpTracker::~OpTracker()") at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/common/assert.cc:66
#7 0x0000563eafbd2ee8 in OpTracker::~OpTracker (this=0x563eba8f3150, __in_chrg=<optimized out>) at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/common/TrackedOp.cc:153
#8 0x0000563eaf981332 in OSD::~OSD (this=0x563eba8f2000, __in_chrg=<optimized out>) at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/osd/OSD.cc:2042
#9 0x0000563eaf9815c9 in OSD::~OSD (this=0x563eba8f2000, __in_chrg=<optimized out>) at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/osd/OSD.cc:2052
#10 0x0000563eaf870b68 in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/ceph-12.2.10-469-g57b4c2d/src/ceph_osd.cc:675
When the OpTracker is destructed, ops_in_flight_sharded is not empty.
Updated by Greg Farnum about 5 years ago
Is this a custom build? Where did it come from?
Updated by bing lin about 5 years ago
Greg Farnum wrote:
Is this a custom build? Where did it come from?
Aha, the Ceph version is Luminous 12.2.10.
See [[https://github.com/ceph/ceph/pull/26504]]
1. waiting_for_osdmap is a list of OpRequestRef, so pushing an OpRequest onto waiting_for_osdmap increments its reference count (++nref). On take_waiter, the op is moved from waiting_for_osdmap to finished. Only once the op is taken from finished and its nref drops to zero is the OpRequest unregistered from the OpTracker.
2. When the OSD shuts down, finished is cleared, which drops the reference held on each op in it (--nref). An op that has not yet been moved to finished is still in waiting_for_osdmap, however, and the OpTracker is destructed before waiting_for_osdmap. The op is therefore still registered when OpTracker::~OpTracker runs, which triggers assert((sharded_in_flight_list.back())->ops_in_flight_sharded.empty());
Timeline:
t1 push Op onto waiting_for_osdmap (++nref)
t2 OSD shutdown
t3 delete OSD
t4 delete OpTracker
t5 assert fires
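The lifetime problem in the timeline above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not Ceph code and not the change made in the pull request: MiniTracker, MiniOp, MiniOpRef, make_tracked_op and in_flight_at_tracker_teardown are hypothetical names, and std::shared_ptr with a custom deleter stands in for OpRequestRef and its nref counting. The function reports how many ops are still registered at the point where the tracker would be torn down, which is the condition the destructor assert checks.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <set>

// Hypothetical stand-in for OpTracker: remembers which ops are still registered.
class MiniTracker {
public:
    void register_op(const void* op) { in_flight_.insert(op); }
    void unregister_op(const void* op) { in_flight_.erase(op); }
    std::size_t in_flight() const { return in_flight_.size(); }
private:
    std::set<const void*> in_flight_;
};

// Hypothetical stand-in for OpRequest.
struct MiniOp {
    int id;
};

// Assumption: shared_ptr models OpRequestRef; the last reference going away
// unregisters the op from the tracker, like nref dropping to zero.
using MiniOpRef = std::shared_ptr<MiniOp>;

MiniOpRef make_tracked_op(MiniTracker& tracker, int id) {
    MiniOpRef ref(new MiniOp{id}, [&tracker](MiniOp* op) {
        tracker.unregister_op(op);
        delete op;
    });
    tracker.register_op(ref.get());
    return ref;
}

// Returns the number of ops still registered at the moment the tracker
// would be destroyed. A non-zero value is the state the assert rejects.
std::size_t in_flight_at_tracker_teardown(bool drain_waiters_first) {
    MiniTracker tracker;
    std::list<MiniOpRef> waiting_for_osdmap;
    std::list<MiniOpRef> finished;

    // t1: one op is parked waiting for an osdmap, another already finished.
    waiting_for_osdmap.push_back(make_tracked_op(tracker, 1));
    finished.push_back(make_tracked_op(tracker, 2));

    // t2: shutdown clears finished, dropping the reference on op 2.
    finished.clear();

    // Optional remedy in this sketch: also drop the parked ops first.
    if (drain_waiters_first) {
        waiting_for_osdmap.clear();
    }

    // t4: what the tracker destructor would observe.
    return tracker.in_flight();
    // Locals are destroyed in reverse declaration order, so the lists go
    // away before the tracker here and the deleters never dangle.
}
```

In this sketch, in_flight_at_tracker_teardown(false) returns 1 (the op still held by waiting_for_osdmap), and in_flight_at_tracker_teardown(true) returns 0. Whether the actual fix drains the list or reorders destruction is not stated here; see the linked pull request.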
Updated by Greg Farnum about 5 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 26504
Updated by Sage Weil about 5 years ago
- Has duplicate Bug #38592: mon,osd: src/common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_flight_list.back())->ops_in_flight_sharded.empty()) on shutdown added
Updated by Sage Weil about 5 years ago
- Has duplicate Bug #36546: common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_flight_list.back())->ops_in_flight_sharded.empty()) added
Updated by Sage Weil about 5 years ago
- Status changed from Fix Under Review to Pending Backport
- Priority changed from Urgent to High
Updated by Nathan Cutler about 5 years ago
- Copied to Backport #38646: mimic: OpTracker destruct assert when OSD destruct added
Updated by Nathan Cutler about 5 years ago
- Backport changed from nautilus,mimic to mimic
master is still being merged into nautilus AFAICT
Updated by Nathan Cutler almost 5 years ago
- Status changed from Pending Backport to Resolved