Bug #8885
closedSIGABRT in TrackedOp::dump() via dump_ops_in_flight()
0%
Description
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-07-19_23:15:03-rados-wip-sage-testing-testing-basic-plana/370540
-2631> 2014-07-20 03:37:26.447023 7f7ff5f15700 -1 *** Caught signal (Aborted) ** in thread 7f7ff5f15700 ceph version 0.82-697-g4438ab1 (4438ab1b7ea41df6e18e9cd17d9a69f34c925c83) 1: ceph-osd() [0x9aa88a] 2: (()+0xfcb0) [0x7f7ff8c25cb0] 3: (gsignal()+0x35) [0x7f7ff736d425] 4: (abort()+0x17b) [0x7f7ff7370b8b] 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f7ff7cc069d] 6: (()+0xb5846) [0x7f7ff7cbe846] 7: (()+0xb5873) [0x7f7ff7cbe873] 8: (()+0xb628f) [0x7f7ff7cbf28f] 9: (TrackedOp::dump(utime_t, ceph::Formatter*) const+0x1d0) [0x6cfd60] 10: (OpTracker::dump_ops_in_flight(ceph::Formatter*)+0xa7) [0x6d07e7] 11: (OSD::asok_command(std::string, std::map<std::string, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::varia nt::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::va riant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::less<std::string>, std::allocator<std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::voi d_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant:: void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >&, std::string, std::ostream&)+0x2f0) [0x6262e0] 12: (OSDSocketHook::call(std::string, std::map<std::string, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::var iant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail:: variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::less<std::string>, std::allocator<std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::v oid_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant ::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >&, std::string, ceph::buffer::list&)+0x7d) [0x68b90d] 13: (AdminSocket::do_accept()+0x1084) [0xa88674] 14: (AdminSocket::entry()+0x130) [0xa89d70] 15: (()+0x7e9a) [0x7f7ff8c1de9a] 16: (clone()+0x6d) [0x7f7ff742b3fd] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
this is the second time i've seen this
Updated by Greg Farnum almost 10 years ago
- Assignee changed from Greg Farnum to Sage Weil
Have you seen this on a mainstream branch?
Looking at the dump() function, the only plausible candidate for failing is that the OpRequest::_dump_op_descriptor() function calls the Message::print() function. You've got a lot of Messenger and OSD changes running around which could corrupt that. ;)
If it is on mainline, I guess we are clearing the Message's payload and data when the Op is unregistered, but that's not the scenario we're in here (dump_ops_in_flight) and I don't think any Messages depend on that for their printing?
Updated by Sage Weil over 9 years ago
- Status changed from 12 to Need More Info
- Assignee deleted (
Sage Weil)
let's wait for this to trigger on master or next
Updated by Sage Weil over 9 years ago
- Status changed from Need More Info to Can't reproduce
Updated by Sage Weil over 9 years ago
- Status changed from Can't reproduce to 12
on master this time: description: rados/thrash/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml
thrashers/default.yaml workloads/ec-rados-plugin=jerasure-k=3-m=1.yaml}
ubuntu@teuthology:/a/sage-2014-08-30_20:39:25-rados-wip-sage-testing-testing-basic-multi/462360
2014-08-30 22:16:23.688881 7f520521c700 -1 *** Caught signal (Aborted) ** in thread 7f520521c700 ceph version 0.84-933-g4c4f35e (4c4f35e9099313ca5f86dec4f1ea045513d9648b) 1: ceph-osd() [0x9b5b6a] 2: (()+0xfcb0) [0x7f5208d0dcb0] 3: (gsignal()+0x35) [0x7f52075f9425] 4: (abort()+0x17b) [0x7f52075fcb8b] 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f5207f4c69d] 6: (()+0xb5846) [0x7f5207f4a846] 7: (()+0xb5873) [0x7f5207f4a873] 8: (()+0xb628f) [0x7f5207f4b28f] 9: (TrackedOp::dump(utime_t, ceph::Formatter*) const+0x1d0) [0x6c8520] 10: (OpTracker::dump_ops_in_flight(ceph::Formatter*)+0xaf) [0x6c8f6f] 11: (OSD::asok_command(std::string, std::map<std::string, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::less<std::string>, std::allocator<std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >&, std::string, std::ostream&)+0x300) [0x61e950] 12: (OSDSocketHook::call(std::string, std::map<std::string, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::less<std::string>, std::allocator<std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >&, std::string, ceph::buffer::list&)+0x7d) [0x67fcdd] 13: (AdminSocket::do_accept()+0x109a) [0xa95f2a] 14: (AdminSocket::entry()+0x130) [0xa97550] 15: (()+0x7e9a) [0x7f5208d05e9a] 16: (clone()+0x6d) [0x7f52076b73fd] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Somnath Roy over 9 years ago
It's the same issue as #9384
Here is the pull request for the same.
Updated by Samuel Just over 9 years ago
- Status changed from Resolved to 12
- Priority changed from High to Urgent
/a/samuelj-2014-10-21_16:45:57-rados-wip-sam-testing-wip-testing-vanilla-fixes-basic-multi/564093/remote
Updated by Samuel Just over 9 years ago
- Status changed from 12 to Can't reproduce
Updated by Brad Hubbard over 8 years ago
Seeing this on 0.94.3.3
ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b) 1: /usr/bin/ceph-osd() [0xa20592] 2: (()+0xf100) [0x7f4f5e2e4100] 3: (gsignal()+0x37) [0x7f4f5ccfc5f7] 4: (abort()+0x148) [0x7f4f5ccfdce8] 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f4f5d6009d5] 6: (()+0x5e946) [0x7f4f5d5fe946] 7: (()+0x5e973) [0x7f4f5d5fe973] 8: (()+0x5f4df) [0x7f4f5d5ff4df] 9: (TrackedOp::dump(utime_t, ceph::Formatter*) const+0x238) [0x6e8278] 10: (OpTracker::dump_ops_in_flight(ceph::Formatter*)+0xa7) [0x6e8d17] 11: (OSD::asok_command(std::string, std::map<std::string, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::less<std::string>, std::allocator<std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >&, std::string, std::ostream&)+0x216) [0x63bb96] 12: (OSDSocketHook::call(std::string, std::map<std::string, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::less<std::string>, std::allocator<std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >&, std::string, ceph::buffer::list&)+0x25b) [0x6a764b] 13: (AdminSocket::do_accept()+0xf36) [0xb13e36] 14: (AdminSocket::entry()+0x280) [0xb15960] 15: (()+0x7dc5) [0x7f4f5e2dcdc5] 16: (clone()+0x6d) [0x7f4f5cdbd21d] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. $ git describe --contains a8e6de307f2e4f4ff5e7faad3809ed708e05fdde v0.88~171 # gdb -q /usr/bin/ceph-osd Reading symbols from /usr/bin/ceph-osd...Reading symbols from /usr/lib/debug/usr/bin/ceph-osd.debug...done. done. (gdb) disass /m 0x6e8278 Dump of assembler code for function TrackedOp::dump(utime_t, ceph::Formatter*) const: 322 { 0x00000000006e8040 <+0>: push %r15 0x00000000006e8042 <+2>: push %r14 0x00000000006e8044 <+4>: push %r13 0x00000000006e8046 <+6>: push %r12 0x00000000006e8048 <+8>: mov %rdi,%r12 0x00000000006e804b <+11>: push %rbp 0x00000000006e804c <+12>: mov %rdx,%rbp 0x00000000006e804f <+15>: push %rbx 0x00000000006e8050 <+16>: sub $0x248,%rsp 0x00000000006e8057 <+23>: mov %fs:0x28,%rax 0x00000000006e8060 <+32>: mov %rax,0x238(%rsp) 0x00000000006e8068 <+40>: xor %eax,%eax 0x00000000006e807a <+58>: mov %rsi,0x18(%rsp) 0x00000000006e807f <+63>: mov %esi,0x30(%rsp) 323 stringstream name; 324 _dump_op_descriptor_unlocked(name); 0x00000000006e8265 <+549>: mov 0x10(%rsp),%rcx 0x00000000006e826a <+554>: mov (%r12),%rax 0x00000000006e826e <+558>: mov %r12,%rdi 0x00000000006e8271 <+561>: lea 0x10(%rcx),%rsi 0x00000000006e8275 <+565>: callq *0x10(%rax) 325 f->dump_string("description", name.str().c_str()); // this TrackedOp 0x00000000006e8278 <+568>: mov 0x0(%rbp),%rax <----------HERE Breakpoint 1, TrackedOp::dump (this=0x623fc00, now=now@entry=..., f=f@entry=0x5614680) at common/TrackedOp.cc:322 322 { (gdb) n ... (gdb) n 325 f->dump_string("description", name.str().c_str()); // this TrackedOp (gdb) i r rbp rbp 0x5614680 0x5614680 (gdb) p f $4 = (ceph::Formatter *) 0x5614680
So this is a problem with the formatter being invalid IMHO.
Updated by Samuel Just about 8 years ago
- Status changed from Can't reproduce to New
- Assignee changed from Somnath Roy to David Zafman
David, can you take a look?
Updated by David Zafman about 8 years ago
I wasn't able to reproduce.
Based on the location of what I assume is a segfault the value of register bp was corrupted when getting popped off the stack coming back from OpRequest::_dump_op_descriptor_unlocked().
Another way to look at the latest stack is that we actually crashed while still still in _dump_op_descriptor_unlocked() which peeks at the value of Message * without holding locks. If a Message were to get reused and some random Message * was attempting to call print() that could cause the problem Maybe gdb is unable to interpret the stack there so it isn't showing the other function names in frames 6, 7 and 8.
5: (_gnu_cxx::_verbose_terminate_handler()+0x165) [0x7f4f5d6009d5]
6: (()+0x5e946) [0x7f4f5d5fe946]
7: (()+0x5e973) [0x7f4f5d5fe973]
8: (()+0x5f4df) [0x7f4f5d5ff4df]
9: (TrackedOp::dump(utime_t, ceph::Formatter*) const+0x238) [0x6e8278]
Updated by David Zafman about 8 years ago
- Status changed from New to In Progress
I have been able to reproduce and found an additional locking issue. Still trying to find and fix the issue described here.
Updated by David Zafman about 8 years ago
#0 0x00007f5353fe3cc9 in _GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f5353fe70d8 in __GI_abort () at abort.c:89
#2 0x00007f53548ee535 in __gnu_cxx::_verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007f53548ec6d6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007f53548ec703 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007f53548ed1bf in __cxa_pure_virtual () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007f535821525d in TrackedOp::dump (this=0x7f52fc1aafe0, now=..., now@entry=..., f=f@entry=0x7f5344000e50) at common/TrackedOp.cc:319
#7 0x00007f5358216410 in OpTracker::dump_ops_in_flight (this=0x7f5361c68638, f=0x7f5344000e50, print_only_blocked=false) at common/TrackedOp.cc:110
#8 0x00007f5357c36811 in OSD::asok_command (this=0x7f5361c675d0, command="dump_ops_in_flight", cmdmap=std::map with 1 elements = {...}, format="json-pretty", ss=...) at osd/OSD.cc:1715
#9 0x00007f5357ca50e8 in OSDSocketHook::call (this=0x7f5361cba1d0, command="dump_ops_in_flight", cmdmap=std::map with 1 elements = {...}, format=..., out=...) at osd/OSD.cc:1683
#10 0x00007f535833f254 in AdminSocket::do_accept (this=0x7f5361bdc040) at common/admin_socket.cc:399
#11 0x00007f535833e358 in AdminSocket::entry (this=0x7f5361bdc040) at common/admin_socket.cc:280
#12 0x00007f5358329096 in Thread::entry_wrapper (this=0x7f5361bdc040) at common/Thread.cc:84
#13 0x00007f5358328fe4 in Thread::_entry_func (arg=0x7f5361bdc040) at common/Thread.cc:66
#14 0x00007f5355f79182 in start_thread (arg=0x7f5351ff6700) at pthread_create.c:312
#15 0x00007f53540a747d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
This shows that a racing unregister for 0x7f52fc1aafe0 which is waiting correctly behind a lock. However, at this point the Message print() method can't be called because presumably the destructor has been called.
#0 lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f5355f7b657 in L_lock_909 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007f5355f7b480 in __GI_pthread_mutex_lock (mutex=0x7f5361c6f4a0) at ../nptl/pthread_mutex_lock.c:79
#3 0x00007f53582f3bb8 in Mutex::Lock (this=0x7f5361c6f490, no_lockdep=false) at common/Mutex.cc:110
#4 0x00007f5358212335 in Locker (m=..., this=<synthetic pointer>) at ./common/Mutex.h:115
#5 OpTracker::unregister_inflight_op (this=0x7f5361c68638, i=i@entry=0x7f52fc1aafe0) at common/TrackedOp.cc:151
#6 0x00007f53582127c5 in OpTracker::RemoveOnDelete::operator() (this=0x7f52fc16e978, op=0x7f52fc1aafe0) at common/TrackedOp.cc:297
#7 0x00007f5357c20ce9 in std::_Sp_counted_base<(_gnu_cxx::_Lock_policy)2>::_M_release (this=0x7f52fc16e960) at /usr/include/c++/4.8/bits/shared_ptr_base.h:144
#8 0x00007f5357c91946 in ~_shared_count (this=0x7f52b8ff7968, __in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr_base.h:546
#9 ~_shared_ptr (this=0x7f52b8ff7960, _in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr_base.h:781
#10 ~shared_ptr (this=0x7f52b8ff7960, __in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/shared_ptr.h:93
#11 internal_visit<std::shared_ptr<OpRequest> > (this=<optimized out>, operand=std::shared_ptr (expired, weak 1) 0x7f52fc1aafe0) at /usr/include/boost/variant/variant.hpp:315
#12 visitation_impl_invoke_impl<boost::detail::variant::destroyer, void*, std::shared_ptr<OpRequest> > (visitor=..., storage=0x7f52b8ff7960) at /usr/include/boost/variant/detail/visitation_impl.hpp:130
#13 visitation_impl_invoke<boost::detail::variant::destroyer, void*, std::shared_ptr<OpRequest>, boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub>::has_fallback_type> (internal_which=<optimized out>, visitor=..., t=0x0, storage=0x7f52b8ff7960) at /usr/include/boost/variant/detail/visitation_impl.hpp:173
#14 visitation_impl<mpl_::int_<0>, boost::detail::variant::visitation_impl_step<boost::mpl::l_iter<boost::mpl::l_item<mpl_::long_<3l>, std::shared_ptr<OpRequest>, boost::mpl::l_item<mpl_::long_<2l>, PGSnapTrim, boost::mpl::l_item<mpl_::long_<1l>, PGScrub, boost::mpl::l_end> > > >, boost::mpl::l_iter<boost::mpl::l_end> >, boost::detail::variant::destroyer, void*, boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub>::has_fallback_type_> (no_backup_flag=..., storage=0x7f52b8ff7960, visitor=<synthetic pointer>, logical_which=<optimized out>, internal_which=<optimized out>) at /usr/include/boost/variant/detail/visitation_impl.hpp:256
#15 internal_apply_visitor_impl<boost::detail::variant::destroyer, void*> (storage=0x7f52b8ff7960, visitor=<synthetic pointer>, logical_which=<optimized out>, internal_which=<optimized out>) at /usr/include/boost/variant/variant.hpp:2326
#16 internal_apply_visitor<boost::detail::variant::destroyer> (visitor=<synthetic pointer>, this=0x7f52b8ff7958) at /usr/include/boost/variant/variant.hpp:2337
#17 destroy_content (this=0x7f52b8ff7958) at /usr/include/boost/variant/variant.hpp:1373
#18 ~variant (this=0x7f52b8ff7958, __in_chrg=<optimized out>) at /usr/include/boost/variant/variant.hpp:1380
#19 ~PGQueueable (this=0x7f52b8ff7958, __in_chrg=<optimized out>) at osd/OSD.h:347
#20 std::pair<boost::intrusive_ptr<PG>, PGQueueable>::~pair (this=0x7f52b8ff7950, __in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/stl_pair.h:96
#21 0x00007f5357c57288 in OSD::ShardedOpWQ::_process (this=0x7f5361c687b0, thread_index=<optimized out>, hb=<optimized out>) at osd/OSD.cc:8529
#22 0x00007f53583380dd in ShardedThreadPool::shardedthreadpool_worker (this=0x7f5361c67d38, thread_index=7) at common/WorkQueue.cc:340
#23 0x00007f5358339a9f in ShardedThreadPool::WorkThreadSharded::entry (this=0x7f5361c962b0) at common/WorkQueue.h:679
#24 0x00007f5358329096 in Thread::entry_wrapper (this=0x7f5361c962b0) at common/Thread.cc:84
#25 0x00007f5358328fe4 in Thread::_entry_func (arg=0x7f5361c962b0) at common/Thread.cc:66
#26 0x00007f5355f79182 in start_thread (arg=0x7f52b8ff9700) at pthread_create.c:312
#27 0x00007f53540a747d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Updated by David Zafman about 8 years ago
Found it! It isn't a destructor race, but rather a register_inflight_op() being called out of the OpRequest constructor. After the register_inflight_op() completes, but before the constructor OpRequest::OpRequest() finishes, a dump won't be able to call the virtual function OpRequest:: _dump_op_descriptor_unlocked().
This is a breakpoint on OpTracker::register_inflight_op() just to show the how things are getting constructed.
#0 OpTracker::register_inflight_op (this=0x7f7bd8831d98, i=i@entry=0x7f7b7402cac8) at common/TrackedOp.cc:125
#1 0x00007f7bce3bcb3f in TrackedOp (initiated=..., _tracker=0x7f7bd8831d98, this=0x7f7b7402cac0) at ./common/TrackedOp.h:173
#2 OpRequest::OpRequest (this=0x7f7b7402cac0, req=0x7f7b74354090, tracker=0x7f7bd8831d98) at osd/OpRequest.cc:31
#3 0x00007f7bce34744f in create_request<OpRequest, Message*> (params=0x7f7b74354090, this=0x7f7bd8831d98) at ./common/TrackedOp.h:141
#4 OSD::ms_fast_dispatch (this=0x7f7bd8830d30, m=0x7f7b74354090) at osd/OSD.cc:5776
#5 0x00007f7bcea3d226 in ms_fast_dispatch (m=0x7f7b74354090, this=0x7f7bd87d66f0) at ./msg/Messenger.h:555
#6 DispatchQueue::fast_dispatch (this=0x7f7bd87d68b8, m=0x7f7b74354090) at msg/simple/DispatchQueue.cc:71
#7 0x00007f7bcea6cf59 in Pipe::reader (this=0x7f7bbc00a140) at msg/simple/Pipe.cc:1650
#8 0x00007f7bcea7460d in Pipe::Reader::entry (this=<optimized out>) at msg/simple/Pipe.h:52
#9 0x00007f7bccec9182 in start_thread (arg=0x7f7b28aed700) at pthread_create.c:312
#10 0x00007f7bcaff747d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Updated by David Zafman about 8 years ago
- Status changed from In Progress to Fix Under Review
Updated by Sage Weil about 8 years ago
- Related to Bug #14598: osd: pure virtual method called, abrt in TrackedOp::dump added
Updated by Sage Weil about 8 years ago
- Status changed from Fix Under Review to Resolved
Updated by David Zafman almost 8 years ago
- Status changed from Resolved to Pending Backport
- Backport set to hammer
Updated by David Zafman almost 8 years ago
- Copied to Backport #16225: hammer: SIGABRT in TrackedOp::dump() via dump_ops_in_flight() added
Updated by Nathan Cutler over 7 years ago
- Status changed from Pending Backport to Resolved