Bug #13636
closed
rbd: pure virtual method called
Description
2015-10-26T01:48:55.719 INFO:tasks.workunit.client.0.mira071.stderr:+ rbd rm image.32
2015-10-26T01:48:55.778 INFO:tasks.workunit.client.0.mira071.stderr:pure virtual method called
2015-10-26T01:48:55.778 INFO:tasks.workunit.client.0.mira071.stderr:terminate called without an active exception
2015-10-26T01:48:55.778 INFO:tasks.workunit.client.0.mira071.stderr:*** Caught signal (Aborted) **
2015-10-26T01:48:55.778 INFO:tasks.workunit.client.0.mira071.stderr: in thread 7f0c9a7fc700
2015-10-26T01:48:55.810 INFO:tasks.workunit.client.0.mira071.stderr: ceph version 9.1.0-313-g252ffe0 (252ffe0155b2f3b91aecc9eab5c785ad6f652dbe)
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 1: (()+0x1ab86a) [0x7f0cad50186a]
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 2: (()+0x10340) [0x7f0ca7b76340]
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 3: (gsignal()+0x39) [0x7f0ca6634cc9]
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 4: (abort()+0x148) [0x7f0ca66380d8]
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f0ca6f3f535]
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 6: (()+0x5e6d6) [0x7f0ca6f3d6d6]
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 7: (()+0x5e703) [0x7f0ca6f3d703]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr: 8: (()+0x5f1bf) [0x7f0ca6f3e1bf]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr: 9: (()+0x79fe4) [0x7f0caa7acfe4]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr: 10: (()+0x177e84) [0x7f0caa8aae84]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr: 11: (()+0x179220) [0x7f0caa8ac220]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr: 12: (()+0x8182) [0x7f0ca7b6e182]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr: 13: (clone()+0x6d) [0x7f0ca66f847d]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr:2015-10-26 01:48:55.790939 7f0c9a7fc700 -1 *** Caught signal (Aborted) **
2015-10-26T01:48:55.813 INFO:tasks.workunit.client.0.mira071.stderr: in thread 7f0c9a7fc700
Updated by Jason Dillaman over 8 years ago
- Status changed from New to In Progress
- Assignee set to Jason Dillaman
Updated by Jason Dillaman over 8 years ago
#0 0x00007f0ca7b7620b in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1 0x00007f0cad50192d in reraise_fatal (signum=6) at global/signal_handler.cc:59
#2 handle_fatal_signal (signum=6) at global/signal_handler.cc:109
#3 <signal handler called>
#4 0x00007f0ca6634cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#5 0x00007f0ca66380d8 in __GI_abort () at abort.c:89
#6 0x00007f0ca6f3f535 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00007f0ca6f3d6d6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x00007f0ca6f3d703 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00007f0ca6f3e1bf in __cxa_pure_virtual () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00007f0caa7acfe4 in ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_void_dequeue (this=0x7f0cb09d6360) at ./common/WorkQueue.h:197
#11 0x00007f0caa8aae84 in ThreadPool::worker (this=0x7f0cb09cfdf0, wt=0x7f0cb09c1970) at common/WorkQueue.cc:120
#12 0x00007f0caa8ac220 in ThreadPool::WorkThread::entry (this=<optimized out>) at common/WorkQueue.h:362
#13 0x00007f0ca7b6e182 in start_thread (arg=0x7f0c9a7fc700) at pthread_create.c:312
#14 0x00007f0ca66f847d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Updated by Jason Dillaman over 8 years ago
ThreadPool::WorkQueueVal @ 0x7f0cb09d6360 is the ImageCtx::op_work_queue from the active image. Thread 4 shows that the image is still open:
Thread 4 (Thread 0x7f0cad7597c0 (LWP 7008)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007f0ca8251559 in Wait (mutex=..., this=0x7ffec3688ec0) at ./common/Cond.h:55
#2 librados::IoCtxImpl::operate_read (this=0x7f0cb09cf930, oid=..., o=<optimized out>, pbl=pbl@entry=0x7ffec36890c0, flags=flags@entry=0) at librados/IoCtxImpl.cc:560
#3 0x00007f0ca82187b6 in librados::IoCtx::operate (this=this@entry=0x7f0cb09cf1d0, oid=..., o=o@entry=0x7ffec3689030, pbl=pbl@entry=0x7ffec36890c0) at librados/librados.cc:1355
#4 0x00007f0caab224a3 in librbd::cls_client::get_stripe_unit_count (ioctx=ioctx@entry=0x7f0cb09cf1d0, oid=..., stripe_unit=stripe_unit@entry=0x7f0cb09cf5b0, stripe_count=stripe_count@entry=0x7f0cb09cf5b8) at cls/rbd/cls_rbd_client.cc:574
#5 0x00007f0caa7abbaf in librbd::ImageCtx::init (this=this@entry=0x7f0cb09cf050) at librbd/ImageCtx.cc:174
#6 0x00007f0caa7d7bfa in librbd::open_image (ictx=ictx@entry=0x7f0cb09cf050) at librbd/internal.cc:2679
#7 0x00007f0caa7d98a7 in librbd::remove (io_ctx=..., imgname=imgname@entry=0x7f0cb09c0cc0 "image.32", prog_ctx=...) at librbd/internal.cc:1813
#8 0x00007f0caa7805f0 in librbd::RBD::remove_with_progress (this=<optimized out>, io_ctx=..., name=0x7f0cb09c0cc0 "image.32", pctx=...) at librbd/librbd.cc:346
#9 0x00007f0cad4215c0 in do_delete (rbd=..., io_ctx=..., imgname=<optimized out>) at rbd.cc:691
#10 0x00007f0cad42f7ea in main (argc=<optimized out>, argv=<optimized out>) at rbd.cc:3796
Dumping the vtable from the aborted thread indicates that the vtable is accurate for the "_empty" method at the time of core generation:
p /a (*(void ***)this)[0]@10
$11 = {0x7f0caa7ade40 <ContextWQ::~ContextWQ()>,
  0x7f0caa7ad910 <ContextWQ::~ContextWQ()>,
  0x7f0caa7acb10 <ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_clear()>,
  0x7f0caa7acaa0 <ContextWQ::_empty()>,
  0x7f0caa7acfc0 <ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_void_dequeue()>,
  0x7f0caa7acde0 <ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_void_process(void*, ThreadPool::TPHandle&)>,
  0x7f0caa7acd50 <ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_void_process_finish(void*)>,
  0x7f0caa7aced0 <ContextWQ::_enqueue(std::pair<Context*, int>)>,
  0x7f0caa7acf20 <ContextWQ::_enqueue_front(std::pair<Context*, int>)>,
  0x7f0caa7acb90 <ContextWQ::_dequeue()>}
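Therefore, the vtable must not have been fully initialized when the queue was registered with the ThreadPool. It turns out that the parent class of ContextWQ improperly adds itself to the ThreadPool within its constructor. A worker thread can then invoke a virtual method on the queue before the ContextWQ constructor has run, while the object still has the parent class's vtable with its pure virtual slots -- resulting in a race.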
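To illustrate the diagnosis, here is a minimal single-threaded sketch. It is not the Ceph code and not the fix from the PR; `Pool`, `WorkQueueBase`, `ContextQueue` and `make_registered_queue` are hypothetical names invented for this example. It records that the dynamic type inside the base constructor is still the base class (so a virtual call from another thread at that point would land on a pure virtual slot), and it shows one safe alternative: publish the queue to the pool only after the derived object is fully constructed.

```cpp
#include <cassert>
#include <memory>
#include <typeinfo>
#include <vector>

struct WorkQueueBase;

// Hypothetical stand-in for a thread pool: it only records registered queues.
struct Pool {
  std::vector<WorkQueueBase*> queues;
  void add_work_queue(WorkQueueBase* q) { queues.push_back(q); }
  bool all_empty() const;  // polls every queue through a virtual call
};

// Hypothetical abstract queue, playing the role of the parent class.
struct WorkQueueBase {
  bool base_type_during_ctor = false;

  WorkQueueBase() {
    // While this constructor runs, the dynamic type is still WorkQueueBase,
    // so virtual dispatch would resolve to the pure virtual _empty().
    base_type_during_ctor = (typeid(*this) == typeid(WorkQueueBase));
    // Unsafe pattern (deliberately NOT done here): pool.add_work_queue(this);
    // a worker thread could then call _empty() before the derived part exists.
  }
  virtual ~WorkQueueBase() = default;
  virtual bool _empty() = 0;
};

bool Pool::all_empty() const {
  for (WorkQueueBase* q : queues) {
    assert(q != nullptr);
    if (!q->_empty()) return false;
  }
  return true;
}

// Hypothetical concrete queue, playing the role of the derived class.
struct ContextQueue : WorkQueueBase {
  std::vector<int> items;
  bool _empty() override { return items.empty(); }
};

// Safe pattern: construct the whole object first, then register it.
std::unique_ptr<ContextQueue> make_registered_queue(Pool& pool) {
  std::unique_ptr<ContextQueue> q(new ContextQueue());
  pool.add_work_queue(q.get());  // vtable now refers to ContextQueue
  return q;
}
```

With this arrangement, `pool.all_empty()` can only ever reach `ContextQueue::_empty()`, because registration happens after the constructor chain has finished. A real thread pool would additionally need locking around its queue list, which this sketch omits.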
Updated by Jason Dillaman over 8 years ago
- Backport set to infernalis,hammer,firefly
Updated by Nathan Cutler over 8 years ago
Hi - if you target infernalis with the fix, we would only have to backport to hammer and firefly. AFAIK master is still being regularly synched with infernalis. Just an idea.
Updated by Jason Dillaman over 8 years ago
infernalis is effectively closed pending its imminent release. After its release, a jewel branch will be created from master and all bug fixes should be targeted to it.
Updated by Jason Dillaman over 8 years ago
- Status changed from In Progress to Fix Under Review
jewel PR: https://github.com/ceph/ceph/pull/6525
Updated by Loïc Dachary over 8 years ago
- Copied to Backport #13757: rbd: pure virtual method called added
Updated by Loïc Dachary over 8 years ago
- Copied to Backport #13758: rbd: pure virtual method called added
Updated by Loïc Dachary over 8 years ago
- Copied to Backport #13759: rbd: pure virtual method called added
Updated by Jason Dillaman over 8 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Loïc Dachary about 8 years ago
- Status changed from Pending Backport to Resolved