Bug #13636

closed

rbd: pure virtual method called

Added by Josh Durgin over 8 years ago. Updated about 8 years ago.

Status: Resolved
Priority: Urgent
Assignee: Jason Dillaman
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport: infernalis,hammer,firefly
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

From http://qa-proxy.ceph.com/teuthology/teuthology-2015-10-23_23:00:08-rbd-master---basic-multi/1122358/teuthology.log :

2015-10-26T01:48:55.719 INFO:tasks.workunit.client.0.mira071.stderr:+ rbd rm image.32
2015-10-26T01:48:55.778 INFO:tasks.workunit.client.0.mira071.stderr:pure virtual method called
2015-10-26T01:48:55.778 INFO:tasks.workunit.client.0.mira071.stderr:terminate called without an active exception
2015-10-26T01:48:55.778 INFO:tasks.workunit.client.0.mira071.stderr:*** Caught signal (Aborted) **
2015-10-26T01:48:55.778 INFO:tasks.workunit.client.0.mira071.stderr: in thread 7f0c9a7fc700
2015-10-26T01:48:55.810 INFO:tasks.workunit.client.0.mira071.stderr: ceph version 9.1.0-313-g252ffe0 (252ffe0155b2f3b91aecc9eab5c785ad6f652dbe)
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 1: (()+0x1ab86a) [0x7f0cad50186a]
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 2: (()+0x10340) [0x7f0ca7b76340]
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 3: (gsignal()+0x39) [0x7f0ca6634cc9]
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 4: (abort()+0x148) [0x7f0ca66380d8]
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f0ca6f3f535]
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 6: (()+0x5e6d6) [0x7f0ca6f3d6d6]
2015-10-26T01:48:55.811 INFO:tasks.workunit.client.0.mira071.stderr: 7: (()+0x5e703) [0x7f0ca6f3d703]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr: 8: (()+0x5f1bf) [0x7f0ca6f3e1bf]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr: 9: (()+0x79fe4) [0x7f0caa7acfe4]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr: 10: (()+0x177e84) [0x7f0caa8aae84]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr: 11: (()+0x179220) [0x7f0caa8ac220]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr: 12: (()+0x8182) [0x7f0ca7b6e182]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr: 13: (clone()+0x6d) [0x7f0ca66f847d]
2015-10-26T01:48:55.812 INFO:tasks.workunit.client.0.mira071.stderr:2015-10-26 01:48:55.790939 7f0c9a7fc700 -1 *** Caught signal (Aborted) **
2015-10-26T01:48:55.813 INFO:tasks.workunit.client.0.mira071.stderr: in thread 7f0c9a7fc700

Related issues 3 (0 open, 3 closed)

Copied to rbd - Backport #13757: rbd: pure virtual method called (Rejected)
Copied to rbd - Backport #13758: rbd: pure virtual method called (Resolved, Abhishek Lekshmanan)
Copied to rbd - Backport #13759: rbd: pure virtual method called (Resolved, Abhishek Varshney)
Actions #1

Updated by Jason Dillaman over 8 years ago

  • Status changed from New to In Progress
  • Assignee set to Jason Dillaman
Actions #2

Updated by Jason Dillaman over 8 years ago

#0  0x00007f0ca7b7620b in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1  0x00007f0cad50192d in reraise_fatal (signum=6) at global/signal_handler.cc:59
#2  handle_fatal_signal (signum=6) at global/signal_handler.cc:109
#3  <signal handler called>
#4  0x00007f0ca6634cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#5  0x00007f0ca66380d8 in __GI_abort () at abort.c:89
#6  0x00007f0ca6f3f535 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007f0ca6f3d6d6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007f0ca6f3d703 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007f0ca6f3e1bf in __cxa_pure_virtual () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00007f0caa7acfe4 in ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_void_dequeue (
    this=0x7f0cb09d6360) at ./common/WorkQueue.h:197
#11 0x00007f0caa8aae84 in ThreadPool::worker (this=0x7f0cb09cfdf0, wt=0x7f0cb09c1970) at common/WorkQueue.cc:120
#12 0x00007f0caa8ac220 in ThreadPool::WorkThread::entry (this=<optimized out>) at common/WorkQueue.h:362
#13 0x00007f0ca7b6e182 in start_thread (arg=0x7f0c9a7fc700) at pthread_create.c:312
#14 0x00007f0ca66f847d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Actions #3

Updated by Jason Dillaman over 8 years ago

ThreadPool::WorkQueueVal @ 0x7f0cb09d6360 is the ImageCtx::op_work_queue from the active image. Thread 4 shows that the image is still open:

Thread 4 (Thread 0x7f0cad7597c0 (LWP 7008)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f0ca8251559 in Wait (mutex=..., this=0x7ffec3688ec0) at ./common/Cond.h:55
#2  librados::IoCtxImpl::operate_read (this=0x7f0cb09cf930, oid=..., o=<optimized out>, pbl=pbl@entry=0x7ffec36890c0, flags=flags@entry=0)
    at librados/IoCtxImpl.cc:560
#3  0x00007f0ca82187b6 in librados::IoCtx::operate (this=this@entry=0x7f0cb09cf1d0, oid=..., o=o@entry=0x7ffec3689030, 
    pbl=pbl@entry=0x7ffec36890c0) at librados/librados.cc:1355
#4  0x00007f0caab224a3 in librbd::cls_client::get_stripe_unit_count (ioctx=ioctx@entry=0x7f0cb09cf1d0, oid=..., 
    stripe_unit=stripe_unit@entry=0x7f0cb09cf5b0, stripe_count=stripe_count@entry=0x7f0cb09cf5b8) at cls/rbd/cls_rbd_client.cc:574
#5  0x00007f0caa7abbaf in librbd::ImageCtx::init (this=this@entry=0x7f0cb09cf050) at librbd/ImageCtx.cc:174
#6  0x00007f0caa7d7bfa in librbd::open_image (ictx=ictx@entry=0x7f0cb09cf050) at librbd/internal.cc:2679
#7  0x00007f0caa7d98a7 in librbd::remove (io_ctx=..., imgname=imgname@entry=0x7f0cb09c0cc0 "image.32", prog_ctx=...)
    at librbd/internal.cc:1813
#8  0x00007f0caa7805f0 in librbd::RBD::remove_with_progress (this=<optimized out>, io_ctx=..., name=0x7f0cb09c0cc0 "image.32", pctx=...)
    at librbd/librbd.cc:346
#9  0x00007f0cad4215c0 in do_delete (rbd=..., io_ctx=..., imgname=<optimized out>) at rbd.cc:691
#10 0x00007f0cad42f7ea in main (argc=<optimized out>, argv=<optimized out>) at rbd.cc:3796

Dumping the vtable from the aborted thread indicates that the vtable is accurate for the "_empty" method at the time of core generation:

p /a (*(void ***)this)[0]@10
$11 = {0x7f0caa7ade40 <ContextWQ::~ContextWQ()>, 0x7f0caa7ad910 <ContextWQ::~ContextWQ()>, 
  0x7f0caa7acb10 <ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_clear()>, 
  0x7f0caa7acaa0 <ContextWQ::_empty()>, 
  0x7f0caa7acfc0 <ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_void_dequeue()>, 
  0x7f0caa7acde0 <ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_void_process(void*, ThreadPool::TPHandle&)>, 0x7f0caa7acd50 <ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_void_process_finish(void*)>, 
  0x7f0caa7aced0 <ContextWQ::_enqueue(std::pair<Context*, int>)>, 0x7f0caa7acf20 <ContextWQ::_enqueue_front(std::pair<Context*, int>)>, 
  0x7f0caa7acb90 <ContextWQ::_dequeue()>}

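Therefore, the vtable must not yet have been fully initialized when the queue was registered with the ThreadPool. It turns out that the parent class of ContextWQ improperly adds itself to the ThreadPool within its constructor, resulting in a race: a worker thread can call into the queue while only the base class has been constructed, at which point virtual methods that ContextWQ overrides (such as _empty) still resolve to the base class's pure virtual entries.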
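To illustrate why registering from a base-class constructor is hazardous, here is a minimal single-threaded sketch. It is not the Ceph code: Pool, Queue, EagerQueue, EagerContextQueue and LazyContextQueue are hypothetical names, and the base method is deliberately non-pure so the demo stays well-defined instead of aborting. It only shows which override an observer sees when `this` is published during construction versus after it.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Queue;

// Hypothetical stand-in for ThreadPool: it records what a virtual call on
// the queue returns at the moment the queue is registered.
struct Pool {
    std::vector<std::string> seen;
    void add(const Queue* q);
};

struct Queue {
    virtual ~Queue() = default;
    // Non-pure on purpose. In the reported crash the corresponding slot was
    // pure virtual, so an early call ended in __cxa_pure_virtual.
    virtual std::string name() const { return "base"; }
};

void Pool::add(const Queue* q) { seen.push_back(q->name()); }

// Hazardous pattern: the base class publishes `this` from its constructor,
// before any derived class has been constructed.
struct EagerQueue : Queue {
    explicit EagerQueue(Pool& p) { p.add(this); }
};

struct EagerContextQueue : EagerQueue {
    explicit EagerContextQueue(Pool& p) : EagerQueue(p) {}
    std::string name() const override { return "derived"; }
};

// Safer pattern: finish construction first, then register explicitly.
struct LazyContextQueue : Queue {
    std::string name() const override { return "derived"; }
    void register_with(Pool& p) { p.add(this); }
};

std::string seen_when_registered_in_ctor() {
    Pool p;
    EagerContextQueue q(p);  // registration happens inside EagerQueue's ctor
    return p.seen.at(0);
}

std::string seen_when_registered_after_ctor() {
    Pool p;
    LazyContextQueue q;      // fully constructed here
    q.register_with(p);
    return p.seen.at(0);
}

// Small self-check that doubles as a usage example.
void demo() {
    assert(seen_when_registered_in_ctor() == "base");
    assert(seen_when_registered_after_ctor() == "derived");
}
```

In the real bug the observer is a worker thread rather than a synchronous call, so whether it sees the base or the derived entries depends on timing, which would explain why the vtable looked correct by the time the core was written.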

Actions #4

Updated by Jason Dillaman over 8 years ago

  • Backport set to infernalis,hammer,firefly
Actions #5

Updated by Nathan Cutler over 8 years ago

Hi - if you target infernalis with the fix, we would only have to backport to hammer and firefly. AFAIK master is still being regularly synched with infernalis. Just an idea.

Actions #6

Updated by Jason Dillaman over 8 years ago

infernalis is effectively closed pending its imminent release. After its release, a jewel branch will be created from master and all bug fixes should be targeted to it.

Actions #7

Updated by Jason Dillaman over 8 years ago

  • Status changed from In Progress to Fix Under Review
Actions #8

Updated by Loïc Dachary over 8 years ago

Actions #9

Updated by Loïc Dachary over 8 years ago

Actions #10

Updated by Loïc Dachary over 8 years ago

Actions #11

Updated by Jason Dillaman over 8 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #12

Updated by Loïc Dachary about 8 years ago

  • Status changed from Pending Backport to Resolved