Bug #16623

closed

segfault in unittest_rbd_mirror

Added by Brad Hubbard almost 8 years ago. Updated almost 8 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Jason Dillaman
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Core was generated by `/home/brad/working/src/ceph/build/bin/unittest_rbd_mirror'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 testing::internal::ActionResultHolder<void>::GetValueAndDelete (this=0x0) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-spec-builders.h:1373
1373 void GetValueAndDelete() const { delete this; }

(gdb) info threads
Id Target Id Frame
  * 1 Thread 0x7f73cdada740 (LWP 22435) testing::internal::ActionResultHolder<void>::GetValueAndDelete (this=0x0) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-spec-builders.h:1373
    2 Thread 0x7f73be5a2700 (LWP 22436) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    3 Thread 0x7f73b4d8f700 (LWP 22471) testing::internal::ActionResultHolder<void>::GetValueAndDelete (this=0x0) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-spec-builders.h:1373
    4 Thread 0x7f73ba59a700 (LWP 22444) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    5 Thread 0x7f73bdda1700 (LWP 22437) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    6 Thread 0x7f73b7594700 (LWP 22470) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    7 Thread 0x7f73ae385700 (LWP 22460) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    8 Thread 0x7f73bd5a0700 (LWP 22438) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    9 Thread 0x7f73b9598700 (LWP 22466) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    10 Thread 0x7f73bcd9f700 (LWP 22439) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    11 Thread 0x7f73b8d97700 (LWP 22467) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    12 Thread 0x7f73b8596700 (LWP 22468) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    13 Thread 0x7f73bc59e700 (LWP 22440) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    14 Thread 0x7f73b6592700 (LWP 22464) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    15 Thread 0x7f73b5590700 (LWP 22462) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    16 Thread 0x7f73bbd9d700 (LWP 22441) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    17 Thread 0x7f73bb59c700 (LWP 22442) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    18 Thread 0x7f73bad9b700 (LWP 22443) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    19 Thread 0x7f73b7d95700 (LWP 22469) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    20 Thread 0x7f73affff700 (LWP 22472) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    21 Thread 0x7f73b6d93700 (LWP 22465) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    22 Thread 0x7f73adb84700 (LWP 22461) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    23 Thread 0x7f73b5d91700 (LWP 22463) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    24 Thread 0x7f73b9d99700 (LWP 22445) pthread_cond_wait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    25 Thread 0x7f73aeb86700 (LWP 22459) pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225

(gdb) bt
#0 testing::internal::ActionResultHolder<void>::GetValueAndDelete (this=0x0) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-spec-builders.h:1373
#1 testing::internal::FunctionMockerBase<void ()>::InvokeWith(std::tuple<> const&) (args=empty std::tuple, this=0x7ffdad1cc328) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-spec-builders.h:1530
#2 testing::internal::FunctionMocker<void ()>::Invoke() (this=0x7ffdad1cc328) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-generated-function-mockers.h:76
#3 rbd::mirror::image_sync::SyncPointCreateRequest<librbd::(anonymous namespace)::MockTestImageCtx>::send (this=0x7ffdad1cc320) at /home/brad/working/src/ceph/src/test/rbd_mirror/test_mock_ImageSync.cc:137
#4 rbd::mirror::ImageSync<librbd::(anonymous namespace)::MockTestImageCtx>::send_create_sync_point (this=this@entry=0x2c16f10) at /home/brad/working/src/ceph/src/tools/rbd_mirror/ImageSync.cc:124
#5 0x0000000000623c90 in rbd::mirror::ImageSync<librbd::(anonymous namespace)::MockTestImageCtx>::send_prune_catch_up_sync_point (this=0x2c16f10) at /home/brad/working/src/ceph/src/tools/rbd_mirror/ImageSync.cc:77
#6 0x00000000006289d7 in rbd::mirror::ImageSync<librbd::(anonymous namespace)::MockTestImageCtx>::send (this=<optimized out>) at /home/brad/working/src/ceph/src/tools/rbd_mirror/ImageSync.cc:52
#7 rbd::mirror::TestMockImageSync_SimpleSync_Test::TestBody (this=0x2be6a00) at /home/brad/working/src/ceph/src/test/rbd_mirror/test_mock_ImageSync.cc:303
#8 0x000000000073e454 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (location=0x99991e "the test body", method=<optimized out>, object=<optimized out>)
at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2078
#9 testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=object@entry=0x2be6a00, method=<optimized out>, location=location@entry=0x99991e "the test body")
at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2114
#10 0x0000000000736a3a in testing::Test::Run (this=0x2be6a00) at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2151
#11 0x0000000000736b88 in testing::TestInfo::Run (this=0x2baff80) at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2326
#12 0x0000000000736c65 in testing::TestCase::Run (this=0x2bb0110) at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2444
#13 0x0000000000736f37 in testing::internal::UnitTestImpl::RunAllTests (this=0x2bafb90) at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:4315
#14 0x000000000073e904 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (location=0x998f50 "auxiliary test code (environments or event listeners)", method=<optimized out>,
object=<optimized out>) at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2078
#15 testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x2bafb90, method=<optimized out>, location=location@entry=0x998f50 "auxiliary test code (environments or event listeners)")
at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2114
#16 0x0000000000737254 in testing::UnitTest::Run (this=0xccbdc0 <testing::UnitTest::GetInstance()::instance>) at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:3926
#17 0x00000000005dab09 in RUN_ALL_TESTS () at /home/brad/working/src/ceph/src/gmock/gtest/include/gtest/gtest.h:2288
#18 main (argc=<optimized out>, argv=<optimized out>) at /home/brad/working/src/ceph/src/test/rbd_mirror/test_main.cc:41

(gdb) t 3
[Switching to thread 3 (Thread 0x7f73b4d8f700 (LWP 22471))]
#0 testing::internal::ActionResultHolder<void>::GetValueAndDelete (this=0x0) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-spec-builders.h:1373
1373 void GetValueAndDelete() const { delete this; }
(gdb) bt
#0 testing::internal::ActionResultHolder<void>::GetValueAndDelete (this=0x0) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-spec-builders.h:1373
#1 testing::internal::FunctionMockerBase<void ()>::InvokeWith(std::tuple<> const&) (args=empty std::tuple, this=<optimized out>) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-spec-builders.h:1530
#2 testing::internal::FunctionMocker<void ()>::Invoke() (this=<optimized out>) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-generated-function-mockers.h:76
#3 0x00000000006231a2 in rbd::mirror::image_sync::SnapshotCopyRequest<librbd::(anonymous namespace)::MockTestImageCtx>::send (this=0x7ffdad1cc4b0) at /home/brad/working/src/ceph/src/test/rbd_mirror/test_mock_ImageSync.cc:114
#4 rbd::mirror::ImageSync<librbd::(anonymous namespace)::MockTestImageCtx>::send_copy_snapshots (this=this@entry=0x2c16f10) at /home/brad/working/src/ceph/src/tools/rbd_mirror/ImageSync.cc:162
#5 0x00000000006232be in rbd::mirror::ImageSync<librbd::(anonymous namespace)::MockTestImageCtx>::handle_create_sync_point (this=0x2c16f10, r=0) at /home/brad/working/src/ceph/src/tools/rbd_mirror/ImageSync.cc:138
#6 0x0000000000611959 in Context::complete (this=0x2c16930, r=<optimized out>) at /home/brad/working/src/ceph/src/include/Context.h:64
#7 0x00000000009657f5 in ThreadPool::worker (this=0x2bcdf50, wt=<optimized out>) at /home/brad/working/src/ceph/src/common/WorkQueue.cc:132
#8 0x00000000009669f0 in ThreadPool::WorkThread::entry (this=<optimized out>) at /home/brad/working/src/ceph/src/common/WorkQueue.h:445
#9 0x00007f73c4ac25ca in start_thread (arg=0x7f73b4d8f700) at pthread_create.c:333
#10 0x00007f73c0eddead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Actions #1

Updated by Kefu Chai almost 8 years ago

  • Assignee set to Jason Dillaman

jason, mind taking a look?

Actions #2

Updated by Jason Dillaman almost 8 years ago

  • Project changed from Ceph to rbd
  • Category deleted (librbd)
Actions #3

Updated by Jason Dillaman almost 8 years ago

  • Status changed from New to Need More Info

@Brad: are you able to repeat this issue?

Actions #4

Updated by Brad Hubbard almost 8 years ago

@Jason: yes, I can.

Actions #5

Updated by Brad Hubbard almost 8 years ago

Hi @Jason,

Following our discussion on IRC I did the following.

  1. git clone --recursive https://github.com/ceph/ceph.git
  2. cd ceph
  3. ./do_cmake.sh
  4. cd build/
  5. make -j6
  6. make -j6 check

$ CEPH_LIB=lib ./bin/unittest_rbd_mirror
[==========] Running 120 tests from 15 test cases.
[----------] Global test environment set-up.
[----------] 1 test from TestMockImageReplayer
[ RUN ] TestMockImageReplayer.Blah
seed 13200
[ OK ] TestMockImageReplayer.Blah (8 ms)
[----------] 1 test from TestMockImageReplayer (8 ms total)

[----------] 5 tests from TestMockImageSync
[ RUN ] TestMockImageSync.SimpleSync
Segmentation fault (core dumped)

Actions #6

Updated by Brad Hubbard almost 8 years ago

I spent a lot of time looking at this today and couldn't pin it down, but I do have some findings.

The following two threads are present at the time of each segfault.

(gdb) t a 1 bt

Thread 1 (Thread 0x7ffff7fb8740 (LWP 32089)):
#0  testing::internal::ActionResultHolder<void>::GetValueAndDelete (this=0x0) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-spec-builders.h:1373
#1  testing::internal::FunctionMockerBase<void ()>::InvokeWith(std::tuple<> const&) (args=empty std::tuple, this=0x7fffffffa8c8) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-spec-builders.h:1530
#2  testing::internal::FunctionMocker<void ()>::Invoke() (this=0x7fffffffa8c8) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-generated-function-mockers.h:76
#3  rbd::mirror::image_sync::SyncPointCreateRequest<librbd::(anonymous namespace)::MockTestImageCtx>::send (this=0x7fffffffa8c0) at /home/brad/working/src/ceph/src/test/rbd_mirror/test_mock_ImageSync.cc:137
#4  rbd::mirror::ImageSync<librbd::(anonymous namespace)::MockTestImageCtx>::send_create_sync_point (this=this@entry=0xd4c930) at /home/brad/working/src/ceph/src/tools/rbd_mirror/ImageSync.cc:128
#5  0x0000000000623c40 in rbd::mirror::ImageSync<librbd::(anonymous namespace)::MockTestImageCtx>::send_prune_catch_up_sync_point (this=this@entry=0xd4c930) at /home/brad/working/src/ceph/src/tools/rbd_mirror/ImageSync.cc:79
#6  0x0000000000623cb2 in rbd::mirror::ImageSync<librbd::(anonymous namespace)::MockTestImageCtx>::send (this=0xd4c930) at /home/brad/working/src/ceph/src/tools/rbd_mirror/ImageSync.cc:53
#7  0x0000000000628997 in rbd::mirror::TestMockImageSync_SimpleSync_Test::TestBody (this=0xcfe370) at /home/brad/working/src/ceph/src/test/rbd_mirror/test_mock_ImageSync.cc:303
#8  0x000000000073e3a4 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (location=0x99987e "the test body", method=<optimized out>, object=<optimized out>)
    at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2078
#9  testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=object@entry=0xcfe370, method=<optimized out>, location=location@entry=0x99987e "the test body")
    at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2114
#10 0x000000000073698a in testing::Test::Run (this=0xcfe370) at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2151
#11 0x0000000000736ad8 in testing::TestInfo::Run (this=0xce1f40) at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2326
#12 0x0000000000736bb5 in testing::TestCase::Run (this=0xce20d0) at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2444
#13 0x0000000000736e87 in testing::internal::UnitTestImpl::RunAllTests (this=0xce1b90) at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:4315
#14 0x000000000073e854 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (location=0x998eb0 "auxiliary test code (environments or event listeners)", method=<optimized out>, 
    object=<optimized out>) at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2078
#15 testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0xce1b90, method=<optimized out>, location=location@entry=0x998eb0 "auxiliary test code (environments or event listeners)")
    at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:2114
#16 0x00000000007371a4 in testing::UnitTest::Run (this=0xccbdc0 <testing::UnitTest::GetInstance()::instance>) at /home/brad/working/src/ceph/src/gmock/gtest/src/gtest.cc:3926
#17 0x00000000005daab9 in RUN_ALL_TESTS () at /home/brad/working/src/ceph/src/gmock/gtest/include/gtest/gtest.h:2288
#18 main (argc=<optimized out>, argv=<optimized out>) at /home/brad/working/src/ceph/src/test/rbd_mirror/test_main.cc:41

(gdb) t a 37 bt

Thread 37 (Thread 0x7fffc2ffd700 (LWP 32128)):
#0  testing::internal::FunctionMockerBase<void ()>::InvokeWith(std::tuple<> const&) (args=empty std::tuple, this=0x7fffffffaa58) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-spec-builders.h:1530
#1  testing::internal::FunctionMocker<void ()>::Invoke() (this=0x7fffffffaa58) at /home/brad/working/src/ceph/src/gmock/include/gmock/gmock-generated-function-mockers.h:76
#2  rbd::mirror::image_sync::SnapshotCopyRequest<librbd::(anonymous namespace)::MockTestImageCtx>::send (this=0x7fffffffaa50) at /home/brad/working/src/ceph/src/test/rbd_mirror/test_mock_ImageSync.cc:114
#3  rbd::mirror::ImageSync<librbd::(anonymous namespace)::MockTestImageCtx>::send_copy_snapshots (this=this@entry=0xd4c930) at /home/brad/working/src/ceph/src/tools/rbd_mirror/ImageSync.cc:163
#4  0x000000000062324e in rbd::mirror::ImageSync<librbd::(anonymous namespace)::MockTestImageCtx>::handle_create_sync_point (this=0xd4c930, r=0) at /home/brad/working/src/ceph/src/tools/rbd_mirror/ImageSync.cc:142
#5  0x0000000000611909 in Context::complete (this=0xd4c350, r=<optimized out>) at /home/brad/working/src/ceph/src/include/Context.h:64
#6  0x0000000000965745 in ThreadPool::worker (this=0xd05ec0, wt=<optimized out>) at /home/brad/working/src/ceph/src/common/WorkQueue.cc:132
#7  0x0000000000966940 in ThreadPool::WorkThread::entry (this=<optimized out>) at /home/brad/working/src/ceph/src/common/WorkQueue.h:445
#8  0x00007fffeefa65ca in start_thread (arg=0x7fffc2ffd700) at pthread_create.c:333
#9  0x00007fffeb3bfead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Note that thread 37 and thread 1 both have a "this" pointer of 0xd4c930 (the same ImageSync object), so I suspect they are racing. However, I threw most of my bag of tricks at them and couldn't prove it. Not sure where to go from here.

Actions #7

Updated by Jason Dillaman almost 8 years ago

Turns out that I am on F23 -- I thought I had already upgraded. I found a similar bug report where GCC 6 + F24 results in a similar gmock failure under optimized builds (https://issues.apache.org/jira/browse/MESOS-4983). I am going to upgrade and see if I can reproduce it.

Actions #8

Updated by Jason Dillaman almost 8 years ago

Yup -- able to instantly reproduce under F24.

Actions #9

Updated by Jason Dillaman almost 8 years ago

Using the latest gmock/gtest environment appears to fix the issue.

Actions #10

Updated by Jason Dillaman almost 8 years ago

Work on switching to the newer googletest framework was already in progress:

PR: https://github.com/ceph/ceph/pull/9134

Actions #11

Updated by Jason Dillaman almost 8 years ago

@Brad: Josh just merged the upgraded googletest/googlemock changes. It works for me now, but can you pull the latest master branch and retest to verify?

Actions #12

Updated by Brad Hubbard almost 8 years ago

  • Status changed from Need More Info to Resolved

Damn, I thought about a problem in the gmock/gtest code but dismissed it as "unlikely".

Sure enough, it works fine now.

$ CEPH_LIB=lib ./bin/unittest_rbd_mirror|tail -1
[ PASSED ] 120 tests.

Actions #13

Updated by Jason Dillaman almost 8 years ago

Always some problems when on the bleeding edge.
