Bug #43861
closedceph_test_rados_watch_notify hang
0%
Description
[root@smithi130 tmp]# cat ceph_test_rados_watch_notify.log process_1_[25350]: starting. process_1_[25350]: creating pool foo.25320 process_1_[25350]: created object 0... process_1_[25350]: finishing. process_1_[25350]: shutting down. process_3_[25352]: starting. process_3_[25352]: notifying object 0.obj process_3_[25352]: shutting down. process_4_[25353]: starting. process_4_[25353]: deleting pool foo.25320 process_4_[25353]: shutting down. process_2_[25351]: starting. process_2_[25351]: watching object 0.obj process_2_[25351]: shutting down. ******************************* process_5_[26143]: starting. process_5_[26143]: creating pool foo.25320. process_5_[26143]: finishing. process_5_[26143]: shutting down. ******************************* process_7_[26145]: starting. process_7_[26145]: notifying object 0.obj process_7_[26145]: shutting down. ******************************* process_6_[26144]: starting. process_6_[26144]: watching object 0.obj process_6_[26144]: shutting down. ******************************* process_8_[26146]: starting. process_8_[26146]: deleting pool foo.25320. process_8_[26146]: shutting down. ******************************* ******************************* process_9_[26236]: starting. process_9_[26236]: creating pool foo.25320.. process_9_[26236]: created object 0... process_9_[26236]: finishing. process_9_[26236]: shutting down. ******************************* ******************************* process_11_[26238]: starting. process_11_[26238]: notifying object 0.obj process_11_[26238]: file /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/huge/release/15.0.0-9869-g4b944a6/rpm/el8/BUILD/ceph-15.0.0-9869-g4b944a6/src/test/system/st_rados_notify.cc, line 69: expected 0, got -110 process_11_[26238]: shutting down.
that process is the zombie
25320 ? S 0:00 | \_ ceph_test_rados_watch_notify 26237 ? Sl 0:02 | | \_ ceph_test_rados_watch_notify 26238 ? Z 0:00 | | \_ [ceph_test_rados] <defunct> 26239 ? Sl 0:00 | | \_ ceph_test_rados_watch_notify 26240 ? S 0:00 | | \_ ceph_test_rados_watch_notify 26241 ? S 0:00 | | \_ ceph_test_rados_watch_notify 26242 ? Sl 0:00 | | \_ ceph_test_rados_watch_notify
Updated by Sage Weil about 4 years ago
/a/sage-2020-01-28_03:52:05-rados-wip-sage2-testing-2020-01-27-1839-distro-basic-smithi/4713217
Updated by Sage Weil about 4 years ago
same?
/a/sage-2020-02-01_03:27:35-rados-wip-sage-testing-2020-01-31-1746-distro-basic-smithi/4723146
ceph_test_watch_notify didn't generate any output.
Updated by Brad Hubbard about 4 years ago
- Related to Bug #44062: LibRadosWatchNotify.WatchNotify failure added
Updated by Brad Hubbard about 4 years ago
Almost certainly the same issue as #44062
Updated by Sage Weil about 4 years ago
2020-02-23T19:20:15.553 INFO:tasks.workunit.client.0.smithi073.stdout: watch_notify: process_1_[36775]: creating pool foo.36733 2020-02-23T19:20:15.553 INFO:tasks.workunit.client.0.smithi073.stdout: watch_notify: process_1_[36775]: created object 0... 2020-02-23T19:20:15.553 INFO:tasks.workunit.client.0.smithi073.stdout: watch_notify: process_1_[36775]: finishing. 2020-02-23T19:20:15.553 INFO:tasks.workunit.client.0.smithi073.stdout: watch_notify: process_1_[36775]: shutting down. 2020-02-23T19:20:15.554 INFO:tasks.workunit.client.0.smithi073.stdout: watch_notify: process_3_[36777]: starting. 2020-02-23T19:20:15.554 INFO:tasks.workunit.client.0.smithi073.stdout: watch_notify: process_3_[36777]: notifying object 0.obj 2020-02-23T19:20:15.562 INFO:tasks.workunit.client.0.smithi073.stdout: watch_notify: process_3_[36777]: shutting down. 2020-02-23T19:20:15.562 INFO:tasks.workunit.client.0.smithi073.stdout: watch_notify: process_2_[36776]: starting. 2020-02-23T19:20:15.563 INFO:tasks.workunit.client.0.smithi073.stdout: watch_notify: process_2_[36776]: watching object 0.obj 2020-02-23T19:20:15.563 INFO:tasks.workunit.client.0.smithi073.stdout: watch_notify: Received fewer notifies than expected: 0 < 1 2020-02-23T19:20:15.563 INFO:tasks.workunit.client.0.smithi073.stdout: watch_notify: process_2_[36776]: shutting down. 2020-02-23T19:20:15.563 INFO:tasks.workunit.client.0.smithi073.stdout: watch_notify: test1: got error: run_until_finished: runnable process_2: got error: [36733] returned exit_status (1) Operation not permitted
/a/sage-2020-02-23_16:41:52-rados-wip-sage2-testing-2020-02-23-0755-distro-basic-smithi/4793920
Updated by Sage Weil about 4 years ago
no output at all, like comment 2 above:
/a/sage-2020-03-06_17:29:42-rados-wip-sage4-testing-2020-03-05-1645-distro-basic-smithi/4831763
Updated by Sage Weil about 4 years ago
41404 ? S 0:00 | | \_ bash -o pipefail -exc ceph_test_rados_watch_notify 2>&1 | tee ceph_test_rados_watch_notify.log | sed "s/^/ watch_notify: /" 41437 ? S 0:00 | | \_ ceph_test_rados_watch_notify 41470 ? Sl 0:01 | | | \_ ceph_test_rados_watch_notify 41471 ? Z 0:00 | | | \_ [ceph_test_rados] <defunct> 41472 ? Sl 0:00 | | | \_ ceph_test_rados_watch_notify 41438 ? S 0:00 | | \_ tee ceph_test_rados_watch_notify.log 41439 ? S 0:00 | | \_ sed s/^/ watch_notify: / ... [root@smithi104 tmp]# gdb ceph_test_rados_watch_notify 41437 ... Thread 1 (Thread 0x7f1b496d6640 (LWP 41437)): #0 0x00007f1b48f4fb52 in read () from /lib64/libpthread.so.0 #1 0x00007f1b401f90ef in read (__nbytes=4, __buf=0x7ffca63a5068, __fd=4) at /usr/include/bits/unistd.h:44 #2 safe_read (fd=4, buf=buf@entry=0x7ffca63a5068, count=count@entry=4) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/safe_io.c:30 #3 0x00007f1b401f914d in safe_read_exact (fd=<optimized out>, buf=buf@entry=0x7ffca63a5068, count=count@entry=4) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/safe_io.c:48 #4 0x00005599eb02ceda in Preforker::parent_wait (this=this@entry=0x7ffca63a5e08, err_msg="") at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Preforker.h:82 #5 0x00005599eb02bd55 in SysTestRunnable::join[abi:cxx11]() (this=0x7ffca63a5df0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/systest_runnable.cc:131 #6 0x00005599eb02c789 in SysTestRunnable::run_until_finished[abi:cxx11](std::vector<SysTestRunnable*, std::allocator<SysTestRunnable*> >&) (runnables=std::vector of length 4, capacity 4 = {...}) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/systest_runnable.cc:154 #7 0x00005599eb027beb in main (argc=1, argv=0x7ffca63a61e8) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/rados_watch_notify.cc:82 ... [root@smithi104 tmp]# gdb ceph_test_rados_watch_notify 41470 ... (gdb) thr app all bt Thread 15 (Thread 0x7f1b20ff9700 (LWP 44923)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe13f04 in ceph::condition_variable_debug::wait (this=this@entry=0x5599ed157948, lock=...) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fd52b2b in Finisher::finisher_thread_entry (this=0x5599ed1578e0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Finisher.cc:86 #3 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #4 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 14 (Thread 0x7f1b217fa700 (LWP 44922)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe13f04 in ceph::condition_variable_debug::wait (this=this@entry=0x5599ed157770, lock=...) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fda0fd1 in SafeTimer::timer_thread (this=0x5599ed157758) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Timer.cc:106 #3 0x00007f1b3fda2871 in SafeTimerThread::entry (this=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Timer.cc:30 #4 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #5 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 13 (Thread 0x7f1b21ffb700 (LWP 44921)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe13f04 in ceph::condition_variable_debug::wait (this=this@entry=0x5599ed157448, lock=...) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fda0fd1 in SafeTimer::timer_thread (this=0x5599ed157430) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Timer.cc:106 #3 0x00007f1b3fda2871 in SafeTimerThread::entry (this=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Timer.cc:30 #4 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #5 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 12 (Thread 0x7f1b227fc700 (LWP 44912)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe13f04 in ceph::condition_variable_debug::wait (this=this@entry=0x5599ed156ef8, lock=...) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fd52b2b in Finisher::finisher_thread_entry (this=0x5599ed156e90) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Finisher.cc:86 #3 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #4 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 11 (Thread 0x7f1b22ffd700 (LWP 44911)): #0 0x00007f1b48f4c7ca in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe14124 in ceph::condition_variable_debug::_wait_until (this=0x5599ed156de0, mutex=0x5599ed156d68, ts=ts@entry=0x7f1b22ffaf30) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fda160e in ceph::condition_variable_debug::wait_until<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > > (when=..., lock=..., this=0x5599ed156de0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/ceph_time.h:123 #3 SafeTimer::timer_thread (this=0x5599ed156dc8) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Timer.cc:108 #4 0x00007f1b3fda2871 in SafeTimerThread::entry (this=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Timer.cc:30 #5 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #6 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 10 (Thread 0x7f1b237fe700 (LWP 44910)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe13f04 in ceph::condition_variable_debug::wait (this=this@entry=0x5599ed159360, lock=...) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fee4031 in DispatchQueue::run_local_delivery (this=0x5599ed159148) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/DispatchQueue.cc:117 #3 0x00007f1b3ff89a91 in DispatchQueue::LocalDeliveryThread::entry (this=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/DispatchQueue.h:115 #4 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #5 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 9 (Thread 0x7f1b23fff700 (LWP 44909)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe13f04 in ceph::condition_variable_debug::wait (this=this@entry=0x5599ed1591b8, lock=...) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fee4af8 in DispatchQueue::entry (this=0x5599ed159148) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/DispatchQueue.cc:210 #3 0x00007f1b3ff899d1 in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/DispatchQueue.h:101 #4 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #5 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 8 (Thread 0x7f1b38d9d700 (LWP 44908)): #0 0x00007f1b48f4c7ca in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b4923e7f3 in __gthread_cond_timedwait (__abs_timeout=0x7f1b38d9af80, __mutex=<optimized out>, __cond=0x5599ed1a3a58) at /usr/include/c++/8/x86_64-redhat-linux/bits/gthr-default.h:871 #2 std::condition_variable::__wait_until_impl<std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > > (__atime=<synthetic pointer>..., __lock=..., this=0x5599ed1a3a58) at /usr/include/c++/8/condition_variable:178 #3 std::condition_variable::wait_until<ceph::time_detail::coarse_mono_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > > (__atime=..., __lock=..., this=0x5599ed1a3a58) at /usr/include/c++/8/condition_variable:119 #4 ceph::timer_detail::timer<ceph::time_detail::coarse_mono_clock>::timer_thread (this=0x5599ed1a39f0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/ceph_timer.h:146 #5 0x00007f1b3e803b23 in execute_native_thread_routine () from /lib64/libstdc++.so.6 #6 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #7 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 7 (Thread 0x7f1b3959e700 (LWP 44907)): #0 0x00007f1b3ded5211 in poll () from /lib64/libc.so.6 #1 0x00007f1b3fdce38f in poll (__timeout=-1, __nfds=2, __fds=0x7f1b3959bf60) at /usr/include/bits/poll2.h:46 #2 AdminSocket::entry (this=0x5599ed1174f0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/admin_socket.cc:241 #3 0x00007f1b3e803b23 in execute_native_thread_routine () from /lib64/libstdc++.so.6 #4 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #5 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 6 (Thread 0x7f1b39d9f700 (LWP 44906)): #0 0x00007f1b48f4c7ca in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe14124 in ceph::condition_variable_debug::_wait_until (this=this@entry=0x5599ed1a1810, mutex=0x5599ed1a17b0, ts=ts@entry=0x7f1b39d9cf80) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fdf1fed in ceph::condition_variable_debug::wait_for<unsigned long, std::ratio<1l, 1000000000l> > (awhile=<synthetic pointer>..., lock=..., this=0x5599ed1a1810) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/ceph_time.h:123 #3 ceph::common::CephContextServiceThread::entry (this=0x5599ed1a1790) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/ceph_context.cc:210 #4 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #5 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 5 (Thread 0x7f1b3a5a0700 (LWP 44905)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3e7fd870 in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /lib64/libstdc++.so.6 #2 0x00007f1b40016047 in ceph::logging::Log::entry (this=0x5599ed0fa5d0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/log/Log.cc:435 --Type <RET> for more, q to quit, c to continue without paging-- #3 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #4 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 4 (Thread 0x7f1b3ada1700 (LWP 44896)): #0 0x00007f1b3dee0467 in epoll_wait () from /lib64/libc.so.6 #1 0x00007f1b3ffe2774 in EpollDriver::event_wait (this=0x5599ed15d2d0, fired_events=std::vector of length 0, capacity 0, tvp=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/EventEpoll.cc:123 #2 0x00007f1b3ffd433b in EventCenter::process_events (this=this@entry=0x5599ed166680, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f1b3ad9ef28) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/Event.cc:392 #3 0x00007f1b3ffdc77c in NetworkStack::<lambda()>::operator() (__closure=0x5599ed15d1e8, __closure=0x5599ed15d1e8) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/Stack.cc:53 #4 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/8/bits/std_function.h:297 #5 0x00007f1b3e803b23 in execute_native_thread_routine () from /lib64/libstdc++.so.6 #6 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #7 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 3 (Thread 0x7f1b3b5a2700 (LWP 44895)): #0 0x00007f1b3dee0467 in epoll_wait () from /lib64/libc.so.6 #1 0x00007f1b3ffe2774 in EpollDriver::event_wait (this=0x5599ed1650c0, fired_events=std::vector of length 0, capacity 0, tvp=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/EventEpoll.cc:123 #2 0x00007f1b3ffd433b in EventCenter::process_events (this=this@entry=0x5599ed15be70, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f1b3b59ff28) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/Event.cc:392 #3 0x00007f1b3ffdc77c in NetworkStack::<lambda()>::operator() (__closure=0x5599ed15d1b8, __closure=0x5599ed15d1b8) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/Stack.cc:53 #4 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/8/bits/std_function.h:297 #5 0x00007f1b3e803b23 in execute_native_thread_routine () from /lib64/libstdc++.so.6 #6 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #7 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 2 (Thread 0x7f1b3bda3700 (LWP 44894)): #0 0x00007f1b3dee0467 in epoll_wait () from /lib64/libc.so.6 #1 0x00007f1b3ffe2774 in EpollDriver::event_wait (this=0x5599ed166a90, fired_events=std::vector of length 0, capacity 0, tvp=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/EventEpoll.cc:123 #2 0x00007f1b3ffd433b in EventCenter::process_events (this=this@entry=0x5599ed19d070, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f1b3bda0f28) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/Event.cc:392 #3 0x00007f1b3ffdc77c in NetworkStack::<lambda()>::operator() (__closure=0x5599ed15ca58, __closure=0x5599ed15ca58) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/Stack.cc:53 #4 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/8/bits/std_function.h:297 #5 0x00007f1b3e803b23 in execute_native_thread_routine () from /lib64/libstdc++.so.6 #6 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #7 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 1 (Thread 0x7f1b496d6640 (LWP 41470)): #0 0x00007f1b48f4ee56 in do_futex_wait.constprop () from /lib64/libpthread.so.0 #1 0x00007f1b48f4ef48 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0 #2 0x00005599eb02b5e2 in CrossProcessSem::wait (this=0x5599ed101750) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/cross_process_sem.cc:65 #3 0x00005599eb02afdd in StRadosWatch::run (this=0x7ffca63a5df0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/st_rados_watch.cc:84 #4 0x00005599eb02ba5e in systest_runnable_pthread_helper (arg=arg@entry=0x7ffca63a5df0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/systest_runnable.cc:171 #5 0x00005599eb02c414 in SysTestRunnable::start (this=0x7ffca63a5df0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/systest_runnable.cc:97 #6 0x00005599eb02c49a in SysTestRunnable::run_until_finished[abi:cxx11](std::vector<SysTestRunnable*, std::allocator<SysTestRunnable*> >&) (runnables=std::vector of length 4, capacity 4 = {...}) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/systest_runnable.cc:142 #7 0x00005599eb027beb in main (argc=1, argv=0x7ffca63a61e8) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/rados_watch_notify.cc:82 ... [root@smithi104 tmp]# gdb ceph_test_rados_watch_notify 41472 ... (gdb) thr app all bt Thread 15 (Thread 0x7f1b20ff9700 (LWP 41878)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe13f04 in ceph::condition_variable_debug::wait (this=this@entry=0x5599ed157948, lock=...) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fd52b2b in Finisher::finisher_thread_entry (this=0x5599ed1578e0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Finisher.cc:86 #3 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #4 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 14 (Thread 0x7f1b217fa700 (LWP 41877)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe13f04 in ceph::condition_variable_debug::wait (this=this@entry=0x5599ed157770, lock=...) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fda0fd1 in SafeTimer::timer_thread (this=0x5599ed157758) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Timer.cc:106 #3 0x00007f1b3fda2871 in SafeTimerThread::entry (this=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Timer.cc:30 #4 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #5 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 13 (Thread 0x7f1b21ffb700 (LWP 41876)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe13f04 in ceph::condition_variable_debug::wait (this=this@entry=0x5599ed157448, lock=...) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fda0fd1 in SafeTimer::timer_thread (this=0x5599ed157430) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Timer.cc:106 #3 0x00007f1b3fda2871 in SafeTimerThread::entry (this=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Timer.cc:30 #4 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #5 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 12 (Thread 0x7f1b227fc700 (LWP 41666)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe13f04 in ceph::condition_variable_debug::wait (this=this@entry=0x5599ed156ef8, lock=...) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fd52b2b in Finisher::finisher_thread_entry (this=0x5599ed156e90) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Finisher.cc:86 #3 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #4 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 11 (Thread 0x7f1b22ffd700 (LWP 41665)): #0 0x00007f1b48f4c7ca in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe14124 in ceph::condition_variable_debug::_wait_until (this=0x5599ed156de0, mutex=0x5599ed156d68, ts=ts@entry=0x7f1b22ffaf30) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fda160e in ceph::condition_variable_debug::wait_until<ceph::time_detail::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > > (when=..., lock=..., this=0x5599ed156de0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/ceph_time.h:123 #3 SafeTimer::timer_thread (this=0x5599ed156dc8) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Timer.cc:108 #4 0x00007f1b3fda2871 in SafeTimerThread::entry (this=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/Timer.cc:30 #5 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #6 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 10 (Thread 0x7f1b237fe700 (LWP 41664)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe13f04 in ceph::condition_variable_debug::wait (this=this@entry=0x5599ed159360, lock=...) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fee4031 in DispatchQueue::run_local_delivery (this=0x5599ed159148) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/DispatchQueue.cc:117 #3 0x00007f1b3ff89a91 in DispatchQueue::LocalDeliveryThread::entry (this=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/DispatchQueue.h:115 #4 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #5 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 9 (Thread 0x7f1b23fff700 (LWP 41663)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe13f04 in ceph::condition_variable_debug::wait (this=this@entry=0x5599ed1591b8, lock=...) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fee4af8 in DispatchQueue::entry (this=0x5599ed159148) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/DispatchQueue.cc:210 #3 0x00007f1b3ff899d1 in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/DispatchQueue.h:101 #4 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #5 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 8 (Thread 0x7f1b38d9d700 (LWP 41662)): #0 0x00007f1b48f4c7ca in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b4923e7f3 in __gthread_cond_timedwait (__abs_timeout=0x7f1b38d9af80, __mutex=<optimized out>, __cond=0x5599ed19bec8) at /usr/include/c++/8/x86_64-redhat-linux/bits/gthr-default.h:871 #2 std::condition_variable::__wait_until_impl<std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > > (__atime=<synthetic pointer>..., __lock=..., this=0x5599ed19bec8) at /usr/include/c++/8/condition_variable:178 #3 std::condition_variable::wait_until<ceph::time_detail::coarse_mono_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > > (__atime=..., __lock=..., this=0x5599ed19bec8) at /usr/include/c++/8/condition_variable:119 #4 ceph::timer_detail::timer<ceph::time_detail::coarse_mono_clock>::timer_thread (this=0x5599ed19be60) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/ceph_timer.h:146 #5 0x00007f1b3e803b23 in execute_native_thread_routine () from /lib64/libstdc++.so.6 #6 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #7 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 7 (Thread 0x7f1b3959e700 (LWP 41661)): #0 0x00007f1b3ded5211 in poll () from /lib64/libc.so.6 #1 0x00007f1b3fdce38f in poll (__timeout=-1, __nfds=2, __fds=0x7f1b3959bf60) at /usr/include/bits/poll2.h:46 #2 AdminSocket::entry (this=0x5599ed1174f0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/admin_socket.cc:241 #3 0x00007f1b3e803b23 in execute_native_thread_routine () from /lib64/libstdc++.so.6 #4 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #5 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 6 (Thread 0x7f1b39d9f700 (LWP 41660)): #0 0x00007f1b48f4c7ca in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3fe14124 in ceph::condition_variable_debug::_wait_until (this=this@entry=0x5599ed1a1810, mutex=0x5599ed1a17b0, ts=ts@entry=0x7f1b39d9cf80) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/mutex_debug.h:147 #2 0x00007f1b3fdf1fed in ceph::condition_variable_debug::wait_for<unsigned long, std::ratio<1l, 1000000000l> > (awhile=<synthetic pointer>..., lock=..., this=0x5599ed1a1810) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/ceph_time.h:123 #3 ceph::common::CephContextServiceThread::entry (this=0x5599ed1a1790) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/common/ceph_context.cc:210 #4 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #5 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 5 (Thread 0x7f1b3a5a0700 (LWP 41659)): #0 0x00007f1b48f4c47c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f1b3e7fd870 in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /lib64/libstdc++.so.6 #2 0x00007f1b40016047 in ceph::logging::Log::entry (this=0x5599ed0fa5d0) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/log/Log.cc:435 --Type <RET> for more, q to quit, c to continue without paging-- #3 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #4 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 4 (Thread 0x7f1b3ada1700 (LWP 41491)): #0 0x00007f1b3dee0467 in epoll_wait () from /lib64/libc.so.6 #1 0x00007f1b3ffe2774 in EpollDriver::event_wait (this=0x5599ed15d2d0, fired_events=std::vector of length 0, capacity 0, tvp=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/EventEpoll.cc:123 #2 0x00007f1b3ffd433b in EventCenter::process_events (this=this@entry=0x5599ed166680, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f1b3ad9ef28) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/Event.cc:392 #3 0x00007f1b3ffdc77c in NetworkStack::<lambda()>::operator() (__closure=0x5599ed15d1e8, __closure=0x5599ed15d1e8) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/Stack.cc:53 #4 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/8/bits/std_function.h:297 #5 0x00007f1b3e803b23 in execute_native_thread_routine () from /lib64/libstdc++.so.6 #6 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #7 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 3 (Thread 0x7f1b3b5a2700 (LWP 41490)): #0 0x00007f1b3dee0467 in epoll_wait () from /lib64/libc.so.6 #1 0x00007f1b3ffe2774 in EpollDriver::event_wait (this=0x5599ed1650c0, fired_events=std::vector of length 0, capacity 0, tvp=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/EventEpoll.cc:123 #2 0x00007f1b3ffd433b in EventCenter::process_events (this=this@entry=0x5599ed15be70, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f1b3b59ff28) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/Event.cc:392 #3 0x00007f1b3ffdc77c in NetworkStack::<lambda()>::operator() (__closure=0x5599ed15d1b8, __closure=0x5599ed15d1b8) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/Stack.cc:53 #4 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/8/bits/std_function.h:297 #5 0x00007f1b3e803b23 in execute_native_thread_routine () from /lib64/libstdc++.so.6 #6 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #7 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 2 (Thread 0x7f1b3bda3700 (LWP 41489)): #0 0x00007f1b3dee0467 in epoll_wait () from /lib64/libc.so.6 #1 0x00007f1b3ffe2774 in EpollDriver::event_wait (this=0x5599ed166a90, fired_events=std::vector of length 0, capacity 0, tvp=<optimized out>) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/EventEpoll.cc:123 #2 0x00007f1b3ffd433b in EventCenter::process_events (this=this@entry=0x5599ed19d070, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f1b3bda0f28) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/Event.cc:392 #3 0x00007f1b3ffdc77c in NetworkStack::<lambda()>::operator() (__closure=0x5599ed15ca58, __closure=0x5599ed15ca58) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/msg/async/Stack.cc:53 #4 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/8/bits/std_function.h:297 #5 0x00007f1b3e803b23 in execute_native_thread_routine () from /lib64/libstdc++.so.6 #6 0x00007f1b48f462de in start_thread () from /lib64/libpthread.so.0 #7 0x00007f1b3dee0133 in clone () from /lib64/libc.so.6 Thread 1 (Thread 0x7f1b496d6640 (LWP 41472)): #0 0x00007f1b48f4ee56 in do_futex_wait.constprop () from /lib64/libpthread.so.0 #1 0x00007f1b48f4ef48 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0 #2 0x00005599eb02b5e2 in CrossProcessSem::wait (this=0x5599ed101750) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/cross_process_sem.cc:65 #3 0x00005599eb02a7c4 in StRadosDeletePool::run (this=0x7ffca63a5b10) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/st_rados_delete_pool.cc:46 #4 0x00005599eb02ba5e in systest_runnable_pthread_helper (arg=arg@entry=0x7ffca63a5b10) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/systest_runnable.cc:171 #5 0x00005599eb02c414 in SysTestRunnable::start (this=0x7ffca63a5b10) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/systest_runnable.cc:97 #6 0x00005599eb02c49a in SysTestRunnable::run_until_finished[abi:cxx11](std::vector<SysTestRunnable*, std::allocator<SysTestRunnable*> >&) (runnables=std::vector of length 4, capacity 4 = {...}) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/systest_runnable.cc:142 #7 0x00005599eb027beb in main (argc=1, argv=0x7ffca63a61e8) at /usr/src/debug/ceph-15.1.0-1950.g42e5071.el8.x86_64/src/test/system/rados_watch_notify.cc:82
Updated by Sage Weil about 4 years ago
another one line comment 5 above
2020-03-18T19:53:11.915 INFO:tasks.workunit.client.0.smithi131.stdout: watch_notify: process_3_[24529]: shutting down. 2020-03-18T19:53:11.915 INFO:tasks.workunit.client.0.smithi131.stdout: watch_notify: process_2_[24528]: starting. 2020-03-18T19:53:11.915 INFO:tasks.workunit.client.0.smithi131.stdout: watch_notify: process_2_[24528]: watching object 0.obj 2020-03-18T19:53:11.916 INFO:tasks.workunit.client.0.smithi131.stdout: watch_notify: Received fewer notifies than expected: 0 < 1 2020-03-18T19:53:11.916 INFO:tasks.workunit.client.0.smithi131.stdout: watch_notify: process_2_[24528]: shutting down. 2020-03-18T19:53:11.916 INFO:tasks.workunit.client.0.smithi131.stdout: watch_notify: test1: got error: run_until_finished: runnable process_2: got error: [24506] returned exit_status (1) Operation not permitted
/a/sage-2020-03-18_16:51:46-rados:verify-wip-sage-testing-2020-03-18-0826-distro-basic-smithi/4866547
Updated by Sage Weil about 4 years ago
i think we should consider jsut dropping this test. it's old test code written by colin almost 10 years ago and i'm not sure it's worth the effort to debug it...
Updated by Sage Weil about 4 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 34044
Updated by Sage Weil about 4 years ago
- Status changed from Fix Under Review to Resolved
Updated by Josh Durgin almost 4 years ago
- Status changed from Resolved to Pending Backport
- Backport set to octopus, nautilus
Let's remove these tests from the stable branches too.
Updated by Nathan Cutler almost 4 years ago
- Copied to Backport #46017: nautilus: ceph_test_rados_watch_notify hang added
Updated by Nathan Cutler almost 4 years ago
- Copied to Backport #46018: octopus: ceph_test_rados_watch_notify hang added
Updated by Nathan Cutler almost 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".