Project

General

Profile

Actions

Bug #5897

closed

ceph_test_rados_api_watch_notify hang on LibRadosWatchNotify.WatchNotifyTimeoutTestPP

Added by Samuel Just over 10 years ago. Updated over 10 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2013-08-06T04:50:26.730 INFO:teuthology.task.thrashosds.thrasher:Added osd 1
2013-08-06T04:50:27.345 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ OK ] LibRadosStat.ClusterStat (1737 ms)
2013-08-06T04:50:27.345 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ RUN ] LibRadosStat.ClusterStatPP
2013-08-06T04:50:29.635 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ OK ] LibRadosStat.ClusterStatPP (2290 ms)
2013-08-06T04:50:29.635 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ RUN ] LibRadosStat.PoolStat
2013-08-06T04:50:31.730 INFO:teuthology.task.thrashosds.thrasher:in_osds: [3, 2, 5, 4, 1] out_osds: [0] dead_osds: [] live_osds: [1, 3, 5, 0, 2, 4]
2013-08-06T04:50:31.730 INFO:teuthology.task.thrashosds.thrasher:choose_action: min_in 2 min_out 0 min_live 2 min_dead 0
2013-08-06T04:50:31.731 INFO:teuthology.task.thrashosds.thrasher:Adding osd 0
2013-08-06T04:50:31.731 DEBUG:teuthology.misc:with jobid basedir: 97106
2013-08-06T04:50:31.731 DEBUG:teuthology.orchestra.run:Running [10.214.132.31]: '/home/ubuntu/cephtest/97106/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/97106/archive/coverage ceph osd in 0'
2013-08-06T04:50:33.095 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ OK ] LibRadosStat.PoolStat (3460 ms)
2013-08-06T04:50:33.095 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ RUN ] LibRadosStat.PoolStatPP
2013-08-06T04:50:33.111 INFO:teuthology.orchestra.run.err:[10.214.132.31]: marked in osd.0.
2013-08-06T04:50:33.121 INFO:teuthology.task.thrashosds.thrasher:Added osd 0
2013-08-06T04:50:36.288 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ OK ] LibRadosStat.PoolStatPP (3193 ms)
2013-08-06T04:50:36.288 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [----------] 8 tests from LibRadosStat (27984 ms total)
2013-08-06T04:50:36.289 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]:
2013-08-06T04:50:36.289 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [----------] Global test environment tear-down
2013-08-06T04:50:36.289 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [==========] 8 tests from 1 test case ran. (27984 ms total)
2013-08-06T04:50:36.289 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ PASSED ] 8 tests.
2013-08-06T04:50:36.290 INFO:teuthology.task.workunit.client.0.err:[10.214.132.32]: + ceph_test_rados_api_watch_notify
2013-08-06T04:50:36.302 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: Running main() from gtest_main.cc
2013-08-06T04:50:36.302 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [==========] Running 3 tests from 1 test case.
2013-08-06T04:50:36.302 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [----------] Global test environment set-up.
2013-08-06T04:50:36.302 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [----------] 3 tests from LibRadosWatchNotify
2013-08-06T04:50:36.303 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ RUN ] LibRadosWatchNotify.WatchNotifyTest
2013-08-06T04:50:38.120 INFO:teuthology.task.thrashosds.thrasher:in_osds: [3, 2, 5, 4, 1, 0] out_osds: [] dead_osds: [] live_osds: [1, 3, 5, 0, 2, 4]
2013-08-06T04:50:38.121 INFO:teuthology.task.thrashosds.thrasher:choose_action: min_in 2 min_out 0 min_live 2 min_dead 0
2013-08-06T04:50:38.121 INFO:teuthology.task.thrashosds.thrasher:inject_pause on 2
2013-08-06T04:50:38.121 INFO:teuthology.task.thrashosds.thrasher:Testing filestore_inject_stall pause injection for duration 3
2013-08-06T04:50:38.121 INFO:teuthology.task.thrashosds.thrasher:Checking after 0, should_be_down=False
2013-08-06T04:50:38.121 DEBUG:teuthology.misc:with jobid basedir: 97106
2013-08-06T04:50:38.122 DEBUG:teuthology.orchestra.run:Running [10.214.132.31]: 'sudo /home/ubuntu/cephtest/97106/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/97106/archive/coverage ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config set filestore_inject_stall 3'
2013-08-06T04:50:39.791 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ OK ] LibRadosWatchNotify.WatchNotifyTest (3489 ms)
2013-08-06T04:50:39.791 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ RUN ] LibRadosWatchNotify.WatchNotifyTestPP
2013-08-06T04:50:43.233 INFO:teuthology.task.thrashosds.thrasher:in_osds: [3, 2, 5, 4, 1, 0] out_osds: [] dead_osds: [] live_osds: [1, 3, 5, 0, 2, 4]
2013-08-06T04:50:43.233 INFO:teuthology.task.thrashosds.thrasher:choose_action: min_in 2 min_out 0 min_live 2 min_dead 0
2013-08-06T04:50:43.233 INFO:teuthology.task.thrashosds.thrasher:Killing osd 2, live_osds are [1, 3, 5, 0, 2, 4]
2013-08-06T04:50:43.234 DEBUG:teuthology.task.ceph.osd.2:waiting for process to exit
2013-08-06T04:50:43.247 INFO:teuthology.task.ceph.osd.2:Stopped
2013-08-06T04:50:43.604 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ OK ] LibRadosWatchNotify.WatchNotifyTestPP (3813 ms)
2013-08-06T04:50:43.604 INFO:teuthology.task.workunit.client.0.out:[10.214.132.32]: [ RUN ] LibRadosWatchNotify.WatchNotifyTimeoutTestPP
2013-08-06T04:50:48.247 INFO:teuthology.task.thrashosds.thrasher:in_osds: [3, 2, 5, 4, 1, 0] out_osds: [] dead_osds: [2] live_osds: [1, 3, 5, 0, 4]
2013-08-06T04:50:48.247 INFO:teuthology.task.thrashosds.thrasher:choose_action: min_in 2 min_out 0 min_live 2 min_dead 0
2013-08-06T04:50:48.247 INFO:teuthology.task.thrashosds.thrasher:fixing pg num pool data
2013-08-06T04:50:48.248 DEBUG:teuthology.misc:with jobid basedir: 97106
2013-08-06T04:50:48.248 DEBUG:teuthology.orchestra.run:Running [10.214.132.31]: '/home/ubuntu/cephtest/97106/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/97106/archive/coverage ceph pg dump --format=json'
2013-08-06T04:50:48.519 INFO:teuthology.orchestra.run.err:[10.214.132.31]: dumped all in format json
2013-08-06T04:50:53.527 INFO:teuthology.task.thrashosds.thrasher:in_osds: [3, 2, 5, 4, 1, 0] out_osds: [] dead_osds: [2] live_osds: [1, 3, 5, 0, 4]
2013-08-06T04:50:53.527 INFO:teuthology.task.thrashosds.thrasher:Reviving osd 2
2013-08-06T04:50:53.527 INFO:teuthology.task.ceph.osd.2:Restarting
2013-08-06T04:50:53.528 DEBUG:teuthology.orchestra.run:Running [10.214.132.31]: '/home/ubuntu/cephtest/97106/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/97106/archive/coverage sudo /home/ubuntu/cephtest/97106/daemon-helper kill ceph-osd -f -i 2'
2013-08-06T04:50:53.531 INFO:teuthology.task.ceph.osd.2:Started
2013-08-06T04:50:53.531 DEBUG:teuthology.misc:with jobid basedir: 97106

Thread 9 (Thread 0x7f7df2606700 (LWP 3322)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007f7df43778cb in ceph::log::Log::entry (this=0x2048280) at log/Log.cc:323
#2 0x00007f7df3ef0e9a in start_thread (arg=0x7f7df2606700) at pthread_create.c:308
#3 0x00007f7df3707ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#4 0x0000000000000000 in ?? ()

Thread 8 (Thread 0x7f7df1e05700 (LWP 3323)):
#0 sem_timedwait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:102
#1 0x00007f7df4471878 in CephContextServiceThread::entry (this=0x2042930) at common/ceph_context.cc:57
#2 0x00007f7df3ef0e9a in start_thread (arg=0x7f7df1e05700) at pthread_create.c:308
#3 0x00007f7df3707ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#4 0x0000000000000000 in ?? ()

Thread 7 (Thread 0x7f7df1604700 (LWP 3324)):
#0 0x00007f7df36fc313 in GI_poll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1 0x00007f7df42fcc9b in AdminSocket::entry (this=0x204cf60) at common/admin_socket.cc:230
#2 0x00007f7df3ef0e9a in start_thread (arg=0x7f7df1604700) at pthread_create.c:308
#3 0x00007f7df3707ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#4 0x0000000000000000 in ?? ()

Thread 6 (Thread 0x7f7df0e03700 (LWP 3325)):
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
#1 0x00007f7df3ef3065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007f7df3ef2eba in __pthread_mutex_lock (mutex=0x204e720) at pthread_mutex_lock.c:61
#3 0x00007f7df42f0f83 in Mutex::Lock (this=0x204e710, no_lockdep=<optimized out>) at common/Mutex.cc:89
#4 0x00007f7df427ce2c in Locker (m=..., this=<synthetic pointer>) at ./common/Mutex.h:120
#5 librados::RadosClient::ms_dispatch (this=0x204dfb0, m=0x7f7dd4000f40) at librados/RadosClient.cc:282
#6 0x00007f7df43a6849 in ms_deliver_dispatch (m=0x7f7dd4000f40, this=0x2058c10) at msg/Messenger.h:587
#7 DispatchQueue::entry (this=0x2058cf8) at msg/DispatchQueue.cc:123
#8 0x00007f7df441686d in DispatchQueue::DispatchThread::entry (this=<optimized out>) at msg/DispatchQueue.h:104
#9 0x00007f7df3ef0e9a in start_thread (arg=0x7f7df0e03700) at pthread_create.c:308
#10 0x00007f7df3707ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#11 0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7f7debfff700 (LWP 3326)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007f7df4411899 in Wait (mutex=..., this=0x2059120) at ./common/Cond.h:55
#2 SimpleMessenger::reaper_entry (this=0x2058c10) at msg/SimpleMessenger.cc:205
#3 0x00007f7df4416d9d in SimpleMessenger::ReaperThread::entry (this=<optimized out>) at msg/SimpleMessenger.h:225
#4 0x00007f7df3ef0e9a in start_thread (arg=0x7f7debfff700) at pthread_create.c:308
#5 0x00007f7df3707ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6 0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7f7deb7fe700 (LWP 3327)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:215
#1 0x00007f7df43116ab in WaitUntil (when=..., mutex=..., this=0x204e378) at common/Cond.h:71
#2 SafeTimer::timer_thread (this=0x204e368) at common/Timer.cc:114
#3 0x00007f7df431205d in SafeTimerThread::entry (this=<optimized out>) at common/Timer.cc:38
#4 0x00007f7df3ef0e9a in start_thread (arg=0x7f7deb7fe700) at pthread_create.c:308
#5 0x00007f7df3707ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f7deaffd700 (LWP 3328)):
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
#1 0x00007f7df3ef3065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007f7df3ef2eba in __pthread_mutex_lock (mutex=0x204e720) at pthread_mutex_lock.c:61
#3 0x00007f7df42f0f83 in Mutex::Lock (this=0x204e710, no_lockdep=<optimized out>) at common/Mutex.cc:89
#4 0x00007f7df429a027 in Locker (m=..., this=<synthetic pointer>) at ./common/Mutex.h:120
#5 Objecter::C_Linger_Map_Latest::finish (this=0x7f7ddc006330, r=<optimized out>) at osdc/Objecter.cc:781
#6 0x00007f7df426f5ea in Context::complete (this=0x7f7ddc006330, r=<optimized out>) at ./include/Context.h:42
#7 0x00007f7df4312890 in Finisher::finisher_thread_entry (this=0x204e430) at common/Finisher.cc:56
#8 0x00007f7df3ef0e9a in start_thread (arg=0x7f7deaffd700) at pthread_create.c:308
#9 0x00007f7df3707ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#10 0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f7dea7fc700 (LWP 3333)):
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
#1 0x00007f7df3efa2b8 in _L_cond_lock_874 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007f7df3efa088 in __pthread_mutex_cond_lock (mutex=0x204e720) at ../nptl/pthread_mutex_lock.c:61
#3 0x00007f7df3ef51b2 in pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:302
#4 0x00007f7df43116ab in WaitUntil (when=..., mutex=..., this=0x204e7b8) at common/Cond.h:71
#5 SafeTimer::timer_thread (this=0x204e7a8) at common/Timer.cc:114
#6 0x00007f7df431205d in SafeTimerThread::entry (this=<optimized out>) at common/Timer.cc:38
#7 0x00007f7df3ef0e9a in start_thread (arg=0x7f7dea7fc700) at pthread_create.c:308
#8 0x00007f7df3707ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#9 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f7df516b780 (LWP 3276)):
#0 0x00007f7df3ef2148 in pthread_join (threadid=140178790274816, thread_return=0x0) at pthread_join.c:89
#1 0x00007f7df4478142 in Thread::join (this=0x204e548, prval=<optimized out>) at common/Thread.cc:121
#2 0x00007f7df438b6bf in MonClient::shutdown (this=0x204e180) at mon/MonClient.cc:334
#3 0x00007f7df427806f in librados::RadosClient::shutdown (this=0x204dfb0) at librados/RadosClient.cc:232
#4 0x00007f7df42653a8 in librados::Rados::shutdown (this=0x7fff5726de20) at librados/librados.cc:1348
#5 0x0000000000419f76 in destroy_one_pool_pp (pool_name=..., cluster=...) at test/librados/test.cc:147
#6 0x0000000000414cdf in LibRadosWatchNotify_WatchNotifyTimeoutTestPP_Test::TestBody (this=<optimized out>) at test/librados/watch_notify.cc:90
#7 0x000000000042e67a in testing::Test::Run (this=0x203f770) at ./src/gtest.cc:2095
#8 0x000000000042e780 in testing::internal::TestInfoImpl::Run (this=0x203f120) at ./src/gtest.cc:2314
#9 0x000000000042e84d in testing::TestCase::Run (this=0x203ee40) at ./src/gtest.cc:2420
#10 0x000000000042eab7 in testing::internal::UnitTestImpl::RunAllTests (this=0x203eab0) at ./src/gtest.cc:4024
#11 0x0000000000414225 in main (argc=1, argv=0x7fff5726e118) at src/gtest_main.cc:38

ubuntu@teuthology:/a/teuthology-2013-08-05_01:00:16-rados-next-testing-basic-plana/97106

Actions #1

Updated by Sage Weil over 10 years ago

  • Status changed from New to Pending Backport
Actions #2

Updated by Sage Weil over 10 years ago

  • Status changed from Pending Backport to Resolved
  • Source changed from other to Q/A

fix preceded dumpling; backported to cuttlefish branch

Actions

Also available in: Atom PDF