Bug #2045

closed

osd: dout_lock deadlock

Added by Sage Weil about 12 years ago. Updated about 12 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
Category: OSD
Target version:
% Done: 0%

Description

A thread is blocked on dout_lock; can't tell which thread is holding it.


Related issues: 1 (0 open, 1 closed)

Related to Ceph - Bug #2026: osd: ceph::HeartbeatMap::check_touch_file (Can't reproduce, 02/06/2012)

Actions #1

Updated by Sage Weil about 12 years ago

ubuntu@teuthology:/a/nightly_coverage_2012-02-09-a/11210
metropolis:~sage/bug-2045

Actions #2

Updated by Sage Weil about 12 years ago

Again, although this time there is a write (Thread 43) that looks blocked somehow.

(gdb) thr app all bt

Thread 43 (Thread 5409):
#0  0x00007fb82ec8ab2d in write () at ../sysdeps/unix/syscall-template.S:82
#1  0x000000000060a170 in safe_write (fd=3, buf=0x2725058, count=131) at common/safe_io.c:59
#2  0x00000000005f39c1 in DoutStreambuf<char, std::char_traits<char> >::overflow (this=0x2725000, c=-1) at common/DoutStreambuf.cc:259
#3  0x00000000005f378f in DoutStreambuf<char, std::char_traits<char> >::sync (this=0x3) at common/DoutStreambuf.cc:416
#4  0x00007fb82dad4a9e in std::basic_ostream<char, std::char_traits<char> >::flush() () from /usr/lib/libstdc++.so.6
#5  0x00000000006e0932 in flush<char, std::char_traits<char> > (this=0x272f000, h=0x2755870, who=0x77f94f "reset_timeout", now=1329216281) at /usr/include/c++/4.4/ostream:560
#6  endl<char, std::char_traits<char> > (this=0x272f000, h=0x2755870, who=0x77f94f "reset_timeout", now=1329216281) at /usr/include/c++/4.4/ostream:539
#7  operator<< (this=0x272f000, h=0x2755870, who=0x77f94f "reset_timeout", now=1329216281) at /usr/include/c++/4.4/ostream:113
#8  ceph::HeartbeatMap::_check (this=0x272f000, h=0x2755870, who=0x77f94f "reset_timeout", now=1329216281) at common/HeartbeatMap.cc:71
#9  0x00000000006e130b in ceph::HeartbeatMap::reset_timeout (this=0x272f000, h=0x2755870, grace=4, suicide_grace=0) at common/HeartbeatMap.cc:88
#10 0x000000000061702b in ThreadPool::worker (this=0x2745540) at common/WorkQueue.cc:70
#11 0x000000000057a4ad in ThreadPool::WorkThread::entry() ()
#12 0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#13 0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#14 0x0000000000000000 in ?? ()

Thread 42 (Thread 5318):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005c1e5a in Wait (this=0x271b680) at ./common/Cond.h:48
#2  SimpleMessenger::wait (this=0x271b680) at msg/SimpleMessenger.cc:2646
#3  0x00000000004a4e9a in main (argc=<value optimized out>, argv=<value optimized out>) at ceph_osd.cc:404

Thread 41 (Thread 5323):
#0  0x00007fb82d302203 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00000000006d222e in AdminSocket::entry (this=0x272e000) at common/admin_socket.cc:211
#2  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#3  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#4  0x0000000000000000 in ?? ()

Thread 40 (Thread 5355):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005c1bf2 in Wait (this=0x271ad80) at ./common/Cond.h:48
#2  SimpleMessenger::reaper_entry (this=0x271ad80) at msg/SimpleMessenger.cc:2296
#3  0x00000000004a62cc in SimpleMessenger::ReaperThread::entry (this=0x271b180) at msg/SimpleMessenger.h:481
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 39 (Thread 5356):
#0  0x00007fb82d302203 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00000000005d1464 in SimpleMessenger::Accepter::entry (this=0x271bb40) at msg/SimpleMessenger.cc:209
#2  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#3  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#4  0x0000000000000000 in ?? ()

Thread 38 (Thread 5357):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005c1bf2 in Wait (this=0x271bb00) at ./common/Cond.h:48
#2  SimpleMessenger::reaper_entry (this=0x271bb00) at msg/SimpleMessenger.cc:2296
#3  0x00000000004a62cc in SimpleMessenger::ReaperThread::entry (this=0x271bf00) at msg/SimpleMessenger.h:481
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 37 (Thread 5361):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x0000000000613fd9 in Wait (this=0x2746458) at common/Cond.h:48
#2  SafeTimer::timer_thread (this=0x2746458) at common/Timer.cc:108
#3  0x000000000061691d in SafeTimerThread::entry() ()
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 36 (Thread 5358):
#0  0x00007fb82d302203 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00000000005d1464 in SimpleMessenger::Accepter::entry (this=0x271b240) at msg/SimpleMessenger.cc:209
#2  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#3  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#4  0x0000000000000000 in ?? ()

Thread 35 (Thread 5359):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005c1bf2 in Wait (this=0x271b200) at ./common/Cond.h:48
#2  SimpleMessenger::reaper_entry (this=0x271b200) at msg/SimpleMessenger.cc:2296
#3  0x00000000004a62cc in SimpleMessenger::ReaperThread::entry (this=0x271b600) at msg/SimpleMessenger.h:481
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 34 (Thread 5366):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000006f4337 in Wait (this=0x2749000) at ./common/Cond.h:48
#2  FileJournal::write_finish_thread_entry (this=0x2749000) at os/FileJournal.cc:1245
#3  0x000000000057a4ed in FileJournal::WriteFinisher::entry() ()
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 33 (Thread 5377):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005ba7e2 in Wait (this=0x271bb00) at ./common/Cond.h:48
#2  SimpleMessenger::dispatch_entry (this=0x271bb00) at msg/SimpleMessenger.cc:374
#3  0x00000000004a641c in SimpleMessenger::DispatchThread::entry (this=0x271bf58) at msg/SimpleMessenger.h:530
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 32 (Thread 6999):
#0  0x00007fb82d302203 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=900000) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00000000005b81e1 in tcp_read_wait (sd=<value optimized out>, timeout=<value optimized out>) at msg/tcp.cc:53
#2  0x00000000005c29b9 in tcp_read (cct=0x2721000, sd=18, buf=0x7fb81f7f2d8f "?\020", len=1, timeout=0) at msg/tcp.cc:26
#3  0x00000000005cf2de in SimpleMessenger::Pipe::reader (this=0x27d4c80) at msg/SimpleMessenger.cc:1566
#4  0x00000000004a646d in SimpleMessenger::Pipe::Reader::entry (this=<value optimized out>) at msg/SimpleMessenger.h:196
#5  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#6  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#7  0x0000000000000000 in ?? ()

Thread 31 (Thread 7002):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005c7556 in Wait (this=0x27d4c80) at ./common/Cond.h:48
#2  SimpleMessenger::Pipe::writer (this=0x27d4c80) at msg/SimpleMessenger.cc:1781
#3  0x00000000004a648d in SimpleMessenger::Pipe::Writer::entry (this=<value optimized out>) at msg/SimpleMessenger.h:204
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 30 (Thread 5418):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fb82ec85864 in _L_lock_1024 () from /lib/libpthread.so.0
#2  0x00007fb82ec856c7 in __pthread_mutex_lock (mutex=0x272cd78) at pthread_mutex_lock.c:82
#3  0x000000000061e764 in CephContext::dout_lock (this=<value optimized out>, locker=0x80) at common/ceph_context.cc:190
#4  0x00000000005c098f in SimpleMessenger::Pipe::fault (this=0x280e780, onconnect=false, onread=<value optimized out>) at msg/SimpleMessenger.cc:1466
#5  0x00000000005cf458 in SimpleMessenger::Pipe::reader (this=0x280e780) at msg/SimpleMessenger.cc:1655
#6  0x00000000004a646d in SimpleMessenger::Pipe::Reader::entry (this=<value optimized out>) at msg/SimpleMessenger.h:196
#7  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#8  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#9  0x0000000000000000 in ?? ()

Thread 29 (Thread 7558):
#0  0x00007fb82d302203 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=900000) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00000000005b81e1 in tcp_read_wait (sd=<value optimized out>, timeout=<value optimized out>) at msg/tcp.cc:53
#2  0x00000000005c29b9 in tcp_read (cct=0x2721000, sd=22, buf=0x7fb81bee3d8f "?\023", len=1, timeout=0) at msg/tcp.cc:26
#3  0x00000000005cf2de in SimpleMessenger::Pipe::reader (this=0x2806a00) at msg/SimpleMessenger.cc:1566
#4  0x00000000004a646d in SimpleMessenger::Pipe::Reader::entry (this=<value optimized out>) at msg/SimpleMessenger.h:196
#5  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#6  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#7  0x0000000000000000 in ?? ()

Thread 28 (Thread 5414):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fb82ec85864 in _L_lock_1024 () from /lib/libpthread.so.0
#2  0x00007fb82ec856c7 in __pthread_mutex_lock (mutex=0x272cd78) at pthread_mutex_lock.c:82
#3  0x000000000061e764 in CephContext::dout_lock (this=<value optimized out>, locker=0x80) at common/ceph_context.cc:190
#4  0x00000000005c0aae in SimpleMessenger::Pipe::fault (this=0x275f280, onconnect=<value optimized out>, onread=<value optimized out>) at msg/SimpleMessenger.cc:1475
#5  0x00000000005cf458 in SimpleMessenger::Pipe::reader (this=0x275f280) at msg/SimpleMessenger.cc:1655
#6  0x00000000004a646d in SimpleMessenger::Pipe::Reader::entry (this=<value optimized out>) at msg/SimpleMessenger.h:196
#7  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#8  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#9  0x0000000000000000 in ?? ()

Thread 27 (Thread 5374):
#0  0x000000000066a2d5 in operator= (this=0x2745148) at ./common/LogEntry.h:54
#1  LogClient::_get_mon_log_message (this=0x2745148) at common/LogClient.cc:162
#2  0x000000000066ada9 in LogClient::get_mon_log_message (this=0x2745148) at common/LogClient.cc:140
#3  0x00000000005a9d4e in MonClient::handle_auth (this=0x7fff6d2e6160, m=<value optimized out>) at mon/MonClient.cc:489
#4  0x00000000005ae141 in MonClient::ms_dispatch (this=0x7fff6d2e6160, m=0x8e72d200) at mon/MonClient.cc:309
#5  0x00000000005baf5a in ms_deliver_dispatch (this=0x271b680) at msg/Messenger.h:103
#6  SimpleMessenger::dispatch_entry (this=0x271b680) at msg/SimpleMessenger.cc:364
#7  0x00000000004a641c in SimpleMessenger::DispatchThread::entry (this=0x271bad8) at msg/SimpleMessenger.h:530
#8  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#9  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#10 0x0000000000000000 in ?? ()

Thread 26 (Thread 5412):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fb82ec8d71e in _L_cond_lock_1028 () from /lib/libpthread.so.0
#2  0x00007fb82ec8d54b in __pthread_mutex_cond_lock (mutex=0x27458c8) at ../nptl/pthread_mutex_lock.c:61
#3  0x00007fb82ec87ec3 in pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:299
#4  0x0000000000537cbc in WaitUntil (this=0x2745000) at ./common/Cond.h:67
#5  WaitInterval (this=0x2745000) at ./common/Cond.h:74
#6  OSD::heartbeat_entry (this=0x2745000) at osd/OSD.cc:1670
#7  0x000000000058c07d in OSD::T_Heartbeat::entry() ()
#8  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#9  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#10 0x0000000000000000 in ?? ()

Thread 25 (Thread 5411):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fb82ec85864 in _L_lock_1024 () from /lib/libpthread.so.0
#2  0x00007fb82ec856c7 in __pthread_mutex_lock (mutex=0x272cd78) at pthread_mutex_lock.c:82
#3  0x000000000061e764 in CephContext::dout_lock (this=<value optimized out>, locker=0x80) at common/ceph_context.cc:190
#4  0x00000000006e07e9 in ceph::HeartbeatMap::_check (this=0x272f000, h=0x27558d0, who=0x77f94f "reset_timeout", now=-1) at common/HeartbeatMap.cc:70
#5  0x00000000006e130b in ceph::HeartbeatMap::reset_timeout (this=0x272f000, h=0x27558d0, grace=4, suicide_grace=0) at common/HeartbeatMap.cc:88
#6  0x000000000061702b in ThreadPool::worker (this=0x2745790) at common/WorkQueue.cc:70
#7  0x000000000057a4ad in ThreadPool::WorkThread::entry() ()
#8  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#9  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#10 0x0000000000000000 in ?? ()

Thread 24 (Thread 5410):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fb82ec85864 in _L_lock_1024 () from /lib/libpthread.so.0
#2  0x00007fb82ec856c7 in __pthread_mutex_lock (mutex=0x272cd78) at pthread_mutex_lock.c:82
#3  0x000000000061e764 in CephContext::dout_lock (this=<value optimized out>, locker=0x80) at common/ceph_context.cc:190
#4  0x00000000006e07e9 in ceph::HeartbeatMap::_check (this=0x272f000, h=0x27558a0, who=0x77f94f "reset_timeout", now=-1) at common/HeartbeatMap.cc:70
#5  0x00000000006e130b in ceph::HeartbeatMap::reset_timeout (this=0x272f000, h=0x27558a0, grace=4, suicide_grace=0) at common/HeartbeatMap.cc:88
#6  0x000000000061702b in ThreadPool::worker (this=0x2745668) at common/WorkQueue.cc:70
#7  0x000000000057a4ad in ThreadPool::WorkThread::entry() ()
#8  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#9  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#10 0x0000000000000000 in ?? ()

Thread 23 (Thread 5408):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fb82ec85849 in _L_lock_953 () from /lib/libpthread.so.0
#2  0x00007fb82ec8566b in __pthread_mutex_lock (mutex=0x2745020) at pthread_mutex_lock.c:61
#3  0x0000000000527925 in Mutex::Lock (this=0x2745010, no_lockdep=<value optimized out>) at ./common/Mutex.h:108
#4  0x00000000005484c6 in OSD::dequeue_op (this=0x2745000, pg=0x80) at osd/OSD.cc:5697
#5  0x0000000000617538 in ThreadPool::worker (this=0x2745418) at common/WorkQueue.cc:54
#6  0x000000000057a4ad in ThreadPool::WorkThread::entry() ()
#7  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#8  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#9  0x0000000000000000 in ?? ()

Thread 22 (Thread 5407):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fb82ec85849 in _L_lock_953 () from /lib/libpthread.so.0
#2  0x00007fb82ec8566b in __pthread_mutex_lock (mutex=0x2745020) at pthread_mutex_lock.c:61
#3  0x0000000000527925 in Mutex::Lock (this=0x2745010, no_lockdep=<value optimized out>) at ./common/Mutex.h:108
#4  0x00000000005484c6 in OSD::dequeue_op (this=0x2745000, pg=0x80) at osd/OSD.cc:5697
#5  0x0000000000617538 in ThreadPool::worker (this=0x2745418) at common/WorkQueue.cc:54
#6  0x000000000057a4ad in ThreadPool::WorkThread::entry() ()
#7  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#8  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#9  0x0000000000000000 in ?? ()

Thread 21 (Thread 5378):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fb82ec8d71e in _L_cond_lock_1028 () from /lib/libpthread.so.0
#2  0x00007fb82ec8d54b in __pthread_mutex_cond_lock (mutex=0x7fff6d2e62f8) at ../nptl/pthread_mutex_lock.c:61
#3  0x00007fb82ec87ec3 in pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:299
#4  0x0000000000614255 in WaitUntil (this=0x7fff6d2e6328) at common/Cond.h:67
#5  SafeTimer::timer_thread (this=0x7fff6d2e6328) at common/Timer.cc:110
#6  0x000000000061691d in SafeTimerThread::entry() ()
#7  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#8  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#9  0x0000000000000000 in ?? ()

Thread 20 (Thread 5376):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005ba7e2 in Wait (this=0x271ad80) at ./common/Cond.h:48
#2  SimpleMessenger::dispatch_entry (this=0x271ad80) at msg/SimpleMessenger.cc:374
#3  0x00000000004a641c in SimpleMessenger::DispatchThread::entry (this=0x271b1d8) at msg/SimpleMessenger.h:530
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 19 (Thread 5373):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:212
#1  0x0000000000614255 in WaitUntil (this=0x2734688) at common/Cond.h:67
#2  SafeTimer::timer_thread (this=0x2734688) at common/Timer.cc:110
#3  0x000000000061691d in SafeTimerThread::entry() ()
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 18 (Thread 5371):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005b5617 in Wait (this=0x2734810) at ./common/Cond.h:48
#2  Finisher::finisher_thread_entry (this=0x2734810) at common/Finisher.cc:76
#3  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#4  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5  0x0000000000000000 in ?? ()

Thread 17 (Thread 5370):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x000000000071d275 in Wait (this=0x2734000) at ./common/Cond.h:48
#2  FileStore::flusher_entry (this=0x2734000) at os/FileStore.cc:2974
#3  0x00000000007256fd in FileStore::FlusherThread::entry() ()
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 16 (Thread 5369):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fb82ec85864 in _L_lock_1024 () from /lib/libpthread.so.0
#2  0x00007fb82ec856c7 in __pthread_mutex_lock (mutex=0x272cd78) at pthread_mutex_lock.c:82
#3  0x000000000061e764 in CephContext::dout_lock (this=<value optimized out>, locker=0x80) at common/ceph_context.cc:190
#4  0x00000000006e07e9 in ceph::HeartbeatMap::_check (this=0x272f000, h=0x2733600, who=0x77f94f "reset_timeout", now=-1) at common/HeartbeatMap.cc:70
#5  0x00000000006e130b in ceph::HeartbeatMap::reset_timeout (this=0x272f000, h=0x2733600, grace=4, suicide_grace=0) at common/HeartbeatMap.cc:88
#6  0x000000000061702b in ThreadPool::worker (this=0x2734918) at common/WorkQueue.cc:70
#7  0x000000000057a4ad in ThreadPool::WorkThread::entry() ()
#8  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#9  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#10 0x0000000000000000 in ?? ()

Thread 15 (Thread 5368):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:212
#1  0x000000000061709b in WaitUntil (this=0x2734918) at common/Cond.h:67
#2  WaitInterval (this=0x2734918) at common/Cond.h:74
#3  ThreadPool::worker (this=0x2734918) at common/WorkQueue.cc:71
#4  0x000000000057a4ad in ThreadPool::WorkThread::entry() ()
#5  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#6  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#7  0x0000000000000000 in ?? ()

Thread 14 (Thread 5367):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005b5617 in Wait (this=0x2734070) at ./common/Cond.h:48
#2  Finisher::finisher_thread_entry (this=0x2734070) at common/Finisher.cc:76
#3  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#4  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5  0x0000000000000000 in ?? ()

Thread 13 (Thread 5360):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fb82ec85849 in _L_lock_953 () from /lib/libpthread.so.0
#2  0x00007fb82ec8566b in __pthread_mutex_lock (mutex=0x7fff6d2e62f8) at pthread_mutex_lock.c:61
#3  0x0000000000527925 in Mutex::Lock (this=0x7fff6d2e62e8, no_lockdep=<value optimized out>) at ./common/Mutex.h:108
#4  0x0000000000528bb1 in Locker (this=0x2745000) at ./common/Mutex.h:135
#5  send_mon_message (this=0x2745000) at ./mon/MonClient.h:191
#6  OSD::send_failures (this=0x2745000) at osd/OSD.cc:2167
#7  0x00000000005523e5 in OSD::do_mon_report (this=0x2745000) at osd/OSD.cc:1897
#8  0x0000000000573771 in OSD::tick (this=0x2745000) at osd/OSD.cc:1798
#9  0x0000000000613f6b in SafeTimer::timer_thread (this=0x2745050) at common/Timer.cc:102
#10 0x000000000061691d in SafeTimerThread::entry() ()
#11 0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#12 0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#13 0x0000000000000000 in ?? ()

Thread 12 (Thread 5365):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000006f300c in Wait (this=0x2749000) at ./common/Cond.h:48
#2  FileJournal::write_thread_entry (this=0x2749000) at os/FileJournal.cc:1052
#3  0x000000000057a4cd in FileJournal::Writer::entry() ()
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 11 (Thread 5364):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:212
#1  0x000000000071d8ad in WaitUntil (this=0x2734000) at ./common/Cond.h:67
#2  WaitInterval (this=0x2734000) at ./common/Cond.h:74
#3  FileStore::sync_entry (this=0x2734000) at os/FileStore.cc:3014
#4  0x000000000072571d in FileStore::SyncThread::entry() ()
#5  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#6  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#7  0x0000000000000000 in ?? ()

Thread 10 (Thread 5372):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005b5617 in Wait (this=0x27344c0) at ./common/Cond.h:48
#2  Finisher::finisher_thread_entry (this=0x27344c0) at common/Finisher.cc:76
#3  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#4  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5  0x0000000000000000 in ?? ()

Thread 9 (Thread 5375):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005ba7e2 in Wait (this=0x271b200) at ./common/Cond.h:48
#2  SimpleMessenger::dispatch_entry (this=0x271b200) at msg/SimpleMessenger.cc:374
#3  0x00000000004a641c in SimpleMessenger::DispatchThread::entry (this=0x271b658) at msg/SimpleMessenger.h:530
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 8 (Thread 5354):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005c1bf2 in Wait (this=0x271b680) at ./common/Cond.h:48
#2  SimpleMessenger::reaper_entry (this=0x271b680) at msg/SimpleMessenger.cc:2296
#3  0x00000000004a62cc in SimpleMessenger::ReaperThread::entry (this=0x271ba80) at msg/SimpleMessenger.h:481
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 7 (Thread 5379):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005b5617 in Wait (this=0x7fff6d2e63e0) at ./common/Cond.h:48
#2  Finisher::finisher_thread_entry (this=0x7fff6d2e63e0) at common/Finisher.cc:76
#3  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#4  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5  0x0000000000000000 in ?? ()

Thread 6 (Thread 5415):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fb82ec8d71e in _L_cond_lock_1028 () from /lib/libpthread.so.0
#2  0x00007fb82ec8d54b in __pthread_mutex_cond_lock (mutex=0x275f358) at ../nptl/pthread_mutex_lock.c:61
#3  0x00007fb82ec87b36 in pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:236
#4  0x00000000005c7556 in Wait (this=0x275f280) at ./common/Cond.h:48
#5  SimpleMessenger::Pipe::writer (this=0x275f280) at msg/SimpleMessenger.cc:1781
#6  0x00000000004a648d in SimpleMessenger::Pipe::Writer::entry (this=<value optimized out>) at msg/SimpleMessenger.h:204
#7  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#8  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#9  0x0000000000000000 in ?? ()

Thread 5 (Thread 5416):
#0  0x00007fb82d302203 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=900000) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00000000005b81e1 in tcp_read_wait (sd=<value optimized out>, timeout=<value optimized out>) at msg/tcp.cc:53
#2  0x00000000005c29b9 in tcp_read (cct=0x2721000, sd=17, buf=0x7fb81c3e8cd0 "`1~D", len=9, timeout=0) at msg/tcp.cc:26
#3  0x00000000005c4aab in SimpleMessenger::Pipe::connect (this=0x275f000) at msg/SimpleMessenger.cc:1056
#4  0x00000000005c7fb9 in SimpleMessenger::Pipe::writer (this=0x275f000) at msg/SimpleMessenger.cc:1688
#5  0x00000000004a648d in SimpleMessenger::Pipe::Writer::entry (this=<value optimized out>) at msg/SimpleMessenger.h:204
#6  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#7  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#8  0x0000000000000000 in ?? ()

Thread 4 (Thread 5417):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007fb82ec8d71e in _L_cond_lock_1028 () from /lib/libpthread.so.0
#2  0x00007fb82ec8d54b in __pthread_mutex_cond_lock (mutex=0x280e858) at ../nptl/pthread_mutex_lock.c:61
#3  0x00007fb82ec87b36 in pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:236
#4  0x00000000005c7556 in Wait (this=0x280e780) at ./common/Cond.h:48
#5  SimpleMessenger::Pipe::writer (this=0x280e780) at msg/SimpleMessenger.cc:1781
#6  0x00000000004a648d in SimpleMessenger::Pipe::Writer::entry (this=<value optimized out>) at msg/SimpleMessenger.h:204
#7  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#8  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#9  0x0000000000000000 in ?? ()

Thread 3 (Thread 7561):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00000000005c7556 in Wait (this=0x2806a00) at ./common/Cond.h:48
#2  SimpleMessenger::Pipe::writer (this=0x2806a00) at msg/SimpleMessenger.cc:1781
#3  0x00000000004a648d in SimpleMessenger::Pipe::Writer::entry (this=<value optimized out>) at msg/SimpleMessenger.h:204
#4  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#5  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

Thread 2 (Thread 5353):
#0  0x00007fb82d302203 in __poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00000000005d1464 in SimpleMessenger::Accepter::entry (this=0x271b6c0) at msg/SimpleMessenger.cc:209
#2  0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#3  0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#4  0x0000000000000000 in ?? ()

Thread 1 (Thread 5321):
#0  0x00007fb82ec8ba0b in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1  0x0000000000628d43 in reraise_fatal (signum=5318) at global/signal_handler.cc:59
#2  0x000000000062930c in handle_fatal_signal (signum=<value optimized out>) at global/signal_handler.cc:109
#3  <signal handler called>
#4  0x00007fb82d25bba5 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#5  0x00007fb82d25f6b0 in abort () at abort.c:92
#6  0x00007fb82daff6bd in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#7  0x00007fb82dafd906 in ?? () from /usr/lib/libstdc++.so.6
#8  0x00007fb82dafd933 in std::terminate() () from /usr/lib/libstdc++.so.6
#9  0x00007fb82dafda3e in __cxa_throw () from /usr/lib/libstdc++.so.6
#10 0x00000000005b70b1 in ceph::__ceph_assert_fail (assertion=<value optimized out>, file=<value optimized out>, line=<value optimized out>, func=<value optimized out>) at common/assert.cc:75
#11 0x00000000006e07ae in ceph::HeartbeatMap::_check (this=0x7fb82dd35f60, h=0x2721010, who=0x77f8d9 "is_healthy", now=41029632) at common/HeartbeatMap.cc:78
#12 0x00000000006e0adf in ceph::HeartbeatMap::is_healthy (this=0x272f000) at common/HeartbeatMap.cc:118
#13 0x00000000006e0d10 in ceph::HeartbeatMap::check_touch_file (this=0x14c6) at common/HeartbeatMap.cc:129
#14 0x000000000061fcdf in CephContextServiceThread::entry (this=0x271f940) at common/ceph_context.cc:64
#15 0x00007fb82ec83971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#16 0x00007fb82d30e92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#17 0x0000000000000000 in ?? ()

on ubuntu@teuthology:/a/nightly_coverage_2012-02-14-a/11871
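
To see which threads are queued on which lock in a trace like the one above, a small script can group threads by the mutex address passed to __pthread_mutex_lock. This is a minimal sketch and not part of Ceph: the function name waiters_by_mutex and the SAMPLE excerpt (frames copied from threads 30 and 23 above) are illustrative assumptions. It matches only plain __pthread_mutex_lock frames, so threads re-acquiring a mutex through __pthread_mutex_cond_lock (thread 26, for example) are not counted.

```python
import re
from collections import defaultdict

# Header line that gdb prints before each thread's stack.
THREAD_RE = re.compile(r"^Thread (\d+) \(Thread (\d+)\):")
# Frame in which a thread is blocked acquiring a pthread mutex.
MUTEX_RE = re.compile(r"__pthread_mutex_lock \(mutex=(0x[0-9a-f]+)\)")


def waiters_by_mutex(backtrace):
    """Map each mutex address to the gdb thread numbers blocked on it.

    Hypothetical helper for reading `thread apply all bt` output.
    """
    waiters = defaultdict(list)
    current = None  # gdb thread number of the stack being read
    for line in backtrace.splitlines():
        header = THREAD_RE.match(line.strip())
        if header:
            current = int(header.group(1))
            continue
        frame = MUTEX_RE.search(line)
        if frame and current is not None:
            waiters[frame.group(1)].append(current)
    return dict(waiters)


# Excerpt of two stacks from the trace above (frames trimmed for brevity).
SAMPLE = """\
Thread 30 (Thread 5418):
#2  0x00007fb82ec856c7 in __pthread_mutex_lock (mutex=0x272cd78) at pthread_mutex_lock.c:82
#3  0x000000000061e764 in CephContext::dout_lock (this=<value optimized out>, locker=0x80) at common/ceph_context.cc:190

Thread 23 (Thread 5408):
#2  0x00007fb82ec8566b in __pthread_mutex_lock (mutex=0x2745020) at pthread_mutex_lock.c:61
#3  0x0000000000527925 in Mutex::Lock (this=0x2745010, no_lockdep=<value optimized out>) at ./common/Mutex.h:108
"""

if __name__ == "__main__":
    for mutex, threads in waiters_by_mutex(SAMPLE).items():
        print(mutex, threads)
```

Applied to the full trace, this kind of grouping should show threads 30, 28, 25, 24 and 16 all waiting on mutex=0x272cd78 via CephContext::dout_lock. Thread 43 is inside write() at HeartbeatMap.cc:71, one line past the dout_lock call site at HeartbeatMap.cc:70, which would be consistent with it holding the lock, though the trace alone does not prove that.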

Actions #3

Updated by Sage Weil about 12 years ago

  • Priority changed from Normal to High
Actions #4

Updated by Sage Weil about 12 years ago

  • Priority changed from High to Normal
Actions #5

Updated by Sage Weil about 12 years ago

  • Status changed from New to Need More Info
Actions #6

Updated by Sage Weil about 12 years ago

  • Target version changed from v0.42 to v0.44
Actions #7

Updated by Sage Weil about 12 years ago

  • Status changed from Need More Info to Can't reproduce

Haven't seen this in a while.

Also, this code is about to go away anyway with wip-log.
