Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=13732010-11-01T16:33:48ZSage Weilsage@newdream.net
<ul><li><strong>Assignee</strong> set to <i>Greg Farnum</i></li></ul> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=13742010-11-01T16:42:14ZColin McCabecolinm@hq.newdream.net
<ul></ul><blockquote>
<p>Perhaps this bug is caused by Nagle's algorithm?</p>
</blockquote>
<p>As Sage pointed out, we're already running with TCP_NODELAY. So scratch that theory.</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=13762010-11-01T17:09:59ZColin McCabecolinm@hq.newdream.net
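<p>For reference, the TCP_NODELAY point can be checked on any socket with plain POSIX calls. The following is a minimal sketch, not Ceph code: the helper names <code>set_nodelay</code>, <code>get_nodelay</code> and <code>nodelay_roundtrip</code> are hypothetical, and it assumes a Linux-like system where a fresh TCP socket starts with Nagle's algorithm enabled.</p>

```cpp
#include <cassert>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

// Enable TCP_NODELAY (i.e. disable Nagle's algorithm) on a TCP socket.
// Returns true on success.
bool set_nodelay(int sd) {
  int on = 1;
  return setsockopt(sd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) == 0;
}

// Report whether TCP_NODELAY is currently set on the socket.
// Returns 1 if set, 0 if not set, -1 on error.
int get_nodelay(int sd) {
  int value = 0;
  socklen_t len = sizeof(value);
  if (getsockopt(sd, IPPROTO_TCP, TCP_NODELAY, &value, &len) != 0)
    return -1;
  return value != 0 ? 1 : 0;
}

// Hypothetical self-check: the option reads back as unset on a fresh
// socket (assumed default) and as set after set_nodelay().
bool nodelay_roundtrip() {
  int sd = socket(AF_INET, SOCK_STREAM, 0);
  if (sd < 0)
    return false;
  bool ok = get_nodelay(sd) == 0 && set_nodelay(sd) && get_nodelay(sd) == 1;
  close(sd);
  return ok;
}
```

<p>Calling <code>get_nodelay(sd)</code> on a connected descriptor (for example the <code>sd=3</code> seen in a debugger) is one way to confirm the option really is in effect before ruling Nagle out.</p>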
<ul></ul><p>While running vstart.sh, I reproduced this bug with debug_ms = 20.</p>
<p>Here's what the output was. Since cephtool doesn't redirect logs to a logfile, this is taken from the console.</p>
<pre>
./cmds -i a -c ceph.conf
** WARNING: Ceph is still under heavy development, and is only suitable for **
** testing and review. Do not trust it with important data. **
starting mds.a at 0.0.0.0:6804/13017
creating dev/mds.b.keyring
read 119 bytes from dev/mds.b.keyring
2010-11-01 16:54:15.511902 mon <- [auth,add,mds.b]
2010-11-01 16:54:18.224411 mon0 -> 'added key for mds.b' (0)
</pre> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=13772010-11-01T17:12:33ZColin McCabecolinm@hq.newdream.net
<ul></ul><p>Colin McCabe wrote:</p>
<blockquote>
<p>While running vstart.sh, I reproduced this bug with debug_ms = 20.</p>
<p>Here's what the output was. Since cephtool doesn't redirect logs to a logfile, this is taken from the console.</p>
<p>[...]</p>
</blockquote>
<p>And here is the backtrace that goes with this occurrence of the bug.</p>
<pre>
(gdb) thread apply all bt
Thread 4 (Thread 0x7f84e8e7d710 (LWP 13031)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:211
#1 0x00000000005f035f in Cond::WaitUntil (this=0x981390, mutex=@0x981310, when={tv = {tv_sec = 1288656798, tv_nsec = 515238000}})
at ./common/Cond.h:60
#2 0x0000000000686671 in Timer::timer_entry (this=0x9812a8) at common/Timer.cc:119
#3 0x00000000005bf8df in Timer::TimerThread::entry (this=0x9813c8) at ./common/Timer.h:77
#4 0x00000000005ea05e in Thread::_entry_func (arg=0x9813c8) at common/Thread.h:39
#5 0x00007f84f17678ba in start_thread (arg=<value optimized out>) at pthread_create.c:300
#6 0x00007f84eca6602d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#7 0x0000000000000000 in ?? ()
Current language: auto; currently asm
Thread 3 (Thread 0x7f84e847a710 (LWP 13116)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00000000005c8b6d in Cond::Wait (this=0x210ebb8, mutex=@0x210eab0) at ./common/Cond.h:46
#2 0x00000000005e4450 in SimpleMessenger::Pipe::writer (this=0x210e9e0) at msg/SimpleMessenger.cc:1680
#3 0x00000000005c6fd9 in SimpleMessenger::Pipe::Writer::entry (this=0x210ec30) at ./msg/SimpleMessenger.h:201
#4 0x00000000005ea05e in Thread::_entry_func (arg=0x210ec30) at common/Thread.h:39
#5 0x00007f84f17678ba in start_thread (arg=<value optimized out>) at pthread_create.c:300
#6 0x00007f84eca6602d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#7 0x0000000000000000 in ?? ()
Thread 2 (Thread 0x7f84e857b710 (LWP 13119)):
#0 0x00007f84eca5b113 in *__GI___poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=900000) at ../sysdeps/unix/sysv/linux/poll.c:87
#1 0x00000000005e0ce6 in tcp_read (sd=3, buf=0x7f84e857ada3 "?", len=1, timeout=900000) at msg/tcp.cc:18
#2 0x00000000005e68ac in SimpleMessenger::Pipe::reader (this=0x210e9e0) at msg/SimpleMessenger.cc:1473
#3 0x00000000005c6fb9 in SimpleMessenger::Pipe::Reader::entry (this=0x210ec18) at ./msg/SimpleMessenger.h:193
#4 0x00000000005ea05e in Thread::_entry_func (arg=0x210ec18) at common/Thread.h:39
#5 0x00007f84f17678ba in start_thread (arg=<value optimized out>) at pthread_create.c:300
#6 0x00007f84eca6602d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#7 0x0000000000000000 in ?? ()
Thread 1 (Thread 0x7f84f1d977c0 (LWP 13025)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00000000005c8b6d in Cond::Wait (this=0x2108888, mutex=@0x21085b8) at ./common/Cond.h:46
#2 0x00000000005df717 in SimpleMessenger::wait (this=0x2108430) at msg/SimpleMessenger.cc:2559
#3 0x0000000000596410 in main (argc=6, argv=0x2106ab0) at tools/ceph.cc:683
Current language: auto; currently c
162 in ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S
Current language: auto; currently asm
(gdb)
</pre> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=13982010-11-02T15:34:29ZGreg Farnumgfarnum@redhat.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul><p>Okay, got in on a hang. The Pipe's been doing a disconnect/reconnect loop for about 4 minutes; it's currently in state 4, and it was reliably calling mark_down on mon1 every 3 seconds and then reconnecting.<br />Got a copy of the log, am taking a look.</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=14002010-11-03T14:11:07ZGreg Farnumgfarnum@redhat.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Rejected</i></li></ul><p>This occurrence is a problem on the monitor side that reproduces in the timer-fixes branch, but not unstable.</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=14072010-11-03T22:49:31ZColin McCabecolinm@hq.newdream.net
<ul></ul><p>I should have written this at the top of the bug report, but this was on the unstable branch.</p>
<p>Anyway, I'll add more information here when/if it reproduces again on unstable.</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=14662010-11-08T10:35:10ZColin McCabecolinm@hq.newdream.net
<ul><li><strong>Status</strong> changed from <i>Rejected</i> to <i>In Progress</i></li></ul><p>Reproduced again on the unfound branch, which is very close to what is in unstable now.</p>
<p>cmccabe@flab:~/src/ceph/src$ ps-ceph.rb <br />cmccabe 16972 0.0 0.0 10528 1376 pts/25 S+ 10:31 0:00 /bin/bash -x ./test/test_unfound.sh run<br />cmccabe 16983 0.0 0.0 11036 1676 pts/25 S+ 10:31 0:00 /bin/sh ./vstart.sh -d -n -o osd recovery delay start = 10000<br />cmccabe 17024 10.6 0.0 196352 4148 ? Ssl 10:31 0:17 ./cmon -i a -c ceph.conf<br />cmccabe 17028 3.7 0.0 132908 4172 ? Ssl 10:31 0:06 ./cmon -i b -c ceph.conf<br />cmccabe 17032 3.1 0.0 130880 4080 ? Ssl 10:31 0:05 ./cmon -i c -c ceph.conf<br />cmccabe 17130 0.4 0.1 227556 13400 ? Ssl 10:31 0:00 ./cosd -i 0 -c ceph.conf<br />cmccabe 17174 0.6 0.1 227556 13180 ? Ssl 10:31 0:01 ./cosd -i 1 -c ceph.conf<br />cmccabe 17181 0.0 0.1 226556 9712 pts/25 Sl+ 10:31 0:00 ./ceph -c ceph.conf -i dev/mds.a.keyring auth add mds.a</p>
<p>(gdb) thread apply all bt</p>
<p>Thread 5 (Thread 0x7f9029fd9710 (LWP 17183)):<br />#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:211<br />#1 0x00000000005f0417 in Cond::WaitUntil (this=0x9813d8, mutex=@0x981358, when={tv = {tv_sec = 1289241398, tv_nsec = 803867000}}) at ./common/Cond.h:60<br />#2 0x0000000000686955 in Timer::timer_entry (this=0x9812f0) at common/Timer.cc:119<br />#3 0x00000000005bf96d in Timer::TimerThread::entry (this=0x981410) at ./common/Timer.h:77<br />#4 0x00000000005ea116 in Thread::_entry_func (arg=0x981410) at common/Thread.h:39<br />#5 0x00007f90328c38ba in start_thread (arg=<value optimized out>) at pthread_create.c:300<br />#6 0x00007f902dbc202d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112<br />#7 0x0000000000000000 in ?? ()<br />Current language: auto; currently asm</p>
<p>Thread 4 (Thread 0x7f90297d8710 (LWP 17186)):<br />#0 0x00007f902dbb7113 in *__GI___poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=900000) at ../sysdeps/unix/sysv/linux/poll.c:87<br />#1 0x00000000005e0d9e in tcp_read (sd=3, buf=0x7f90297d7da3 "?", len=1, timeout=900000) at msg/tcp.cc:18<br />#2 0x00000000005e6964 in SimpleMessenger::Pipe::reader (this=0x214e260) at msg/SimpleMessenger.cc:1473<br />#3 0x00000000005c7047 in SimpleMessenger::Pipe::Reader::entry (this=0x214e498) at ./msg/SimpleMessenger.h:193<br />#4 0x00000000005ea116 in Thread::_entry_func (arg=0x214e498) at common/Thread.h:39<br />#5 0x00007f90328c38ba in start_thread (arg=<value optimized out>) at pthread_create.c:300<br />#6 0x00007f902dbc202d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112<br />#7 0x0000000000000000 in ?? ()</p>
<p>Thread 3 (Thread 0x7f90295d6710 (LWP 17237)):<br />#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162<br />#1 0x00000000005c8c17 in Cond::Wait (this=0x2150748, mutex=@0x2150640) at ./common/Cond.h:46<br />#2 0x00000000005e4508 in SimpleMessenger::Pipe::writer (this=0x2150570) at msg/SimpleMessenger.cc:1680<br />#3 0x00000000005c7067 in SimpleMessenger::Pipe::Writer::entry (this=0x21507c0) at ./msg/SimpleMessenger.h:201<br />#4 0x00000000005ea116 in Thread::_entry_func (arg=0x21507c0) at common/Thread.h:39<br />#5 0x00007f90328c38ba in start_thread (arg=<value optimized out>) at pthread_create.c:300<br />#6 0x00007f902dbc202d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112<br />#7 0x0000000000000000 in ?? ()<br />Current language: auto; currently c</p>
<p>Thread 2 (Thread 0x7f90296d7710 (LWP 17239)):<br />#0 0x00007f902dbb7113 in *__GI___poll (fds=<value optimized out>, nfds=<value optimized out>, timeout=900000) at ../sysdeps/unix/sysv/linux/poll.c:87<br />#1 0x00000000005e0d9e in tcp_read (sd=3, buf=0x7f90296d6da3 "?", len=1, timeout=900000) at msg/tcp.cc:18<br />#2 0x00000000005e6964 in SimpleMessenger::Pipe::reader (this=0x2150570) at msg/SimpleMessenger.cc:1473<br />#3 0x00000000005c7047 in SimpleMessenger::Pipe::Reader::entry (this=0x21507a8) at ./msg/SimpleMessenger.h:193<br />#4 0x00000000005ea116 in Thread::_entry_func (arg=0x21507a8) at common/Thread.h:39<br />#5 0x00007f90328c38ba in start_thread (arg=<value optimized out>) at pthread_create.c:300<br />#6 0x00007f902dbc202d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112<br />#7 0x0000000000000000 in ?? ()<br />Current language: auto; currently asm</p>
<p>Thread 1 (Thread 0x7f9032ef37c0 (LWP 17181)):<br />#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162<br />#1 0x00000000005c8c17 in Cond::Wait (this=0x2148888, mutex=@0x21485b8) at ./common/Cond.h:46<br />#2 0x00000000005df7cf in SimpleMessenger::wait (this=0x2148430) at msg/SimpleMessenger.cc:2559<br />#3 0x0000000000596410 in main (argc=6, argv=0x2146ad0) at tools/ceph.cc:683<br />Current language: auto; currently c<br />162 in ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S<br />Current language: auto; currently asm<br />(gdb)</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=14672010-11-08T10:35:55ZColin McCabecolinm@hq.newdream.net
<ul></ul><p>The process that is hung is 17181, cephtool.</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=14692010-11-08T13:05:14ZGreg Farnumgfarnum@redhat.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Can't reproduce</i></li></ul><p>Look, I know it's a pain, but work on this isn't going to progress unless we collect AT LEAST:<br />1) The state of each messenger Pipe,<br />2) All logs in the system, the more detailed the better.<br />My guess is we're not going to get anywhere on this unless we manage to observe it with full messenger logging on the ceph tool, but maybe we'll get lucky from just the daemon logs and the ceph tool state. Preferably I get the chance to look at it in gdb so I can follow up on any weirdness exposed by the logs and the state.</p>
<p>A backtrace of the threads is no use at all; that looks the same whether the program is behaving or not.</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=15882010-11-08T22:49:51ZColin McCabecolinm@hq.newdream.net
<ul><li><strong>Source</strong> set to <i>0</i></li></ul><blockquote>
<p>Look, I know it's a pain, but work on this isn't going to progress unless <br />we collect AT LEAST:<br />1) The state of each messenger Pipe,<br />2) All logs in the system, the more detailed the better.</p>
</blockquote>
<p>I can definitely get #2. Will running with debug_ms = 20 be sufficient to get #1?</p>
<blockquote>
<p>My guess is we're not going to get anywhere on this unless we manage to <br />observe it with full messenger logging on the ceph tool, but maybe we'll <br />get lucky from just the daemon logs and the ceph tool state. Preferably I <br />get the chance to look at it in gdb so I can follow up on any weirdness <br />exposed by the logs and the state.</p>
</blockquote>
<p>You'll get one chance with gdb. Then when you detach it will start working again!</p>
<p>That's the way this bug goes.</p>
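<p>As background on how a signal can unstick a blocked thread in general: a blocking system call such as <code>poll()</code> returns early with <code>EINTR</code> when a signal handler runs. The sketch below only illustrates that generic POSIX behaviour. It is not Ceph code, the names <code>on_alarm</code> and <code>poll_interrupted_by_signal</code> and the 50 ms / 10 s values are made up for the example, and nothing in this report establishes that this is the mechanism behind the cephtool hang.</p>

```cpp
#include <cassert>
#include <cerrno>
#include <poll.h>
#include <signal.h>
#include <sys/time.h>

// Empty handler: its only job is to make SIGALRM "caught" instead of fatal.
static void on_alarm(int) {}

// Block in poll() with a long timeout and arrange for SIGALRM to arrive
// shortly afterwards. Returns true if poll() was woken early with EINTR.
// Intended for a single-threaded caller, so the signal lands on this thread.
bool poll_interrupted_by_signal() {
  struct sigaction sa {};
  sa.sa_handler = on_alarm;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;
  if (sigaction(SIGALRM, &sa, nullptr) != 0)
    return false;

  // One-shot timer: deliver SIGALRM after roughly 50 ms.
  struct itimerval timer {};
  timer.it_value.tv_usec = 50 * 1000;
  if (setitimer(ITIMER_REAL, &timer, nullptr) != 0)
    return false;

  // No file descriptors and a 10 s timeout: without the signal this call
  // would sleep for the whole timeout and return 0.
  errno = 0;
  int rc = poll(nullptr, 0, 10 * 1000);
  return rc == -1 && errno == EINTR;
}
```

<p>Code that loops on <code>poll()</code> typically treats <code>EINTR</code> as "re-check state and retry", which is one reason a process can appear to recover after a debugger attaches and detaches.</p>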
<p>I think that if you were developing on flab, you would not think "can't reproduce" was the right state for this bug. I see it perhaps 3, 4 times a day and it's highly distracting, especially when it crops up during an automated test. Perhaps we should exchange dev machines or something? Just an idea.</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=16032010-11-09T15:08:19ZColin McCabecolinm@hq.newdream.net
<ul><li><strong>File</strong> <a href="/attachments/download/107/messenger-bug.txt">messenger-bug.txt</a> <a class="icon-only icon-magnifier" title="View" href="/attachments/107/messenger-bug.txt">View</a> added</li></ul><p>messenger-bug.txt</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=16082010-11-09T16:28:52ZColin McCabecolinm@hq.newdream.net
<ul><li><strong>File</strong> <a href="/attachments/download/108/cephtool-hang-at-966369aad07461f2610b4dd2a9cdc770155c5a89.txt">cephtool-hang-at-966369aad07461f2610b4dd2a9cdc770155c5a89.txt</a> <a class="icon-only icon-magnifier" title="View" href="/attachments/108/cephtool-hang-at-966369aad07461f2610b4dd2a9cdc770155c5a89.txt">View</a> added</li></ul><p>cephtool-hang-at-966369aad07461f2610b4dd2a9cdc770155c5a89.txt</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=16212010-11-10T18:16:41ZGreg Farnumgfarnum@redhat.com
<ul><li><strong>Status</strong> changed from <i>Can't reproduce</i> to <i>In Progress</i></li></ul><p>After checking the logs and conferring with Sage, I think I've found a possible cause. Designing and testing a fix now. Once I've made sure it doesn't break anything I'm going to need to have Colin run it before pushing, though, since he's the one who kept seeing this bug.</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=16442010-11-12T11:02:43ZGreg Farnumgfarnum@redhat.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>7</i></li></ul><p>Pushed a potential fix to the msgr branch; waiting for Colin to report back on whether it works. :)</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=16522010-11-13T20:39:21ZColin McCabecolinm@hq.newdream.net
<ul></ul><p>It looks good so far.</p> Ceph - Bug #535: cephtool hangs forever until a UNIX signal is receivedhttps://tracker.ceph.com/issues/535?journal_id=16622010-11-15T13:49:29ZGreg Farnumgfarnum@redhat.com
<ul><li><strong>Status</strong> changed from <i>7</i> to <i>Resolved</i></li></ul><p>Sage spent some time on the messenger too, and I suspect we're done now.</p>